Sitemap Generator
A high-performance Rust library for generating XML sitemaps compliant with the sitemaps.org protocol 0.9.
Features
- Standard XML Sitemaps: Generate basic sitemaps with URLs, lastmod, changefreq, and priority
- Image Sitemaps: Include image metadata (captions, geo-location, titles, licenses)
- Video Sitemaps: Include video metadata (thumbnails, descriptions, durations, ratings)
- News Sitemaps: Submit news articles to Google News with publication metadata
- Combined Sitemaps: Combine multiple extensions (image + video + news) in one sitemap
- Sitemap Index: Manage multiple sitemap files for large websites (>50k URLs)
- Validation: Automatic validation of URLs, size limits, and protocol compliance
- Compression: Built-in gzip compression support (96-98% bandwidth savings)
- Web Framework Support: Direct bytes output for Axum, Actix-web, Rocket, etc.
- Parsing: Read and parse existing sitemap files
- Memory Efficient: ~140 bytes/URL during generation, 0 bytes retained afterwards (measured with a custom allocator, see MEMORY_SAFETY.md)
- High Performance: ~830K URLs/second, immediate memory cleanup
- Optimized Builders: Pre-allocate capacity with with_capacity() for better performance
- Zero unsafe code: 100% safe Rust
Installation
Add this to your Cargo.toml:
[]
= "0.1"
Quick Start
Basic Sitemap
Generate a sitemap as a String, as bytes, or as compressed bytes:
use ;
Image Sitemap
use ;
Video Sitemap
use ;
News Sitemap
use ;
Combined Sitemap (Multiple Extensions)
Combine image, video, and news extensions in a single sitemap:
use ;
Sitemap Index
use ;
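A sitemap index is needed because a single sitemap file is capped at 50,000 URLs. The planning step can be sketched independently of this library's API: split the URL list into groups of at most 50,000 and give each group its own file. The function name and the `sitemap-N.xml` naming scheme below are hypothetical, chosen only for illustration.

```rust
/// Maximum number of URLs allowed in one sitemap file by the sitemaps.org protocol.
const MAX_URLS_PER_SITEMAP: usize = 50_000;

/// Splits `urls` into groups of at most 50,000 entries and pairs each group
/// with a hypothetical file name ("sitemap-1.xml", "sitemap-2.xml", ...).
/// Each group would become one sitemap file listed in the index.
pub fn plan_sitemap_files(urls: &[String]) -> Vec<(String, &[String])> {
    urls.chunks(MAX_URLS_PER_SITEMAP)
        .enumerate()
        .map(|(i, chunk)| (format!("sitemap-{}.xml", i + 1), chunk))
        .collect()
}
```

For 120,000 URLs this plan yields three files: two full ones and a third holding the remaining 20,000 entries.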
Web Framework Integration
Use with Axum, Actix-web, Rocket, or other web frameworks:
Axum Example
use ;
use ;
async
// With compression (recommended):
async
Actix-web Example
use ;
use ;
async
See examples/web_framework_usage.rs for more examples.
Parsing Sitemaps
use SitemapParser;
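To illustrate what parsing a sitemap involves at its simplest, here is a deliberately naive, standard-library-only sketch that pulls the text of every `<loc>` element out of a document. It is not the library's `SitemapParser`; the function name is hypothetical, and the approach does not decode XML entities or handle CDATA, which a real XML parser would.

```rust
/// Returns the trimmed text of every `<loc>...</loc>` element in `xml`.
/// Naive string scanning: entities such as `&amp;` are left undecoded.
pub fn extract_locs(xml: &str) -> Vec<String> {
    const OPEN: &str = "<loc>";
    const CLOSE: &str = "</loc>";
    let mut locs = Vec::new();
    let mut rest = xml;
    while let Some(start) = rest.find(OPEN) {
        // Everything after the opening tag.
        let after = &rest[start + OPEN.len()..];
        match after.find(CLOSE) {
            Some(end) => {
                locs.push(after[..end].trim().to_string());
                // Continue scanning after the closing tag.
                rest = &after[end + CLOSE.len()..];
            }
            // Unterminated element: stop rather than guess.
            None => break,
        }
    }
    locs
}
```

Namespaced elements such as `<image:loc>` are not matched, because the scan looks for the exact string `<loc>`.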
Performance
This library is designed for high performance and low memory usage:
- Fast XML Generation: Uses quick-xml for efficient XML writing (~830K URLs/second)
- Memory Efficient: ~140 bytes per URL during generation, 0 bytes after (proven with custom allocator)
- Optimized Allocation: Use with_capacity() to pre-allocate memory and reduce reallocations
- Immediate Cleanup: Memory released immediately when builder drops (RAII-based)
- No Measurable Memory Leaks: 1 byte of growth across 100 iterations in the allocator-tracking test
- Streaming Compression: Gzip compression with 50x+ compression ratios (98% bandwidth saved)
Performance Tips
Use with_capacity() for better performance:
// ❌ Without capacity (Vec grows dynamically)
let mut builder = SitemapBuilder::new();
for i in 0..10_000
// ✅ With capacity (Vec pre-allocated, ~2-5% faster)
let mut builder = SitemapBuilder::with_capacity(10_000);
for i in 0..10_000
Available for all builders:
- SitemapBuilder::with_capacity(10_000)
- ImageSitemapBuilder::with_capacity(5_000)
- VideoSitemapBuilder::with_capacity(1_000)
- NewsSitemapBuilder::with_capacity(1_000)
- CombinedSitemapBuilder::with_capacity(500)
- SitemapIndexBuilder::with_capacity(50)
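The effect of pre-allocation can be seen with a plain `Vec`, without involving this library at all. The sketch below assumes the builders keep their entries in a `Vec`, as the comments in the snippet above suggest; the helper name is hypothetical. It counts how often the vector's capacity changes while items are pushed, which is a proxy for the number of reallocations.

```rust
/// Pushes `n` items into `v` and counts how many times the capacity changed.
/// Every capacity change corresponds to a reallocation of the backing buffer.
pub fn count_reallocations(mut v: Vec<u64>, n: u64) -> usize {
    let mut reallocations = 0;
    let mut last_capacity = v.capacity();
    for i in 0..n {
        v.push(i);
        if v.capacity() != last_capacity {
            reallocations += 1;
            last_capacity = v.capacity();
        }
    }
    reallocations
}
```

Calling `count_reallocations(Vec::with_capacity(10_000), 10_000)` returns 0, while starting from `Vec::new()` returns a small positive number whose exact value depends on the standard library's growth strategy.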
Benchmarks
| Operation | 10 URLs | 100 URLs | 1,000 URLs | 10,000 URLs |
|---|---|---|---|---|
| Standard Sitemap | 12.5 µs | 120 µs | 1.2 ms | 12.2 ms |
| Build Bytes | 5.7 µs | 51 µs | 519 µs | 5.2 ms |
| Build Compressed | 25 µs | 180 µs | 1.8 ms | 18 ms |
| Image Sitemap (2 imgs) | 28 µs | 275 µs | 2.7 ms | - |
| Video Sitemap | 45 µs | 450 µs | 4.5 ms | - |
Compression Ratios: 31x (100 URLs) to 54x (10,000 URLs) - saves 96-98% bandwidth
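The figures above can be cross-checked with two small formulas: the bandwidth saved is 1 - 1/ratio, and throughput is the URL count divided by the elapsed time. The helper names below are hypothetical and exist only to show the arithmetic.

```rust
/// Fraction of bandwidth saved for a compression ratio
/// (uncompressed size divided by compressed size).
pub fn bandwidth_saved(ratio: f64) -> f64 {
    1.0 - 1.0 / ratio
}

/// URLs generated per second, given a URL count and an elapsed time in milliseconds.
pub fn urls_per_second(urls: u64, millis: f64) -> f64 {
    urls as f64 / (millis / 1000.0)
}
```

A 31x ratio gives about 96.8% saved and a 54x ratio about 98.1%, matching the 96-98% range. The 10,000-URL standard sitemap row (12.2 ms) works out to roughly 820K URLs/second, in line with the ~830K figure quoted earlier.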
Proven Results:
- Performance: Criterion.rs benchmarks
- Memory safety: Custom allocator tracking (see MEMORY_SAFETY.md)
Documentation:
- BENCHMARK_RESULTS.md - Actual benchmark numbers
- BENCHMARKS.md - Performance analysis
- MEMORY_SAFETY.md - Memory leak proof
Run benchmarks yourself:
# Performance benchmarks
# Memory safety demo
Validation
The library automatically validates:
- URL format (RFC 3986)
- Maximum 50,000 URLs per standard sitemap (1,000 for news sitemaps)
- Maximum 50MB uncompressed size
- URL length (max 2048 characters)
- Date format (W3C Datetime)
- Priority values (0.0 to 1.0)
- Video duration (max 28,800 seconds)
- Video rating (0.0 to 5.0)
- Video title length (max 100 characters)
- Video description length (max 2048 characters)
- News language codes (ISO 639 format)
- Stock tickers (max 5, comma-separated)
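To make a few of these limits concrete, here is a minimal standard-library sketch of four of the checks. It is not the library's implementation: the error type and function names are hypothetical, and only the numeric limits are taken from the list above.

```rust
/// Errors for a handful of the checks listed above (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum CheckError {
    UrlTooLong(usize),
    PriorityOutOfRange(f32),
    VideoDurationTooLong(u32),
    TooManyTickers(usize),
}

/// URLs may be at most 2048 characters long.
pub fn check_url_length(url: &str) -> Result<(), CheckError> {
    let len = url.chars().count();
    if len > 2048 { Err(CheckError::UrlTooLong(len)) } else { Ok(()) }
}

/// Priority must lie in the closed range 0.0 to 1.0 (NaN is rejected).
pub fn check_priority(priority: f32) -> Result<(), CheckError> {
    if (0.0..=1.0).contains(&priority) {
        Ok(())
    } else {
        Err(CheckError::PriorityOutOfRange(priority))
    }
}

/// Video duration may be at most 28,800 seconds (8 hours).
pub fn check_video_duration(seconds: u32) -> Result<(), CheckError> {
    if seconds > 28_800 { Err(CheckError::VideoDurationTooLong(seconds)) } else { Ok(()) }
}

/// At most 5 comma-separated stock tickers; empty segments are ignored.
pub fn check_stock_tickers(tickers: &str) -> Result<(), CheckError> {
    let count = tickers.split(',').filter(|t| !t.trim().is_empty()).count();
    if count > 5 { Err(CheckError::TooManyTickers(count)) } else { Ok(()) }
}
```

Returning the offending value inside the error makes it easy to report which entry failed and why.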
You can disable validation if needed:
let builder = SitemapBuilder::new().validate(false);
// Or with capacity
let builder = SitemapBuilder::with_capacity(10_000).validate(false);
Sitemap Protocol Compliance
This library follows the sitemaps.org protocol 0.9 specification:
- Standard sitemap with <urlset> and <url> elements
- Optional <lastmod>, <changefreq>, and <priority> elements
- Image extension: xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
- Video extension: xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
- News extension: xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
- Sitemap index with <sitemapindex> and <sitemap> elements
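For readers unfamiliar with the format, the following sketch shows what a minimal standard sitemap looks like when written by hand with the standard library only. It does not use this library or quick-xml; the function names are hypothetical, and only `<loc>` and `<lastmod>` are emitted.

```rust
/// Escapes the five characters that must be entity-encoded in XML text.
pub fn escape_xml(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

/// Renders a `<urlset>` with one `<url>` per entry. Each entry is a
/// `(loc, lastmod)` pair; `<lastmod>` is written only when present.
pub fn render_urlset(entries: &[(&str, Option<&str>)]) -> String {
    let mut xml = String::from("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    xml.push_str("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
    for (loc, lastmod) in entries {
        xml.push_str("  <url>\n");
        xml.push_str(&format!("    <loc>{}</loc>\n", escape_xml(loc)));
        if let Some(date) = lastmod {
            xml.push_str(&format!("    <lastmod>{}</lastmod>\n", escape_xml(date)));
        }
        xml.push_str("  </url>\n");
    }
    xml.push_str("</urlset>\n");
    xml
}
```

Escaping matters in practice: a URL with a query string such as `?a=1&b=2` must appear as `?a=1&amp;b=2` inside `<loc>`, otherwise the document is not well-formed XML.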
Examples
See the examples directory for more complete examples:
- basic_sitemap.rs - Standard sitemap generation
- image_sitemap.rs - Image sitemap with metadata
- video_sitemap.rs - Video sitemap with full metadata
- news_sitemap.rs - News sitemap for Google News
- combined_sitemap.rs - Combined sitemap with multiple extensions
- sitemap_index.rs - Sitemap index for multiple files
- web_framework_usage.rs - Web framework integration examples
- memory_demo.rs - Memory tracking and cleanup proof
Run examples with:
Testing
Run the test suite:
License
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.