Sitemap Generator

A high-performance Rust library for generating XML sitemaps compliant with the sitemaps.org protocol 0.9.
Features
- Standard XML Sitemaps: Generate basic sitemaps with URLs, lastmod, changefreq, and priority
- Image Sitemaps: Include image metadata (captions, geo-location, titles, licenses)
- Video Sitemaps: Include video metadata (thumbnails, descriptions, durations, ratings)
- News Sitemaps: Submit news articles to Google News with publication metadata
- Combined Sitemaps: Combine multiple extensions (image + video + news) in one sitemap
- Sitemap Index: Manage multiple sitemap files for large websites (>50k URLs)
- Validation: Automatic validation of URLs, size limits, and protocol compliance
- Compression: Built-in gzip compression support (96-98% bandwidth savings)
- Web Framework Support: Direct bytes output for Axum, Actix-web, Rocket, etc.
- Parsing: Read and parse existing sitemap files
- Memory Efficient: ~140 bytes/URL during generation, 0 bytes retained after the builder is dropped (measured with a custom allocator)
- High Performance: ~830K URLs/second, immediate memory cleanup
- Zero unsafe code: 100% safe Rust
Installation
Add this to your Cargo.toml:
[dependencies]
sitemap_generator = "0.1"
Quick Start
Basic Sitemap
Generate a sitemap as a String, as bytes, or as gzip-compressed bytes, or write it directly to a file:
use sitemap_generator::{SitemapBuilder, UrlEntry, ChangeFreq};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = SitemapBuilder::new();
    builder.add_url(UrlEntry::new("https://example.com/")
        .lastmod("2025-11-01")
        .changefreq(ChangeFreq::Daily)
        .priority(1.0));
    builder.add_url(UrlEntry::new("https://example.com/page1")
        .lastmod("2025-11-02")
        .priority(0.8));
    // In-memory output: XML string, raw bytes, or gzip-compressed bytes
    let xml = builder.build()?;
    let bytes = builder.build_bytes()?;
    let compressed = builder.build_compressed_bytes()?;
    // File output: plain or gzip-compressed
    builder.write("sitemap.xml")?;
    builder.write_compressed("sitemap.xml.gz")?;
    Ok(())
}
Image Sitemap
use sitemap_generator::{ImageSitemapBuilder, UrlEntry, UrlWithImages, ImageEntry};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = ImageSitemapBuilder::new();
    let url_with_images = UrlWithImages::new(
        UrlEntry::new("https://example.com/gallery")
            .lastmod("2025-11-01")
    )
    .add_image(
        ImageEntry::new("https://example.com/images/photo1.jpg")
            .title("Beautiful Sunset")
            .caption("A stunning sunset over the ocean")
    );
    builder.add_url(url_with_images);
    let xml = builder.build()?;
    builder.write("image_sitemap.xml")?;
    Ok(())
}
Video Sitemap
use sitemap_generator::{VideoSitemapBuilder, UrlEntry, UrlWithVideos, VideoEntry};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = VideoSitemapBuilder::new();
    let url_with_video = UrlWithVideos::new(
        UrlEntry::new("https://example.com/videos/intro")
    )
    .add_video(
        VideoEntry::new(
            "https://example.com/thumbnails/intro.jpg",
            "Introduction to Our Product",
            "Learn about our product in this video"
        )
        .content_loc("https://example.com/videos/intro.mp4")
        .duration(300)
        .rating(4.5)
        .family_friendly(true)
        .add_tag("tutorial")
    );
    builder.add_url(url_with_video);
    let xml = builder.build()?;
    builder.write("video_sitemap.xml")?;
    Ok(())
}
News Sitemap
use sitemap_generator::{NewsSitemapBuilder, UrlEntry, UrlWithNews, NewsEntry, NewsPublication};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = NewsSitemapBuilder::new();
    let publication = NewsPublication::new("TechDaily", "en");
    let news = NewsEntry::new(
        publication,
        "2025-11-01T10:00:00Z",
        "Revolutionary AI Breakthrough Announced"
    )
    .keywords("AI, technology, machine learning")
    .stock_tickers("GOOGL, MSFT");
    let url = UrlEntry::new("https://example.com/tech/ai-breakthrough");
    builder.add_url(UrlWithNews::new(url, news));
    let xml = builder.build()?;
    builder.write("news_sitemap.xml")?;
    Ok(())
}
Combined Sitemap (Multiple Extensions)
Combine image, video, and news extensions in a single sitemap:
use sitemap_generator::{
    CombinedSitemapBuilder, UrlEntry, UrlWithExtensions,
    ImageEntry, VideoEntry, NewsEntry, NewsPublication
};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = CombinedSitemapBuilder::new();
    let news = NewsEntry::new(
        NewsPublication::new("TechDaily", "en"),
        "2025-11-01T10:00:00Z",
        "AI Breakthrough Announced"
    );
    let image = ImageEntry::new("https://example.com/ai-lab.jpg")
        .title("AI Research Lab");
    let video = VideoEntry::new(
        "https://example.com/thumb.jpg",
        "AI Demo Video",
        "Watch the demonstration"
    )
    .content_loc("https://example.com/video.mp4")
    .duration(300);
    builder.add_url(
        UrlWithExtensions::new(UrlEntry::new("https://example.com/article"))
            .add_image(image)
            .add_video(video)
            .set_news(news)
    );
    let xml = builder.build()?;
    builder.write("combined_sitemap.xml")?;
    Ok(())
}
Sitemap Index
use sitemap_generator::{SitemapIndexBuilder, SitemapIndexEntry};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut builder = SitemapIndexBuilder::new();
    builder.add_sitemap(
        SitemapIndexEntry::new("https://example.com/sitemap1.xml.gz")
            .lastmod("2025-11-01")
    );
    builder.add_sitemap(
        SitemapIndexEntry::new("https://example.com/sitemap2.xml.gz")
            .lastmod("2025-11-02")
    );
    let xml = builder.build()?;
    builder.write("sitemap_index.xml")?;
    Ok(())
}
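Before building an index, a large URL list has to be split into files of at most 50,000 URLs each. The following is a minimal, standard-library-only sketch of that splitting step. The function names and the `sitemapN.xml.gz` naming scheme are illustrative assumptions, not part of this library's API.

```rust
/// Maximum number of URLs allowed in one standard sitemap file.
const MAX_URLS_PER_SITEMAP: usize = 50_000;

/// Number of sitemap files needed for `total_urls` entries (ceiling division).
fn sitemap_file_count(total_urls: usize) -> usize {
    (total_urls + MAX_URLS_PER_SITEMAP - 1) / MAX_URLS_PER_SITEMAP
}

/// Splits `urls` into per-file chunks and pairs each chunk with the
/// location it would be listed under in the sitemap index.
/// The file naming is a hypothetical choice for this sketch.
fn plan_sitemap_files<'a>(base_url: &str, urls: &'a [String]) -> Vec<(String, &'a [String])> {
    let base = base_url.trim_end_matches('/');
    urls.chunks(MAX_URLS_PER_SITEMAP)
        .enumerate()
        .map(|(i, chunk)| (format!("{}/sitemap{}.xml.gz", base, i + 1), chunk))
        .collect()
}
```

Each chunk could then be fed to its own `SitemapBuilder` and written with `write_compressed`, while each location is added to a `SitemapIndexBuilder` through `SitemapIndexEntry::new`.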
Web Framework Integration
Use with Axum, Actix-web, Rocket, or other web frameworks:
Axum Example
use axum::{
    body::Body,
    response::{Response, IntoResponse},
    http::{StatusCode, header},
};
use sitemap_generator::{SitemapBuilder, UrlEntry};
// Serve the sitemap as uncompressed XML
async fn sitemap() -> impl IntoResponse {
    let mut builder = SitemapBuilder::new();
    builder.add_url(UrlEntry::new("https://example.com/"));
    match builder.build_bytes() {
        Ok(bytes) => Response::builder()
            .status(StatusCode::OK)
            .header(header::CONTENT_TYPE, "application/xml; charset=utf-8")
            // Name the body type explicitly (assumes the bytes are a Vec<u8>);
            // a bare `.into()` cannot be inferred behind `impl IntoResponse`
            .body(Body::from(bytes))
            .unwrap(),
        Err(_) => Response::builder()
            .status(StatusCode::INTERNAL_SERVER_ERROR)
            .body(Body::empty())
            .unwrap(),
    }
}
// Serve the sitemap gzip-compressed
async fn sitemap_compressed() -> impl IntoResponse {
    let mut builder = SitemapBuilder::new();
    builder.add_url(UrlEntry::new("https://example.com/"));
    match builder.build_compressed_bytes() {
        Ok(bytes) => Response::builder()
            .status(StatusCode::OK)
            .header(header::CONTENT_TYPE, "application/xml")
            .header(header::CONTENT_ENCODING, "gzip")
            .body(Body::from(bytes))
            .unwrap(),
        Err(_) => Response::builder()
            .status(StatusCode::INTERNAL_SERVER_ERROR)
            .body(Body::empty())
            .unwrap(),
    }
}
Actix-web Example
use actix_web::{HttpResponse, Result};
use sitemap_generator::{SitemapBuilder, UrlEntry};
// Serve the sitemap gzip-compressed
async fn sitemap() -> Result<HttpResponse> {
    let mut builder = SitemapBuilder::new();
    builder.add_url(UrlEntry::new("https://example.com/"));
    match builder.build_compressed_bytes() {
        Ok(bytes) => Ok(HttpResponse::Ok()
            .content_type("application/xml")
            .insert_header(("Content-Encoding", "gzip"))
            .body(bytes)),
        Err(_) => Ok(HttpResponse::InternalServerError().finish()),
    }
}
See examples/web_framework_usage.rs for more examples.
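The compressed handlers above always respond with `Content-Encoding: gzip`. A handler may want to check the request's `Accept-Encoding` header first and fall back to the uncompressed bytes otherwise. Below is a simplified, framework-independent sketch of such a check. The helper name `accepts_gzip` is hypothetical, and the sketch deliberately ignores the `*` wildcard.

```rust
/// Returns true if an `Accept-Encoding` header value allows gzip.
/// A coding listed with `q=0` is treated as explicitly refused.
/// Simplification: the `*` wildcard is not handled.
fn accepts_gzip(accept_encoding: &str) -> bool {
    accept_encoding.split(',').any(|item| {
        let mut parts = item.split(';');
        let coding = parts.next().unwrap_or("").trim();
        if !coding.eq_ignore_ascii_case("gzip") {
            return false;
        }
        // Look for an optional quality value such as `q=0.5`.
        let quality = parts
            .filter_map(|p| {
                let p = p.trim();
                p.strip_prefix("q=").or_else(|| p.strip_prefix("Q="))
            })
            .filter_map(|q| q.trim().parse::<f32>().ok())
            .next();
        // No quality value means the coding is acceptable.
        quality.map_or(true, |q| q > 0.0)
    })
}
```

A handler could call this with the raw header value and choose between `build_compressed_bytes()` and `build_bytes()` accordingly.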
Parsing Sitemaps
use sitemap_generator::SitemapParser;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse from a file
    let entries = SitemapParser::parse_file("sitemap.xml")?;
    // Parse from a gzip-compressed file
    let entries = SitemapParser::parse_compressed("sitemap.xml.gz")?;
    // Parse from a string
    let xml = r#"<?xml version="1.0"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/</loc>
<lastmod>2025-11-01</lastmod>
</url>
</urlset>"#;
    let entries = SitemapParser::parse_string(xml)?;
    for entry in entries {
        println!("URL: {}", entry.loc);
    }
    Ok(())
}
Performance
This library is designed for high performance and low memory usage:
- Fast XML Generation: Uses quick-xml for efficient XML writing (~830K URLs/second)
- Memory Efficient: ~140 bytes per URL during generation, 0 bytes retained afterwards (measured with a custom allocator)
- Immediate Cleanup: Memory is released as soon as the builder is dropped (RAII-based)
- No Measurable Memory Leaks: 1 byte of growth observed across 100 iterations
- Streaming Compression: Gzip compression with ratios of up to 54x (about 98% bandwidth saved)
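As a rough planning aid, the per-URL figures above can be scaled to a given sitemap size. The sketch below assumes cost grows linearly with URL count, which is an extrapolation and not a measured guarantee. The constant and function names are illustrative only.

```rust
/// Approximate memory held per URL during generation (figure quoted above).
const BYTES_PER_URL: u64 = 140;
/// Approximate generation throughput in URLs per second (figure quoted above).
const URLS_PER_SECOND: f64 = 830_000.0;

/// Estimated peak memory in bytes while generating `url_count` URLs.
fn estimated_peak_bytes(url_count: u64) -> u64 {
    url_count * BYTES_PER_URL
}

/// Estimated generation time in milliseconds for `url_count` URLs.
fn estimated_millis(url_count: u64) -> f64 {
    url_count as f64 / URLS_PER_SECOND * 1000.0
}
```

For a full 50,000-URL sitemap this gives 7,000,000 bytes (about 6.7 MiB) of peak memory and roughly 60 ms of generation time.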
Benchmarks
| Operation | 10 URLs | 100 URLs | 1,000 URLs | 10,000 URLs |
|---|---|---|---|---|
| Standard Sitemap | 12.5 µs | 120 µs | 1.2 ms | 12.2 ms |
| Build Bytes | 5.7 µs | 51 µs | 519 µs | 5.2 ms |
| Build Compressed | 25 µs | 180 µs | 1.8 ms | 18 ms |
| Image Sitemap (2 imgs) | 28 µs | 275 µs | 2.7 ms | - |
| Video Sitemap | 45 µs | 450 µs | 4.5 ms | - |
Compression Ratios: from 31x (100 URLs) to 54x (10,000 URLs), which saves 96-98% of bandwidth
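The bandwidth percentages follow directly from the compression ratios: a ratio of r leaves 1/r of the original bytes, so the saving is 1 - 1/r. A small sketch of that arithmetic, with illustrative function names:

```rust
/// Percentage of bandwidth saved for a given compression ratio
/// (uncompressed size divided by compressed size). Expects `ratio >= 1.0`.
fn bandwidth_saved_percent(ratio: f64) -> f64 {
    (1.0 - 1.0 / ratio) * 100.0
}

/// Expected compressed size in bytes for a given uncompressed size and ratio.
fn compressed_size_bytes(uncompressed: u64, ratio: f64) -> u64 {
    (uncompressed as f64 / ratio).round() as u64
}
```

A 31x ratio works out to about 96.8% saved and a 54x ratio to about 98.1%, which matches the 96-98% range quoted above.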
Run benchmarks yourself:
cargo bench
open target/criterion/report/index.html
cargo run --example memory_demo --release
Validation
The library automatically validates:
- URL format (RFC 3986)
- Maximum 50,000 URLs per standard sitemap (1,000 for news sitemaps)
- Maximum 50MB uncompressed size
- URL length (max 2048 characters)
- Date format (W3C Datetime)
- Priority values (0.0 to 1.0)
- Video duration (max 28,800 seconds)
- Video rating (0.0 to 5.0)
- Video title length (max 100 characters)
- Video description length (max 2048 characters)
- News language codes (ISO 639 format)
- Stock tickers (max 5, comma-separated)
You can disable validation if needed:
let builder = SitemapBuilder::new().validate(false);
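If validation is disabled, callers may want to run a few of these checks themselves. The following standalone sketch shows what some of the limits listed above look like as code. It is not the library's implementation: the `ValidationError` type and the function names are hypothetical and written only for illustration.

```rust
/// Hypothetical error type for this sketch (not the library's error type).
#[derive(Debug, PartialEq)]
enum ValidationError {
    UrlTooLong(usize),
    PriorityOutOfRange(f32),
    VideoDurationTooLong(u32),
    VideoRatingOutOfRange(f32),
}

const MAX_URL_LENGTH: usize = 2048;
const MAX_VIDEO_DURATION_SECS: u32 = 28_800;

/// URL length must not exceed 2048 characters.
fn check_url_length(loc: &str) -> Result<(), ValidationError> {
    let len = loc.chars().count();
    if len > MAX_URL_LENGTH {
        Err(ValidationError::UrlTooLong(len))
    } else {
        Ok(())
    }
}

/// Priority must lie in 0.0..=1.0 (NaN is rejected).
fn check_priority(priority: f32) -> Result<(), ValidationError> {
    if (0.0..=1.0).contains(&priority) {
        Ok(())
    } else {
        Err(ValidationError::PriorityOutOfRange(priority))
    }
}

/// Video duration must not exceed 28,800 seconds.
fn check_video_duration(secs: u32) -> Result<(), ValidationError> {
    if secs <= MAX_VIDEO_DURATION_SECS {
        Ok(())
    } else {
        Err(ValidationError::VideoDurationTooLong(secs))
    }
}

/// Video rating must lie in 0.0..=5.0 (NaN is rejected).
fn check_video_rating(rating: f32) -> Result<(), ValidationError> {
    if (0.0..=5.0).contains(&rating) {
        Ok(())
    } else {
        Err(ValidationError::VideoRatingOutOfRange(rating))
    }
}
```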
Sitemap Protocol Compliance
This library follows the sitemaps.org protocol 0.9 specification:
- Standard sitemap with <urlset> and <url> elements
- Optional <lastmod>, <changefreq>, and <priority> elements
- Image extension: xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
- Video extension: xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
- News extension: xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
- Sitemap index with <sitemapindex> and <sitemap> elements
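To make the target format concrete, here is a hand-written sketch of the smallest standard sitemap: a `<urlset>` in the 0.9 namespace with one `<loc>` per URL, and the entity escaping the protocol requires for `&`, `<`, `>`, `"` and `'`. The library itself writes XML through quick-xml, so this is an illustration of the output shape under that assumption, not the library's code. Both function names are hypothetical.

```rust
/// Escapes the five characters that must be entity-escaped in sitemap values.
fn escape_xml(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

/// Renders a minimal standard sitemap containing only <loc> elements.
fn render_urlset(locs: &[&str]) -> String {
    let mut xml = String::from("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    xml.push_str("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
    for loc in locs {
        xml.push_str("  <url><loc>");
        xml.push_str(&escape_xml(loc));
        xml.push_str("</loc></url>\n");
    }
    xml.push_str("</urlset>\n");
    xml
}
```

The escaping step matters most for URLs with query strings, where a raw `&` would make the document invalid XML.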
Examples
See the examples directory for more complete examples.
Run examples with:
cargo run --example basic_sitemap
cargo run --example image_sitemap
cargo run --example video_sitemap
cargo run --example news_sitemap
cargo run --example combined_sitemap
cargo run --example sitemap_index
cargo run --example web_framework_usage
cargo run --example memory_demo --release
Testing
Run the test suite:
cargo test
License
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.