Crate sitemap_generator

Crate sitemap_generator 

Source
Expand description

§Sitemap Generator

A high-performance Rust library for generating XML sitemaps compliant with the sitemaps.org protocol 0.9.

§Features

  • Standard XML Sitemaps: Basic sitemaps with URLs, lastmod, changefreq, and priority
  • Image Sitemaps: Include image metadata (captions, geo-location, titles, licenses)
  • Video Sitemaps: Include video metadata (thumbnails, descriptions, durations, ratings)
  • News Sitemaps: Submit news articles to Google News with publication metadata
  • Combined Sitemaps: Combine multiple extensions (image + video + news) in one sitemap
  • Sitemap Index: Manage multiple sitemap files for large websites (>50k URLs)
  • Validation: Automatic validation of URLs, dates, size limits, and protocol compliance
  • Compression: Built-in gzip compression (96-98% bandwidth savings)
  • Web Framework Support: Direct bytes output for Axum, Actix-web, Rocket, etc.
  • Parsing: Read and parse existing sitemap files
  • Memory Efficient: ~140 bytes/URL during generation, 0 bytes after (proven)
  • High Performance: ~830K URLs/second, immediate memory cleanup
  • Optimized Builders: Pre-allocate capacity with with_capacity() for better performance

§Quick Start Examples

§1. Standard Sitemap

use sitemap_generator::{SitemapBuilder, UrlEntry, ChangeFreq};

// Use with_capacity() if you know the number of URLs in advance
let mut builder = SitemapBuilder::with_capacity(1000);

builder.add_url(UrlEntry::new("https://example.com/")
    .lastmod("2025-11-01")
    .changefreq(ChangeFreq::Daily)
    .priority(1.0));

// Generate as string, bytes, or compressed
let xml = builder.build().unwrap();
let bytes = builder.build_bytes().unwrap();
let compressed = builder.build_compressed_bytes().unwrap();

// Write to file
builder.write("sitemap.xml").unwrap();
builder.write_compressed("sitemap.xml.gz").unwrap();

§2. Image Sitemap

use sitemap_generator::{ImageSitemapBuilder, UrlEntry, UrlWithImages, ImageEntry};

let mut builder = ImageSitemapBuilder::new();

let url_with_images = UrlWithImages::new(
    UrlEntry::new("https://example.com/gallery")
)
.add_image(
    ImageEntry::new("https://example.com/photo1.jpg")
        .title("Beautiful Sunset")
        .caption("A stunning sunset over the ocean")
);

builder.add_url(url_with_images);
let xml = builder.build().unwrap();

§3. Video Sitemap

use sitemap_generator::{VideoSitemapBuilder, UrlEntry, UrlWithVideos, VideoEntry};

let mut builder = VideoSitemapBuilder::new();

let url_with_video = UrlWithVideos::new(
    UrlEntry::new("https://example.com/videos/intro")
)
.add_video(
    VideoEntry::new(
        "https://example.com/thumbnails/intro.jpg",
        "Introduction Video",
        "Learn about our product"
    )
    .content_loc("https://example.com/videos/intro.mp4")
    .duration(300)
    .rating(4.5)
);

builder.add_url(url_with_video);
let xml = builder.build().unwrap();

§4. News Sitemap

use sitemap_generator::{NewsSitemapBuilder, UrlEntry, UrlWithNews, NewsEntry, NewsPublication};

let mut builder = NewsSitemapBuilder::new();

let publication = NewsPublication::new("TechDaily", "en");
let news = NewsEntry::new(
    publication,
    "2025-11-01T10:00:00Z",
    "Breaking Tech News"
)
.keywords("technology, innovation")
.stock_tickers("GOOGL, MSFT");

builder.add_url(UrlWithNews::new(
    UrlEntry::new("https://example.com/news/article"),
    news
));

let xml = builder.build().unwrap();

§5. Combined Sitemap (Multiple Extensions)

use sitemap_generator::{
    CombinedSitemapBuilder, UrlEntry, UrlWithExtensions,
    ImageEntry, VideoEntry, NewsEntry, NewsPublication
};

let mut builder = CombinedSitemapBuilder::new();

// News article with images and video
let news = NewsEntry::new(
    NewsPublication::new("TechDaily", "en"),
    "2025-11-01T10:00:00Z",
    "AI Breakthrough"
);

let image = ImageEntry::new("https://example.com/ai-lab.jpg")
    .title("AI Research Lab");

let video = VideoEntry::new(
    "https://example.com/thumb.jpg",
    "AI Demo",
    "Watch the demo"
)
.duration(180);

builder.add_url(
    UrlWithExtensions::new(UrlEntry::new("https://example.com/article"))
        .add_image(image)
        .add_video(video)
        .set_news(news)
);

let xml = builder.build().unwrap();

§6. Sitemap Index

use sitemap_generator::{SitemapIndexBuilder, SitemapIndexEntry};

let mut builder = SitemapIndexBuilder::new();

builder.add_sitemap(
    SitemapIndexEntry::new("https://example.com/sitemap1.xml.gz")
        .lastmod("2025-11-01")
);

builder.add_sitemap(
    SitemapIndexEntry::new("https://example.com/sitemap2.xml.gz")
        .lastmod("2025-11-02")
);

let xml = builder.build().unwrap();

§Web Framework Integration

Perfect for serving sitemaps from web applications:

use sitemap_generator::{SitemapBuilder, UrlEntry};

// For Axum, Actix-web, Rocket, etc.
async fn sitemap_handler() -> Vec<u8> {
    let mut builder = SitemapBuilder::new();
    builder.add_url(UrlEntry::new("https://example.com/"));

    // Return compressed bytes (saves 96-98% bandwidth)
    builder.build_compressed_bytes().unwrap()
}

§Performance

  • Generation Speed: ~830,000 URLs/second
  • Memory Usage: ~140 bytes per URL during generation, 0 bytes after
  • Compression: 96-98% bandwidth savings with gzip
  • Immediate Cleanup: RAII-based memory management (proven zero leaks)

§Validation

Automatic validation includes:

  • URL format (RFC 3986)
  • Max 50,000 URLs per standard sitemap (1,000 for news sitemaps)
  • Max 50MB uncompressed size
  • Date format (W3C Datetime)
  • Priority values (0.0 to 1.0)
  • Video duration, rating, title/description lengths
  • News language codes (ISO 639), stock tickers (max 5)

Disable validation if needed:

use sitemap_generator::SitemapBuilder;
let builder = SitemapBuilder::new().validate(false);

Re-exports§

pub use error::Error;
pub use error::Result;
pub use validator::Validator;
pub use parser::SitemapParser;
pub use types::*;
pub use builder::*;

Modules§

builder
Builder patterns for creating sitemaps
error
Error types for sitemap generation
parser
Parser for reading sitemap XML files
types
Type definitions for sitemap entries
validator
Validation utilities for sitemap entries
writer
XML writer for sitemaps