url-preview 0.4.0

High-performance URL preview generator for messaging and social media applications
Documentation

Crates.io Documentation License: MIT

URL Preview

A high-performance Rust library for generating rich URL previews with specialized support for Twitter/X and GitHub. This library offers efficient caching, concurrent processing, comprehensive metadata extraction, and detailed error reporting.

What's New in v0.4.0

  • Enhanced Error Handling: New specific error types for better error differentiation
  • Invalid URL Detection: Properly detects and reports 404s and invalid resources
  • Improved Performance: Refactored internals for better concurrent processing
  • Better Feature Management: Fixed compilation issues with optional features

Features

  • High Performance: Optimized for speed with concurrent processing and batch operations
  • Smart Caching: Built-in DashMap-based caching system for lightning-fast responses
  • Platform-Specific Handlers:
    • Twitter/X: Specialized handling with oEmbed support
    • GitHub: Enhanced repository information extraction with API integration
  • Flexible Configuration:
    • Customizable HTTP clients with different configurations
    • Adjustable concurrent request limits
    • Configurable cache sizes and strategies
  • Rich Metadata Extraction:
    • Title, description, and images
    • Open Graph and Twitter Card metadata
    • Favicons and site information
  • Advanced Error Handling:
    • Specific error types for DNS, timeout, and HTTP errors
    • Detailed error messages for debugging
    • Proper 404 and invalid resource detection
  • Modern Rust Features:
    • Async/await with Tokio
    • Thread-safe components
    • Zero-cost abstractions

Installation

Add this to your Cargo.toml:

[dependencies]
url-preview = "0.4.0"

# Optional features
url-preview = { version = "0.4.0", features = ["full"] }

# Or select specific features
url-preview = { version = "0.4.0", features = ["cache", "logging", "github", "twitter"] }

Feature Flags

  • default: Basic functionality with default reqwest features
  • cache: Enable caching support with DashMap
  • logging: Enable structured logging with tracing
  • github: Enable GitHub-specific preview enhancements
  • twitter: Enable Twitter/X oEmbed integration
  • full: Enable all features

Quick Start

Here's a simple example to get started:

use url_preview::{PreviewService, Preview, PreviewError};

#[tokio::main]
async fn main() -> Result<(), PreviewError> {
    // Create a preview service with default settings
    let preview_service = PreviewService::new();

    // Generate a preview
    let preview = preview_service
        .generate_preview("https://www.rust-lang.org")
        .await?;

    println!("Title: {:?}", preview.title);
    println!("Description: {:?}", preview.description);
    println!("Image: {:?}", preview.image_url);

    Ok(())
}

Advanced Usage

Error Handling

The library now provides detailed error types for better error handling:

use url_preview::{PreviewService, PreviewError};

#[tokio::main]
async fn main() {
    let service = PreviewService::new();
    
    match service.generate_preview("https://example.com/404").await {
        Ok(preview) => println!("Got preview: {:?}", preview.title),
        Err(PreviewError::NotFound(msg)) => {
            println!("Resource not found: {}", msg);
        }
        Err(PreviewError::DnsError(msg)) => {
            println!("DNS resolution failed: {}", msg);
        }
        Err(PreviewError::TimeoutError(msg)) => {
            println!("Request timed out: {}", msg);
        }
        Err(PreviewError::ServerError { status, message }) => {
            println!("Server error ({}): {}", status, message);
        }
        Err(PreviewError::ClientError { status, message }) => {
            println!("Client error ({}): {}", status, message);
        }
        Err(e) => println!("Other error: {}", e),
    }
}

Batch Processing

Process multiple URLs efficiently:

use url_preview::PreviewService;
use futures::future::join_all;

#[tokio::main]
async fn main() {
    let service = PreviewService::new();
    let urls = vec![
        "https://www.rust-lang.org",
        "https://github.com/rust-lang/rust",
        "https://news.ycombinator.com",
    ];

    // Concurrent processing with proper error handling
    let results = join_all(
        urls.iter().map(|url| service.generate_preview(url))
    ).await;

    for (url, result) in urls.iter().zip(results.iter()) {
        match result {
            Ok(preview) => println!("{}: {:?}", url, preview.title),
            Err(e) => println!("{}: Error - {}", url, e),
        }
    }
}

Custom Configuration

Configure the service with specific requirements:

use url_preview::{PreviewService, PreviewServiceConfig, Fetcher, FetcherConfig};
use std::time::Duration;

let config = PreviewServiceConfig::new(1000) // Cache capacity
    .with_max_concurrent_requests(20)
    .with_default_fetcher(
        Fetcher::new_with_config(FetcherConfig {
            timeout: Duration::from_secs(30),
            user_agent: "my-app/1.0".into(),
            ..Default::default()
        })
    );

let service = PreviewService::new_with_config(config);

Specialized Platform Support

Twitter/X Integration

#[cfg(feature = "twitter")]
{
    let preview = service
        .generate_preview("https://x.com/username/status/123456789")
        .await?;

    // Twitter previews include embedded content
    println!("Tweet content: {:?}", preview.description);
}

GitHub Repository Information

#[cfg(feature = "github")]
{
    // Basic preview
    let preview = service
        .generate_preview("https://github.com/rust-lang/rust")
        .await?;

    // Detailed repository information
    if let Ok(details) = service
        .get_github_detailed_info("https://github.com/rust-lang/rust")
        .await
    {
        println!("Stars: {}", details.stars_count);
        println!("Forks: {}", details.forks_count);
        println!("Language: {}", details.language);
        println!("Open Issues: {}", details.open_issues);
    }
}

Caching Strategies

Control caching behavior for different use cases:

use url_preview::{PreviewService, CacheStrategy};

// Service with caching enabled (default)
let cached_service = PreviewService::new();

// Service with caching disabled
let no_cache_service = PreviewService::no_cache();

// Manual cache operations if needed
#[cfg(feature = "cache")]
{
    let preview = cached_service.generate_preview(url).await?;
    
    // Check if URL is in cache
    let cached = cached_service.default_generator.cache.get(url).await;
}

Logging Configuration

Configure comprehensive logging:

#[cfg(feature = "logging")]
{
    use url_preview::{setup_logging, LogConfig};
    use std::path::PathBuf;

    let log_config = LogConfig {
        log_dir: PathBuf::from("logs"),
        log_level: "info".into(),
        console_output: true,
        file_output: true,
    };

    setup_logging(log_config);
}

Performance Optimization

Concurrent Request Limiting

Control the number of concurrent requests to prevent resource exhaustion:

use url_preview::{PreviewServiceConfig, MAX_CONCURRENT_REQUESTS};

let config = PreviewServiceConfig::new(1000)
    .with_max_concurrent_requests(50); // Default is MAX_CONCURRENT_REQUESTS (500)

let service = PreviewService::new_with_config(config);

Retry Strategy

The library automatically retries on server errors and timeouts:

// The fetcher will automatically retry up to 3 times for:
// - Server errors (5xx)
// - Timeouts
// - Connection errors
// But NOT for client errors (4xx) or DNS failures

Examples

Check the examples/ directory for more comprehensive examples:

  • url_preview.rs - Basic usage and caching demonstration
  • github_preview.rs - GitHub-specific features
  • twitter_preview.rs - Twitter/X integration
  • batch_concurrent.rs - Batch processing examples
  • test_invalid_urls.rs - Error handling examples

Run examples with:

cargo run --example url_preview
cargo run --example github_preview --features github
cargo run --example twitter_preview --features twitter

Benchmarks

The library includes comprehensive benchmarks:

cargo bench

Testing

Run the test suite:

cargo test
cargo test --all-features  # Test with all features enabled

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

Development Setup

  1. Clone the repository
  2. Install dependencies: cargo build
  3. Run tests: cargo test
  4. Run benchmarks: cargo bench
  5. Format code: cargo fmt
  6. Run clippy: cargo clippy -- -D warnings

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Testing Websites for Rich Results

Google provides the Rich Results Analysis Tool to help you validate your website's tags.

Use this tool to ensure your website follows the conventions for optimal preview generation.