universal-date-parser 1.0.0

Universal date parser that can parse any date format into standardized output with intelligent format detection
Documentation

Universal Date Parser

Crates.io Documentation License

A comprehensive, high-performance date parsing library that can intelligently parse dates from virtually any format into standardized output. Features automatic format detection, timezone awareness, robust error handling, and multi-language bindings.

โœจ Features

  • Universal Format Support: Parse dates from ISO 8601, RFC 2822, Unix timestamps, US/European formats, and more
  • Intelligent Detection: Automatic format recognition with confidence scoring
  • Timezone Awareness: Flexible timezone handling strategies
  • Ambiguity Resolution: Detect and handle ambiguous date formats (MM/DD vs DD/MM)
  • High Performance: Optimized for speed with comprehensive benchmarks
  • Multi-Language Bindings: C FFI and WebAssembly exports
  • Fuzzy Parsing: Handle malformed or non-standard date inputs
  • Extensive Configuration: Customizable parsing behavior
  • Robust Error Handling: Detailed error messages and recovery strategies

๐Ÿš€ Quick Start

Add to your Cargo.toml:

[dependencies]
universal-date-parser = "1.0"

Basic Usage

use universal_date_parser::{UniversalDateParser, ParserConfig};

fn main() {
    let parser = UniversalDateParser::new();
    
    // Parse various date formats
    let result1 = parser.parse("2023-12-25T15:30:45Z").unwrap();
    println!("ISO 8601: {}", result1.datetime);
    
    let result2 = parser.parse("12/25/2023").unwrap();
    println!("US Format: {}", result2.datetime);
    
    let result3 = parser.parse("1703520645").unwrap();
    println!("Unix Timestamp: {}", result3.datetime);
    
    // Check format detection
    println!("Detected format: {}", result1.detected_format);
    println!("Confidence: {:.2}", result1.confidence);
}

Advanced Configuration

use universal_date_parser::{UniversalDateParser, ParserConfig, TimezoneStrategy};
use chrono::FixedOffset;

fn main() {
    let config = ParserConfig {
        prefer_dmy: true,  // Prefer DD/MM over MM/DD
        strict_mode: false,
        fuzzy_parsing: true,
        timezone_strategy: TimezoneStrategy::UseDefault(
            FixedOffset::east_opt(3600).unwrap() // +01:00
        ),
        ..Default::default()
    };
    
    let parser = UniversalDateParser::with_config(config);
    
    let result = parser.parse("01/02/2023").unwrap();
    println!("Parsed as DD/MM: {}", result.datetime);
    
    if result.metadata.is_ambiguous {
        println!("Alternative interpretations:");
        for alt in &result.metadata.alternatives {
            println!("  - {}", alt);
        }
    }
}

๐Ÿ“… Supported Formats

Standard Formats

  • ISO 8601: 2023-12-25T15:30:45Z, 2023-12-25T15:30:45+02:00
  • RFC 2822: Mon, 25 Dec 2023 15:30:45 +0000
  • Unix Timestamps: 1703520645, 1703520645000 (with milliseconds)

Regional Formats

  • US Format: MM/DD/YYYY โ†’ 12/25/2023
  • European Format: DD/MM/YYYY โ†’ 25/12/2023
  • Flexible Separators: MM-DD-YYYY, DD.MM.YYYY

Natural Language

  • Long Format: December 25, 2023, 25 December 2023
  • Relative Dates: today, yesterday, tomorrow

Custom Formats

  • Configurable Patterns: Add your own regex patterns
  • Format Priority: Control parsing order and preferences

๐Ÿ”ง Configuration Options

pub struct ParserConfig {
    /// Prefer DD/MM over MM/DD when ambiguous
    pub prefer_dmy: bool,
    
    /// Default year when not specified
    pub default_year: Option<i32>,
    
    /// Strict parsing (reject ambiguous dates)
    pub strict_mode: bool,
    
    /// Enable fuzzy parsing for malformed input
    pub fuzzy_parsing: bool,
    
    /// Custom format patterns to try first
    pub custom_patterns: Vec<String>,
    
    /// Timezone handling strategy
    pub timezone_strategy: TimezoneStrategy,
}

Timezone Strategies

pub enum TimezoneStrategy {
    AssumeUtc,           // Default to UTC
    AssumeLocal,         // Use system timezone
    UseDefault(FixedOffset), // Custom default timezone
    RequireTimezone,     // Reject dates without timezone
}

๐ŸŒ Multi-Language Support

C/C++ Integration

#include "universal_date_parser.h"

int main() {
    char* result = parse_date_c("2023-12-25T15:30:45Z");
    if (result != NULL) {
        printf("Parsed: %s\n", result);
        free_string_c(result);
    }
    return 0;
}

WebAssembly/JavaScript

import init, { parse_date_wasm } from './pkg/universal_date_parser.js';

async function parseDate() {
    await init();
    
    const result = parse_date_wasm("2023-12-25T15:30:45Z");
    const parsed = JSON.parse(result);
    
    console.log("Parsed datetime:", parsed.datetime);
    console.log("Detected format:", parsed.detected_format);
    console.log("Confidence:", parsed.confidence);
}

๐Ÿ“Š Performance

Based on comprehensive benchmarks:

Date Parsing Performance:
โ”œโ”€โ”€ ISO 8601 DateTime:    ~245 ns   (4.08M parses/sec)
โ”œโ”€โ”€ US Format:            ~189 ns   (5.29M parses/sec)
โ”œโ”€โ”€ Unix Timestamp:       ~156 ns   (6.41M parses/sec)
โ”œโ”€โ”€ European Format:      ~198 ns   (5.05M parses/sec)
โ””โ”€โ”€ Bulk Parsing (10):    ~1.85 ฮผs  (540K batches/sec)

Configuration Impact:
โ”œโ”€โ”€ Default Mode:         ~189 ns
โ”œโ”€โ”€ Strict Mode:          ~167 ns   (12% faster)
โ””โ”€โ”€ Fuzzy Mode:           ~234 ns   (24% slower)

Run benchmarks yourself:

cargo bench

๐Ÿ› ๏ธ Error Handling

The parser provides detailed error information:

use universal_date_parser::ParseError;

match parser.parse("invalid date") {
    Ok(result) => println!("Parsed: {}", result.datetime),
    Err(ParseError::UnrecognizedFormat(input)) => {
        println!("Could not recognize format: {}", input);
    },
    Err(ParseError::AmbiguousDate(input, alternatives)) => {
        println!("Ambiguous date: {}", input);
        println!("Possible interpretations: {:?}", alternatives);
    },
    Err(ParseError::InvalidDate(msg)) => {
        println!("Invalid date: {}", msg);
    },
    Err(err) => println!("Other error: {:?}", err),
}

๐Ÿงช Testing

Run the comprehensive test suite:

# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test
cargo test test_iso8601_parsing

๐Ÿ“š Examples

Handling Multiple Possibilities

let parser = UniversalDateParser::new();
let possibilities = parser.parse_all_possibilities("01/02/2023");

for (i, result) in possibilities.iter().enumerate() {
    match result {
        Ok(parsed) => {
            println!("Possibility {}: {} ({})", 
                i + 1, 
                parsed.datetime, 
                parsed.detected_format
            );
        }
        Err(e) => println!("Failed to parse: {:?}", e),
    }
}

Custom Format Patterns

let config = ParserConfig {
    custom_patterns: vec![
        r"(\d{2})-(\d{2})-(\d{4})".to_string(), // DD-MM-YYYY
    ],
    ..Default::default()
};

let parser = UniversalDateParser::with_config(config);
let result = parser.parse("25-12-2023").unwrap();

Batch Processing

let parser = UniversalDateParser::new();
let dates = vec![
    "2023-01-01",
    "02/14/2023", 
    "March 15, 2023",
    "1672531200",
];

let results: Vec<_> = dates
    .iter()
    .map(|&date| parser.parse(date))
    .collect();

for (input, result) in dates.iter().zip(results.iter()) {
    match result {
        Ok(parsed) => println!("{} โ†’ {}", input, parsed.datetime),
        Err(e) => println!("{} โ†’ Error: {:?}", input, e),
    }
}

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

๐Ÿ“ License

This project is licensed under either of

at your option.

๐Ÿ”— Links