data-doctor-core 1.0.2

A powerful data validation and cleaning tool for JSON and CSV files

DataDoctor Core 🩺

DataDoctor Core is the foundational library powering the DataDoctor tools. It provides a robust, high-performance engine for validating, cleaning, and auto-fixing data in JSON and CSV formats.

✨ Features

  • JSON Validation: Detect and fix trailing commas, unquoted keys, missing commas, and more.
  • CSV Validation: Handle column mismatches, type validation, delimiter detection, and auto-padding/trimming.
  • Rule-Based Engine: Extensible system for defining data quality rules.
  • Type Safety: Built with Rust for safety and performance.
  • Zero-Config Option: Smart defaults for immediate results.
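To make the CSV bullets above more concrete, here is a minimal, standalone sketch of how delimiter detection and auto-padding/trimming could work. It uses only the standard library. The names `detect_delimiter` and `normalize_row` are hypothetical and are not part of the data-doctor-core API; the crate's own heuristics may differ, and this sketch ignores quoted fields.

```rust
/// Guess a CSV delimiter: keep candidates that occur the same non-zero
/// number of times on every non-empty line, then prefer the one that
/// yields the most columns.
/// Hypothetical helper for illustration; not part of data-doctor-core.
fn detect_delimiter(sample: &str) -> Option<char> {
    const CANDIDATES: [char; 4] = [',', ';', '\t', '|'];
    let lines: Vec<&str> = sample
        .lines()
        .filter(|l| !l.trim().is_empty())
        .collect();
    let first = lines.first()?;

    CANDIDATES
        .iter()
        .copied()
        .filter_map(|c| {
            let expected = first.matches(c).count();
            // Reject candidates that never occur or occur inconsistently.
            let consistent = expected > 0
                && lines.iter().all(|l| l.matches(c).count() == expected);
            consistent.then_some((c, expected))
        })
        .max_by_key(|&(_, count)| count)
        .map(|(c, _)| c)
}

/// Pad a short row with empty fields, or drop surplus fields, so that it
/// has exactly `width` columns.
/// Hypothetical helper for illustration; not part of data-doctor-core.
fn normalize_row(mut fields: Vec<String>, width: usize) -> Vec<String> {
    fields.resize(width, String::new());
    fields
}
```

Calling `detect_delimiter` on a semicolon-separated sample returns `Some(';')`, and `normalize_row` makes every record match the header width before further checks run.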

📦 Installation

Add this to your Cargo.toml:

[dependencies]
data-doctor-core = "1.0"

📖 Usage

Validating JSON

use data_doctor_core::json::JsonValidator;
use data_doctor_core::ValidationOptions;

fn main() {
    let json_data = r#"{ "name": "John", "age": 30, }"#; // Note: Trailing comma

    let mut options = ValidationOptions::default();
    options.auto_fix = true;

    let validator = JsonValidator::new();
    let (fixed_content, result) = validator.validate_and_fix(json_data, &options);

    if result.success {
        println!("Fixed JSON: {}", fixed_content);
        // Output: { "name": "John", "age": 30 }
    } else {
        println!("Issues found: {:?}", result.issues);
    }
}
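For readers curious about what a trailing-comma fix involves, the following is a rough, standalone sketch using only the standard library. `strip_trailing_commas` is a hypothetical name, not a function exported by data-doctor-core, and the crate's actual fixer may work differently. The sketch tracks string literals so that commas inside quoted values are left alone.

```rust
/// Remove a comma that directly precedes a closing `}` or `]`
/// (ignoring whitespace in between), leaving string contents untouched.
/// Hypothetical helper for illustration; not part of data-doctor-core.
fn strip_trailing_commas(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_string = false;
    let mut escaped = false;

    for ch in input.chars() {
        if in_string {
            // Copy string contents verbatim, honoring backslash escapes.
            out.push(ch);
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => {
                in_string = true;
                out.push(ch);
            }
            '}' | ']' => {
                // Look back past whitespace; drop the comma if it is the last token.
                let kept = out.trim_end().len();
                if out[..kept].ends_with(',') {
                    let tail = out[kept..].to_string();
                    out.truncate(kept - 1);
                    out.push_str(&tail);
                }
                out.push(ch);
            }
            _ => out.push(ch),
        }
    }
    out
}
```

Applied to the `json_data` string from the example above, this sketch yields `{ "name": "John", "age": 30 }`, with whitespace preserved.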

Validating CSV

use data_doctor_core::validate_csv_stream;
use data_doctor_core::ValidationOptions;

fn main() {
    let csv_data = "name,age\nJohn,30\nJane,twenty-five";
    
    let options = ValidationOptions::default();
    let result = validate_csv_stream(csv_data.as_bytes(), &options);

    println!("Total Records: {}", result.stats.total_records);
    println!("Invalid Records: {}", result.stats.invalid_records);
}
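The sample data above mixes an integer age (`30`) with a textual one (`twenty-five`). As an illustration of the kind of type check that could flag the second record, here is a small standalone sketch. `count_type_mismatches` is a hypothetical function, not part of data-doctor-core, and it rests on simplifying assumptions: comma delimiter, a header row, no quoted fields, and column types inferred from the first data record. Whether the crate reports this record as invalid under default options is not confirmed here.

```rust
/// Returns `(total_records, invalid_records)` for a comma-separated sample
/// with a header row. A column counts as an integer column when the first
/// record's field parses as `i64`; later records are invalid if they have
/// a different field count or a non-integer value in an integer column.
/// Hypothetical helper for illustration; not part of data-doctor-core.
fn count_type_mismatches(csv: &str) -> (usize, usize) {
    let mut rows = csv
        .lines()
        .skip(1) // skip the header
        .map(|line| line.split(',').collect::<Vec<_>>());

    let Some(first) = rows.next() else {
        return (0, 0);
    };
    let is_integer: Vec<bool> = first
        .iter()
        .map(|field| field.trim().parse::<i64>().is_ok())
        .collect();

    let mut total = 1;
    let mut invalid = 0;
    for row in rows {
        total += 1;
        let mismatch = row.len() != is_integer.len()
            || row
                .iter()
                .zip(&is_integer)
                .any(|(field, &int_col)| int_col && field.trim().parse::<i64>().is_err());
        if mismatch {
            invalid += 1;
        }
    }
    (total, invalid)
}
```

For the `csv_data` string above, this sketch returns `(2, 1)`: two records, one of which has a non-integer value in the `age` column.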

🤝 Contributing

This crate is part of the data-doctor workspace. Contributions are welcome on the GitHub repository.

📄 License

This project is licensed under the MIT License.