DataDoctor CLI 🩺
DataDoctor CLI is your command-line companion for maintaining data health. It brings the power of the DataDoctor engine directly to your terminal, allowing you to validate, analyze, and repair JSON and CSV files instantly.
🚀 Installation
Option 1: Install from Crates.io (Recommended)
If you have Rust installed, this is the easiest way:
This installs the `data-doctor` binary on your `PATH`.
Option 2: Build from Source
🎮 How It Works
DataDoctor provides three primary modes of operation, designed for different workflows:
1. validate (The Checkup)
Best for: CI/CD pipelines, pre-commit hooks, or just checking file integrity.
This command scans your file and reports issues without modifying anything. It returns a non-zero exit code if errors are found, making it perfect for automated scripts.
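As a rough illustration of how the exit code can gate an automated script, here is a minimal Python sketch. The binary name `data-doctor` and the `validate` subcommand come from this README, but the argument order `data-doctor validate <FILE>` is an assumption, and `failing_files` is a hypothetical helper name.

```python
import subprocess

# Assumption: the file path follows the subcommand. Adjust to the real usage.
DEFAULT_COMMAND = ["data-doctor", "validate"]


def failing_files(paths, command=DEFAULT_COMMAND):
    """Return the paths for which the checker exits with a non-zero code."""
    failed = []
    for path in paths:
        # capture_output keeps the checker's report out of our own stdout
        result = subprocess.run([*command, path], capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(path)
    return failed
```

A pre-commit hook or CI step could call `failing_files(...)` on the changed files and exit with status 1 when the returned list is not empty. Note that a non-zero exit can also mean the command itself could not run, so printing the captured output is worthwhile in practice.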
2. fix (The Surgery)
Best for: Cleaning messy data dumps, fixing "broken" JSON from APIs.
This command actively repairs the file and saves the clean version to a new output path. It applies all available auto-fix strategies (e.g., adding missing quotes, padding columns).
3. doctor (The Full Treatment)
Best for: Interactive analysis and reporting.
This runs a validation pass, then an auto-fix pass, and generates a comprehensive report comparing the "before" and "after" states.
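The validate, fix, compare flow described above can be pictured with a small Python sketch. It is not the DataDoctor engine: `validate`, `fix`, and `doctor` are hypothetical stand-ins, validation is reduced to a JSON syntax check, and the repair handles only trailing commas in a deliberately naive way.

```python
import json


def validate(text):
    """Return a list of problems; this stand-in only checks JSON syntax."""
    try:
        json.loads(text)
        return []
    except json.JSONDecodeError as err:
        return [f"line {err.lineno}, column {err.colno}: {err.msg}"]


def fix(text):
    """Naive stand-in repair: drop a comma directly before a closer.

    This would also alter string values containing ',}' or ',]',
    so it is for illustration only.
    """
    return text.replace(",}", "}").replace(",]", "]")


def doctor(text):
    """Validate, repair, validate again, and report both states."""
    issues_before = validate(text)
    fixed = fix(text)
    issues_after = validate(fixed)
    return {
        "issues_before": issues_before,
        "issues_after": issues_after,
        "fixed": fixed,
    }
```

Calling `doctor('{"a": 1,}')` yields one issue before the repair and none after it, which is the kind of before/after comparison the report is meant to convey.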
📋 Command Reference
validate
Options:
- `--format <json|csv>`: Force a specific file format (overrides extension detection).
- `--report-json`: Print a machine-readable JSON object instead of the human-readable report.
- `--schema <FILE>`: Validate against a custom schema definition.
fix
Options:
- `--out <FILE>`: (Required) Where to save the fixed file.
- `--format <json|csv>`: Force a specific file format.
doctor
Combines validate and fix functionalities with detailed logging.
🔍 What Can It Fix?
JSON Fixes (Advanced)
| Issue | Example (Before) | Example (After) |
|---|---|---|
| Broken Structure | `[ { "a": 1 } }` | `[ { "a": 1 } ]` (Mismatched bracket fix) |
| Embedded Keys | `"desc": "val,"key": "v"` | `"desc": "val", "key": "v"` |
| Numeric Formats | `{"val": 0xFF, "oct": 0o77}` | `{"val": 255, "oct": 63}` |
| Invalid Booleans | `{"active": yes}` | `{"active": true}` |
| Leading Zeros | `{"id": 030}` | `{"id": 30}` |
| Trailing Commas | `{"a": 1,}` | `{"a": 1}` |
| Missing Commas | `{"a": 1 "b": 2}` | `{"a": 1, "b": 2}` |
| Unquoted Keys | `{name: "John"}` | `{"name": "John"}` |
| Single Quotes | `{'name': 'John'}` | `{"name": "John"}` |
| Unclosed Brackets | `[1, 2, 3` | `[1, 2, 3]` |
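To give a feel for what two of these repairs involve, here is a simplified Python sketch of the "Trailing Commas" and "Unclosed Brackets" fixes. It is an illustration under assumptions, not DataDoctor's actual implementation: the function names are hypothetical, and the only subtlety it handles is skipping over string contents.

```python
def strip_trailing_commas(text):
    """Remove a comma that sits directly before a closing '}' or ']'."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "}]":
            # Look back past whitespace; drop the comma if one is there.
            i = len(out) - 1
            while i >= 0 and out[i].isspace():
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i:]
        out.append(ch)
    return "".join(out)


def close_unclosed(text):
    """Append the closers that are still missing at the end of the input."""
    closers = {"[": "]", "{": "}"}
    stack = []  # closers we still expect, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
    return text + "".join(reversed(stack))
```

Both functions track whether the scanner is inside a string literal, so a `,}` or `]` that appears inside a value is left alone.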
CSV Fixes
| Issue | Before | After |
|---|---|---|
| Padding Columns | `A,B,C`<br>`1,2` | `A,B,C`<br>`1,2,` (Empty added) |
| Trimming Cols | `A,B`<br>`1,2,3,4` | `A,B`<br>`1,2` (Extras removed) |
| Booleans | `Yes, No` | `true, false` |
| Whitespace | `  Value  ` (surrounding spaces) | `Value` |
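The padding, trimming, and whitespace rows can be sketched in a few lines of Python. This is a minimal illustration that assumes the first row is the header and defines the expected column count; `normalize_rows` is a hypothetical name, and the real tool may decide the width differently.

```python
import csv
import io


def normalize_rows(text):
    """Pad or trim every row to the header's width and strip cell whitespace."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    width = len(rows[0])  # assumption: the header row sets the column count
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        cells = [cell.strip() for cell in row]
        # Slice off extras, then add empty cells if the row is short.
        cells = cells[:width] + [""] * (width - len(cells))
        writer.writerow(cells)
    return out.getvalue()
```

For example, `normalize_rows("A,B,C\n1,2\n")` pads the short row with one empty cell, while `normalize_rows("A,B\n1,2,3,4\n")` drops the two extra cells.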
📊 JSON Reports
For integration with other tools (like dashboards), use --report-json.
Command:
Output:
📄 License
This project is licensed under the MIT License.