# DataDoctor CLI 🩺
**DataDoctor CLI** is your command-line companion for maintaining data health. It brings the power of the DataDoctor engine directly to your terminal, allowing you to validate, analyze, and repair JSON and CSV files instantly.
---
## 🚀 Installation
### Option 1: Install from Crates.io (Recommended)
If you have Rust installed, this is the easiest way:
```bash
cargo install data-doctor-cli
```
*This installs the `data-doctor` binary into Cargo's bin directory (typically `~/.cargo/bin`), which needs to be on your `PATH`.*
### Option 2: Build from Source
```bash
git clone https://github.com/jeevanms003/data-doctor.git
cd data-doctor
cargo install --path cli
```
---
## 🎮 How It Works
DataDoctor provides three primary modes of operation, designed for different workflows:
### 1. `validate` (The Checkup)
**Best for:** CI/CD pipelines, pre-commit hooks, or just checking file integrity.
This command scans your file and reports issues without modifying anything. It returns a non-zero exit code if errors are found, making it perfect for automated scripts.
```bash
data-doctor validate users.csv
```
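Because `validate` signals failure through its exit code, a CI step only needs to inspect that status. The following is a minimal sketch, assuming `data-doctor` is on your `PATH`; `failing_files` and its `command` parameter are hypothetical names used for illustration and are not part of DataDoctor.

```python
import subprocess


def failing_files(paths, command=("data-doctor",)):
    """Return the subset of `paths` for which `validate` exits non-zero.

    `command` is the program prefix to run (an assumption of this sketch);
    it is overridable so the wrapper can be exercised without the real binary.
    """
    failed = []
    for path in paths:
        # Only the exit status is inspected; the report is printed as usual.
        result = subprocess.run([*command, "validate", path])
        if result.returncode != 0:
            failed.append(path)
    return failed
```

A CI job could then call `sys.exit(1 if failing_files(sys.argv[1:]) else 0)` so that the job fails whenever any file fails validation.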
### 2. `fix` (The Surgery)
**Best for:** Cleaning messy data dumps, fixing "broken" JSON from APIs.
This command actively repairs the file and saves the clean version to a new output path. It applies all available auto-fix strategies (e.g., adding missing quotes, padding columns).
```bash
data-doctor fix broken_data.json --out clean_data.json
```
### 3. `doctor` (The Full Treatment)
**Best for:** Interactive analysis and reporting.
This runs a validation pass, then an auto-fix pass, and generates a comprehensive report comparing the "before" and "after" states.
```bash
data-doctor doctor input.csv --out fixed.csv
```
---
## 📋 Command Reference
### `validate`
```bash
data-doctor validate <INPUT> [OPTIONS]
```
**Options:**
- `--format <json|csv>`: Force a specific file format (overrides extension detection).
- `--report-json`: Print a machine-readable JSON object instead of the human-readable report.
- `--schema <FILE>`: Validate against a custom schema definition.
### `fix`
```bash
data-doctor fix <INPUT> --out <OUTPUT> [OPTIONS]
```
**Options:**
- `--out <FILE>`: (Required) Where to save the fixed file.
- `--format <json|csv>`: Force a specific file format.
### `doctor`
```bash
data-doctor doctor <INPUT> --out <OUTPUT> [OPTIONS]
```
Combines `validate` and `fix` in a single run, with detailed logging.
---
## 🔍 What Can It Fix?
### JSON Fixes (Advanced)
| Issue | Before | After |
| --- | --- | --- |
| **Broken Structure** | `[ { "a": 1 } }` | `[ { "a": 1 } ]` (Mismatched bracket fix) |
| **Embedded Keys** | `"desc": "val,"key": "v"` | `"desc": "val", "key": "v"` |
| **Numeric Formats** | `{"val": 0xFF, "oct": 0o77}` | `{"val": 255, "oct": 63}` |
| **Invalid Booleans** | `{"active": yes}` | `{"active": true}` |
| **Leading Zeros** | `{"id": 030}` | `{"id": 30}` |
| **Trailing Commas** | `{"a": 1,}` | `{"a": 1}` |
| **Missing Commas** | `{"a": 1 "b": 2}` | `{"a": 1, "b": 2}` |
| **Unquoted Keys** | `{name: "John"}` | `{"name": "John"}` |
| **Single Quotes** | `{'name': 'John'}` | `{"name": "John"}` |
| **Unclosed Brackets** | `[1, 2, 3` | `[1, 2, 3]` |
### CSV Fixes
| Issue | Before | After |
| --- | --- | --- |
| **Padding Columns** | `A,B,C`<br>`1,2` | `A,B,C`<br>`1,2,` (Empty added) |
| **Trimming Columns** | `A,B`<br>`1,2,3,4` | `A,B`<br>`1,2` (Extras removed) |
| **Booleans** | `Yes, No` | `true, false` |
| **Whitespace** | ` Value ` | `Value` |
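
To make the CSV fixes listed above concrete, here is a rough Python approximation of the same four transformations. It is not DataDoctor's implementation (the engine is a Rust crate); `normalize_row` and `normalize_csv` are hypothetical helpers, and the sketch assumes the first row is the header, that the header defines the expected column count, and that only `yes`/`no` are mapped to booleans.

```python
import csv
import io

# Assumption: only these two spellings are treated as booleans in this sketch.
BOOLEAN_WORDS = {"yes": "true", "no": "false"}


def normalize_row(row, width):
    """Trim whitespace, map yes/no to true/false, then pad or cut to `width`."""
    cells = [cell.strip() for cell in row]
    cells = [BOOLEAN_WORDS.get(cell.lower(), cell) for cell in cells]
    # Slicing drops extra columns; the multiplication pads short rows
    # (a negative count yields an empty list, so long rows are unaffected).
    return cells[:width] + [""] * (width - len(cells))


def normalize_csv(text):
    """Apply `normalize_row` to every data row, using the header's width."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header = [cell.strip() for cell in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in rows[1:]:
        writer.writerow(normalize_row(row, len(header)))
    return out.getvalue()
```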
---
## 📊 JSON Reports
For integration with other tools (like dashboards), use `--report-json`.
**Command:**
```bash
data-doctor validate data.csv --report-json
```
**Output:**
```json
{
  "success": false,
  "total_records": 100,
  "invalid_records": 5,
  "issues": [
    {
      "severity": "Error",
      "code": "CSV_TYPE_MISMATCH",
      "message": "Invalid Integer value",
      "row": 42,
      "column": 2
    }
  ]
}
```
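A downstream tool might condense such a report into a few dashboard-friendly numbers. The sketch below relies only on the field names visible in the sample output above; the real report may contain more fields, and `summarize_report` is a hypothetical helper rather than part of DataDoctor. The report text could be captured from `data-doctor validate data.csv --report-json` with any process runner.

```python
import json
from collections import Counter


def summarize_report(report_text):
    """Condense a `--report-json` document into a small summary dict.

    Assumes the keys shown in the sample output: success, total_records,
    invalid_records, and issues[] entries with severity, code, and row.
    """
    report = json.loads(report_text)
    total = report["total_records"]
    invalid = report["invalid_records"]
    issues = report.get("issues", [])
    return {
        "success": report["success"],
        # Guard against an empty file, where total_records would be 0.
        "invalid_ratio": invalid / total if total else 0.0,
        "issues_by_code": dict(Counter(issue["code"] for issue in issues)),
        "error_rows": sorted(
            {
                issue["row"]
                for issue in issues
                if issue.get("severity") == "Error" and "row" in issue
            }
        ),
    }
```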
---
## 📄 License
This project is licensed under the MIT License.