dkit 0.6.0

# dkit

**Swiss army knife for data format conversion and querying.**

Convert between JSON, JSONL, CSV, TSV, YAML, TOML, XML, and MessagePack with a single CLI. Query nested data, compare files, preview data as tables, and chain commands through pipes.

## Quick Start

```bash
# Install
cargo install dkit

# Convert JSON to YAML
dkit convert data.json --to yaml

# Query nested data
dkit query config.yaml '.database.host'

# Preview CSV as a table
dkit view users.csv --limit 10
```

## Installation

### From crates.io

```bash
cargo install dkit
```

### From source

```bash
git clone https://github.com/syangkkim/dkit.git
cd dkit
cargo install --path .
```

## Supported Formats

| Format      | Extensions             | Read | Write |
|-------------|------------------------|------|-------|
| JSON        | `.json`                | O    | O     |
| JSONL       | `.jsonl`, `.ndjson`    | O    | O     |
| CSV         | `.csv`                 | O    | O     |
| TSV         | `.tsv`                 | O    | O     |
| YAML        | `.yaml`, `.yml`        | O    | O     |
| TOML        | `.toml`                | O    | O     |
| XML         | `.xml`                 | O    | O     |
| MessagePack | `.msgpack`             | O    | O     |
| Excel       | `.xlsx`                | O    | -     |
| SQLite      | `.db`, `.sqlite`       | O    | -     |
| Markdown    | `.md`                  | -    | O     |
| HTML        |                        | -    | O     |

Any format marked Read can be converted to any format marked Write. Excel and SQLite are input-only. Markdown and HTML are output-only and render the data as a table.
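
The read/write rule above can be sketched as a small lookup. This is only an illustration in Python built from the table, not dkit's implementation; the format labels (`xlsx`, `sqlite`, `msgpack`) and the `can_convert` helper are hypothetical names chosen for the example and are not necessarily valid `--to` values.

```python
# Read/write capabilities copied from the Supported Formats table.
# The labels are illustrative; they are not guaranteed to match dkit's flag values.
READABLE = {"json", "jsonl", "csv", "tsv", "yaml", "toml", "xml", "msgpack", "xlsx", "sqlite"}
WRITABLE = {"json", "jsonl", "csv", "tsv", "yaml", "toml", "xml", "msgpack", "md", "html"}


def can_convert(source: str, target: str) -> bool:
    """Return True when `source` can be read and `target` can be written."""
    return source in READABLE and target in WRITABLE


print(can_convert("xlsx", "json"))  # Excel is input-only, so this direction works
print(can_convert("json", "xlsx"))  # no Excel writer in the table
print(can_convert("md", "json"))    # Markdown is output-only
```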

## Commands

### `convert` — Format conversion

```bash
# Basic conversion
dkit convert data.json --to yaml
dkit convert users.csv --to json
dkit convert config.yaml --to toml
dkit convert config.toml --to json

# XML conversion
dkit convert config.xml --to json
dkit convert data.json --to xml
dkit convert config.xml --to yaml

# JSONL (JSON Lines) conversion
dkit convert users.json --to jsonl              # JSON array → one object per line
dkit convert users.jsonl --to json              # JSONL → JSON array
dkit convert logs.jsonl --to csv                # JSONL → CSV

# Output to file
dkit convert data.json --to csv -o output.csv

# Batch conversion
dkit convert *.csv --to json --outdir ./converted/

# Pipe from stdin
cat data.json | dkit convert --from json --to csv
cat logs.jsonl | dkit convert --from jsonl --to json

# Options
dkit convert data.json --to json --compact     # Minified JSON
dkit convert data.tsv --to json --delimiter '\t'  # TSV input
dkit convert data.csv --to json --no-header    # CSV without header
dkit convert data.json --to xml --root-element users  # Custom XML root element

# Markdown/HTML table output
dkit convert data.json --to md                 # GFM Markdown table
dkit convert data.csv --to html                # HTML table
dkit convert data.json --to html --styled      # HTML with inline CSS
dkit convert data.json --to html --full-html   # Complete HTML document
dkit convert data.json --to html --styled --full-html  # Styled full document

# Excel (.xlsx) input
dkit convert data.xlsx --to json                         # Convert Excel to JSON
dkit convert data.xlsx --to csv --sheet Products         # Specific sheet by name
dkit convert data.xlsx --to yaml --sheet 1               # Specific sheet by index
dkit view data.xlsx --list-sheets                        # List available sheets

# SQLite (.db, .sqlite) input
dkit convert data.db --to json                           # Convert SQLite to JSON
dkit convert data.db --to csv --table users              # Specific table
dkit convert data.db --to json --sql "SELECT * FROM users WHERE age > 25"  # Custom SQL
dkit view data.db --list-tables                          # List available tables

# Encoding support
dkit convert data.csv --to json --encoding euc-kr       # EUC-KR input
dkit convert data.csv --to json --encoding shift_jis     # Shift-JIS input
dkit convert data.csv --to json --detect-encoding        # Auto-detect encoding
```
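
To make the JSON and JSONL examples above concrete, here is a minimal Python sketch of what "JSON array → one object per line" and the reverse mean. It illustrates the data shapes only and assumes nothing about how dkit implements the conversion; both function names are hypothetical.

```python
import json


def json_array_to_jsonl(text: str) -> str:
    """Turn a top-level JSON array into JSON Lines (one value per line)."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


def jsonl_to_json_array(text: str) -> str:
    """Collect every non-blank JSON Lines row into one pretty-printed JSON array."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    return json.dumps(records, ensure_ascii=False, indent=2)


jsonl = json_array_to_jsonl('[{"id": 1}, {"id": 2}]')
print(jsonl, end="")
print(jsonl_to_json_array(jsonl))
```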

### `query` — Data querying

```bash
# Field access
dkit query config.yaml '.database.host'
dkit query config.toml '.server.port'

# Nested path
dkit query data.json '.users[0].name'

# Array iteration
dkit query data.json '.users[].email'

# Negative indexing
dkit query data.json '.items[-1]'
```
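
The path expressions above can be read as a sequence of steps applied to the parsed data. The following Python sketch shows one way to interpret field access, indexing, negative indexing, and `[]` iteration. It is a simplified model for illustration, not dkit's parser, and `eval_path` is a hypothetical helper that always returns a list of matches.

```python
import re

# One token per step: .field, [N] (possibly negative), [], or a bare "." before "[".
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(-?\d+)\]|\[\]|\.(?=\[)")


def eval_path(data, path):
    """Apply each path step to every current match and return the final matches."""
    results = [data]
    pos = 0
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unexpected syntax at position {pos}: {path[pos:]!r}")
        field, index = match.group(1), match.group(2)
        if field is not None:
            results = [item[field] for item in results]        # .field
        elif index is not None:
            results = [item[int(index)] for item in results]   # [0], [-1]
        elif match.group(0) == "[]":
            results = [child for item in results for child in item]  # iterate
        # A bare "." (as in ".[0]") does not change the current matches.
        pos = match.end()
    return results


data = {"users": [{"name": "Kim", "email": "kim@example.com"},
                  {"name": "Lee", "email": "lee@example.com"}],
        "items": [10, 20, 30]}

print(eval_path(data, ".users[0].name"))
print(eval_path(data, ".users[].email"))
print(eval_path(data, ".items[-1]"))
```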

**Query syntax:**

| Syntax | Description |
|--------|-------------|
| `.field` | Object field access |
| `.field.sub` | Nested field access |
| `.[0]` | Array index (0-based) |
| `.[-1]` | Negative index (from end) |
| `.[]` | Iterate all elements |
| `where .field == value` | Filter with comparison (`==`, `!=`, `>`, `<`, `>=`, `<=`) |
| `where .field contains "str"` | Filter with string operators (`contains`, `starts_with`, `ends_with`) |
| `select .field1, .field2` | Select specific fields |
| `sort .field` / `sort .field desc` | Sort by field (ascending/descending) |
| `limit N` | Limit number of results |
| `\|` | Pipeline chaining (pass results between operations) |

```bash
# Advanced query examples
dkit query data.json '.users[] | where .age > 20 | select .name, .email'
dkit query data.json '.items[] | sort .price desc | limit 5'
dkit query data.json '.users[] | where .name contains "Kim"'

# Output query results in different formats
dkit query data.json '.users[]' --to csv -o users.csv
```
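
Each `|` stage takes the rows produced by the previous stage and returns a new list of rows. The Python sketch below models `where`, `select`, `sort`, and `limit` as plain functions to show how the stages compose. The sample data and function names are made up for illustration, and the sketch says nothing about dkit's internals.

```python
users = [
    {"name": "Kim", "age": 31, "email": "kim@example.com"},
    {"name": "Lee", "age": 19, "email": "lee@example.com"},
    {"name": "Park", "age": 45, "email": "park@example.com"},
]


def where(rows, predicate):
    """Keep only the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


def select(rows, *fields):
    """Keep only the named fields of each row."""
    return [{field: row[field] for field in fields} for row in rows]


def sort(rows, field, desc=False):
    """Order rows by one field, ascending unless desc is set."""
    return sorted(rows, key=lambda row: row[field], reverse=desc)


def limit(rows, n):
    """Keep at most the first n rows."""
    return rows[:n]


# Models: '.users[] | where .age > 20 | select .name, .email'
adults = select(where(users, lambda row: row["age"] > 20), "name", "email")
print(adults)

# Models: '.users[] | sort .age desc | limit 1'
oldest = limit(sort(users, "age", desc=True), 1)
print(oldest)
```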

### `view` — Table preview

```bash
# View as table
dkit view users.csv

# Limit rows
dkit view large_data.csv --limit 20

# Navigate nested data
dkit view data.json --path '.users'

# Select columns
dkit view users.csv --columns name,email

# Table customization
dkit view data.csv --border rounded --color        # Rounded borders with type coloring
dkit view data.json --row-numbers --max-width 30   # Row numbers, truncate long values
dkit view data.json --hide-header --border none     # Minimal output
dkit view data.json --border heavy -n 10            # Heavy borders, limit 10 rows

# Output in different formats
dkit view data.json --format json                  # JSON output instead of table
dkit view data.json --format md                    # Markdown table
dkit view data.json --format html                  # HTML table
```

### `stats` — Data statistics

```bash
# Show overall statistics
dkit stats data.csv

# Navigate to nested data
dkit stats data.json --path .users

# Statistics for a specific column
dkit stats data.csv --column revenue
```

### `schema` — Data structure inspection

```bash
# Show schema as a tree
dkit schema config.yaml
dkit schema data.json

# From stdin
cat data.json | dkit schema - --from json
```
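
As a rough picture of what a schema tree conveys, the Python sketch below walks parsed data and prints one line per field with its type. This is an assumption-laden illustration: the layout, the type names, and the `describe` helper are invented for the example, and dkit's actual output may look different.

```python
def describe(value, name="root", depth=0):
    """Return indented 'name: type' lines describing the shape of `value`."""
    indent = "  " * depth
    if isinstance(value, dict):
        lines = [f"{indent}{name}: object"]
        for key, child in value.items():
            lines.extend(describe(child, key, depth + 1))
        return lines
    if isinstance(value, list):
        lines = [f"{indent}{name}: array"]
        if value:
            # Simplification: only the first element is inspected.
            lines.extend(describe(value[0], "[]", depth + 1))
        return lines
    return [f"{indent}{name}: {type(value).__name__}"]


config = {"database": {"host": "localhost", "port": 5432}, "tags": ["a", "b"]}
print("\n".join(describe(config)))
```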

### `diff` — Compare data files

```bash
# Compare same-format files
dkit diff old.json new.json
dkit diff config_dev.yaml config_prod.yaml

# Cross-format comparison
dkit diff data.json data.yaml

# Compare nested path only
dkit diff old.json new.json --path '.database'

# Quiet mode (exit code: 0=same, 1=different)
dkit diff a.json b.json --quiet && echo 'same' || echo 'different'
```
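
Cross-format comparison is possible because the comparison can run on parsed values rather than on the raw text. The Python sketch below shows that idea with a small recursive diff and a quiet-mode style exit code (0 for same, 1 for different, as in the example above). It is a simplified model, not dkit's algorithm: `diff_values` and `exit_code` are hypothetical names, and lists are compared as whole values.

```python
def diff_values(old, new, path=""):
    """Return (path, kind, old, new) tuples for every difference found."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}"
            if key not in new:
                changes.append((child, "removed", old[key], None))
            elif key not in old:
                changes.append((child, "added", None, new[key]))
            else:
                changes.extend(diff_values(old[key], new[key], child))
        return changes
    if old != new:
        # Scalars and lists are compared as whole values in this sketch.
        return [(path or ".", "changed", old, new)]
    return []


def exit_code(changes):
    """0 when the inputs are the same, 1 when they differ."""
    return 1 if changes else 0


old = {"database": {"host": "localhost", "port": 5432}, "debug": True}
new = {"database": {"host": "db.internal", "port": 5432}, "workers": 4}

changes = diff_values(old, new)
for change in changes:
    print(change)
print("exit code:", exit_code(changes))
```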

### `merge` — Combine multiple files

```bash
# Merge JSON files
dkit merge a.json b.json --to json

# Merge CSV files and convert to JSON
dkit merge users1.csv users2.csv --to json -o merged.json

# Merge YAML configs
dkit merge config1.yaml config2.yaml --to yaml
```

## Comparison with Existing Tools

| Feature | dkit | jq | miller | yq |
|---------|------|-----|--------|----|
| JSON | O | O | O | O |
| CSV/TSV | O | X | O | X |
| YAML | O | X | X | O |
| TOML | O | X | X | X |
| XML | O | X | X | O |
| MessagePack | O | X | X | X |
| Excel (.xlsx) input | O | X | X | X |
| SQLite input | O | X | X | X |
| Markdown/HTML output | O | X | X | X |
| Cross-format convert | O | X | Partial | Partial |
| Table output | O | X | O | X |
| Query (where/select/sort) | O | O | O | O |
| Pipeline chaining | O | O | O | X |
| Statistics | O | X | O | X |
| Schema inspection | O | X | X | X |
| File merging | O | X | O | X |
| File diff | O | X | X | X |
| Multi-encoding support | O | X | X | X |
| Single binary | O | O | O | O |

dkit focuses on **seamless conversion between all supported formats** with a unified query syntax, eliminating the need for separate tools per format.

## Building from Source

```bash
cargo build              # Build
cargo test               # Run tests
cargo clippy -- -D warnings  # Lint
cargo fmt -- --check     # Format check
```

## Contributing

Contributions are welcome! Please see the [GitHub Issues](https://github.com/syangkkim/dkit/issues) for planned features and known issues.

1. Fork the repository
2. Create a feature branch (`git checkout -b feat/my-feature`)
3. Commit your changes
4. Push to the branch and open a Pull Request

Please ensure `cargo test` and `cargo clippy -- -D warnings` pass before submitting.

## License

[MIT](LICENSE)