jshape 0.1.0

Repair malformed JSON and render a stable, human-readable structural outline.
Documentation
# jshape

`jshape` repairs malformed JSON input and renders a stable, human-readable structural outline.

It is designed for large or messy JSON payloads where you want to understand the shape of the data quickly instead of reading the full document.

## What It Does

- Repairs malformed JSON before parsing
- Extracts nested object and array structure
- Preserves example values for scalar fields
- Marks optional object fields with `?`
- Preserves object key order from the input
- Ships with a short CLI command: `jshape`

## Install

As a library:

```toml
[dependencies]
jshape = "0.1.0"
```

As a CLI:

```bash
cargo install jshape
```

## Publish

This repository ships with a GitHub Actions workflow for crates.io publishing.

1. Add a repository secret named `CARGO_REGISTRY_TOKEN`
2. Open the `Publish Crate` workflow in GitHub Actions
3. Click `Run workflow`

The same workflow also publishes automatically when you push a version tag such as `v0.1.0`.

## CLI Usage

Read from a file:

```bash
jshape payload.json
```

Read from stdin:

```bash
cat payload.json | jshape
```

Show types instead of example values:

```bash
jshape --no-examples payload.json
```

## Library Usage

```rust
use jshape::analyze_json;

fn main() {
    let input = r#"{
        'user': {
            "name": "Ada",
            "roles": ["admin", "editor"]
        }
    }"#;

    let outline = analyze_json(input, true).unwrap();
    println!("{}", outline);
}
```

If you need lower-level access, the crate also exposes:

- `repair_and_parse_json`
- `extract_schema`
- `format_schema`
- `Schema`

## Example

The painful case is usually not a tiny nested object. It is a giant export where one array contains thousands of items with almost the same structure, making the raw JSON long and hard to scan.

Below is a trimmed excerpt from a much larger analytics export. In the real file, the `events` array contains tens of thousands of similarly shaped records:

```text
{
  "export_id": "exp_2026_03_27_001",
  "generated_at": "2026-03-27T03:14:15Z",
  "tenant_id": "tenant_42",
  "events": [
    {
      "event_id": "evt_000001",
      "session_id": "sess_a1",
      "user_id": "usr_1001",
      "event_type": "page_view",
      "source": "web",
      "timestamp": "2026-03-27T03:00:01Z",
      "request": {
        "method": "GET",
        "path": "/dashboard",
        "status": 200,
        "duration_ms": 42
      },
      "device": {
        "os": "macOS",
        "browser": "Chrome",
        "locale": "en-US"
      },
      "geo": {
        "country": "US",
        "region": "CA",
        "city": "San Francisco"
      },
      "metrics": {
        "cpu_ms": 12,
        "db_rows": 18,
        "cache_hit": true
      },
      "tags": ["prod", "dashboard", "page_view"]
    },
    {
      "event_id": "evt_000002",
      "session_id": "sess_a1",
      "user_id": "usr_1001",
      "event_type": "page_view",
      "source": "web",
      "timestamp": "2026-03-27T03:00:03Z",
      "request": {
        "method": "GET",
        "path": "/dashboard/usage",
        "status": 200,
        "duration_ms": 57
      },
      "device": {
        "os": "macOS",
        "browser": "Chrome",
        "locale": "en-US"
      },
      "geo": {
        "country": "US",
        "region": "CA",
        "city": "San Francisco"
      },
      "metrics": {
        "cpu_ms": 19,
        "db_rows": 44,
        "cache_hit": true
      },
      "tags": ["prod", "dashboard", "page_view"]
    },
    {
      "event_id": "evt_000003",
      "session_id": "sess_b9",
      "user_id": "usr_2048",
      "event_type": "api_call",
      "source": "api",
      "timestamp": "2026-03-27T03:00:04Z",
      "request": {
        "method": "POST",
        "path": "/v1/reports/query",
        "status": 200,
        "duration_ms": 183
      },
      "device": {
        "os": "Linux",
        "browser": "curl",
        "locale": "en-US"
      },
      "geo": {
        "country": "DE",
        "region": "BE",
        "city": "Berlin"
      },
      "metrics": {
        "cpu_ms": 98,
        "db_rows": 1200,
        "cache_hit": false
      },
      "tags": ["prod", "reports", "api"],
      "error": {
        "code": "RATE_LIMIT_NEAR",
        "retryable": true
      }
    },
    {
      "event_id": "evt_000004",
      "session_id": "sess_c2",
      "user_id": "usr_3099",
      "event_type": "page_view",
      "source": "web",
      "timestamp": "2026-03-27T03:00:05Z",
      "request": {
        "method": "GET",
        "path": "/billing/invoices",
        "status": 200,
        "duration_ms": 61
      },
      "device": {
        "os": "Windows",
        "browser": "Edge",
        "locale": "en-GB"
      },
      "geo": {
        "country": "GB",
        "region": "LND",
        "city": "London"
      },
      "metrics": {
        "cpu_ms": 21,
        "db_rows": 72,
        "cache_hit": true
      },
      "tags": ["prod", "billing", "page_view"]
    },
    ... thousands more records with the same overall shape ...
  ],
  "aggregates": {
    "event_count": 48762,
    "unique_users": 913,
    "time_range": {
      "from": "2026-03-27T00:00:00Z",
      "to": "2026-03-27T03:14:15Z"
    }
  }
}
```

Output after running `jshape`:

```text
{
  "export_id": "exp_2026_03_27_001",
  "generated_at": "2026-03-27T03:14:15Z",
  "tenant_id": "tenant_42",
  "events": [
    {
      "event_id": "evt_000001",
      "session_id": "sess_a1",
      "user_id": "usr_1001",
      "event_type": "page_view", "api_call",
      "source": "web", "api",
      "timestamp": "2026-03-27T03:00:01Z",
      "request": {
        "method": "GET", "POST",
        "path": "/dashboard", "/dashboard/usage", "/v1/reports/query", "/billing/invoices",
        "status": 200,
        "duration_ms": 42, 57, 183, 61
      },
      "device": {
        "os": "macOS", "Linux", "Windows",
        "browser": "Chrome", "curl", "Edge",
        "locale": "en-US", "en-GB"
      },
      "geo": {
        "country": "US", "DE", "GB",
        "region": "CA", "BE", "LND",
        "city": "San Francisco", "Berlin", "London"
      },
      "metrics": {
        "cpu_ms": 12, 19, 98, 21,
        "db_rows": 18, 44, 1200, 72,
        "cache_hit": bool
      },
      "tags": [
        "prod", "dashboard", "page_view", "reports", "api", "billing"
      ],
      "error"?: {
        "code": "RATE_LIMIT_NEAR",
        "retryable": true
      }
    },
  ...  // 48762 items
  ],
  "aggregates": {
    "event_count": 48762,
    "unique_users": 913,
    "time_range": {
      "from": "2026-03-27T00:00:00Z",
      "to": "2026-03-27T03:14:15Z"
    }
  }
}
```

The difference is the point of the tool: the raw input repeats the same object shape thousands of times, while the output keeps one representative structure, marks optional fields, preserves a few concrete values, and tells you how large the array really is.

## Notes

- Output is JSON-like and stable for inspection, but it is not guaranteed to be valid JSON in every mode.
- Optional fields are rendered with a trailing `?`.
- Large arrays are summarized instead of printing every element in full.
- This crate currently relies on `json-repair = 0.4.0`, which requires a nightly-compatible build environment or `RUSTC_BOOTSTRAP=1`.