json_partial 0.2.1

JSON-fixing parser for imperfect JSON produced by LLMs
# jsonish

> A resilient Rust library for parsing imperfect JSON with automatic error recovery

**jsonish** is a JSON parsing library that tolerates input outside the strict JSON specification. Where a standard parser would reject the input, jsonish attempts to recover by handling:
- Common syntax errors and typos
- JSON embedded in markdown code blocks
- Multiple JSON objects in a single input
- Unclosed arrays and objects

---

## Key Features

- **Smart Recovery:** Auto-fixes syntax errors while preserving data integrity
- **Markdown Support:** Extracts and parses JSON from code blocks
- **Advanced Parsing:** Handles multiple objects with configurable strategies

## Quick Start

### Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
json_partial = { git = "https://github.com/TwistingTwists/json_partial" }
```

### Basic Usage

```rust
use json_partial::jsonish::{parse, ParseOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse JSON with syntax errors
    let input = r#"{ "name": "Bob" "age": 25 }"#; // missing comma
    let value = parse(input, ParseOptions::default())?;

    // Handle unclosed arrays
    let input = r#"[1, 2, 3"#; // missing closing bracket
    let value = parse(input, ParseOptions::default())?;

    // Parse JSON from markdown. The fences are written with escapes so that
    // they do not terminate this README's own code block.
    let md_input = "\n```json\n{ \"key\": \"value\" }\n```\nSome additional text.\n```";
    let value = parse(md_input, ParseOptions::default())?;
    // Returns a Markdown variant wrapping the parsed JSON

    Ok(())
}
```

### Complete Example

```rust
use json_partial::jsonish::{self, parse, ParseOptions}; // `self` brings `jsonish::` into scope
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Person {
    name: String,
    age: u8,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let input = r#"
    Here is your text 
    { "name": "Alice" "age": 30 }  
    // Note: missing comma
    "#;

    // Parse and convert to serde_json::Value
    let value = parse(input, ParseOptions::default())?;
    let serde_value = jsonish::jsonish_to_serde(&value);
    
    // Deserialize into your struct
    let person: Person = serde_json::from_value(serde_value)?;
    println!("{:?}", person);  // Person { name: "Alice", age: 30 }

    Ok(())
}
```

---

## Features

- **Standard JSON Parsing:**  
  Uses `serde_json` under the hood to parse valid JSON strings quickly and reliably.

- **Error-Tolerant Parsing:**  
  When given imperfect JSON (e.g. missing commas, unquoted keys, unclosed arrays or objects), jsonish will attempt to fix and recover the input rather than immediately failing.

- **Markdown Code Block Extraction:**  
  Supports extracting and parsing JSON from markdown code blocks (e.g. fenced with triple backticks). This is especially useful when working with documents or logs that embed JSON in markdown.

- **Multi-Object Handling:**  
  Can detect and extract multiple JSON objects from a single input, returning them as a combined result.

- **Custom Value Representation:**  
  The parsed output is provided as a custom `Value` enum that includes variants for:
  - **Primitives:** Strings, Numbers, Booleans, and Null.
  - **Complex Structures:** Arrays and Objects.
  - **Special Cases:**  
    - `Markdown`: Represents a code block with a tag and its parsed inner value.
    - `FixedJson`: Wraps JSON that was fixed during parsing, along with a list of applied fixes.
    - `AnyOf`: Holds multiple possible parsed values (useful when multiple parsing strategies succeed).

- **Serde Conversion:**  
  Easily convert jsonish’s custom `Value` to a standard [`serde_json::Value`](https://docs.serde.rs/serde_json/) using the provided `jsonish_to_serde` function.

- **Configurable Parsing Options:**  
  Fine-tune the parsing behavior via the [`ParseOptions`](./jsonish/parser/mod.rs) struct, allowing you to enable or disable specific parsing strategies (e.g. markdown parsing, fixing errors, or treating input as a plain string).
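
Because the result can be wrapped in `Markdown`, `FixedJson`, or `AnyOf`, callers often want to peel those wrappers off. The sketch below illustrates one way to do that. It uses a simplified local stand-in for the enum: the variant payloads shown here are assumptions made for illustration, and `unwrap_plain` is a hypothetical helper, not a function the crate provides. Check the crate source for the real definitions.

```rust
/// Simplified stand-in for the crate's `Value` enum. The payload types are
/// assumptions for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
    Markdown(String, Box<Value>),       // (tag, parsed inner value)
    FixedJson(Box<Value>, Vec<String>), // (value, descriptions of applied fixes)
    AnyOf(Vec<Value>),                  // candidate parses
}

/// Hypothetical helper: strips the wrapper variants and returns the first
/// plain value found, or `None` when an `AnyOf` has no usable candidate.
fn unwrap_plain(value: &Value) -> Option<&Value> {
    match value {
        Value::Markdown(_, inner) | Value::FixedJson(inner, _) => unwrap_plain(inner),
        Value::AnyOf(candidates) => candidates.iter().find_map(|c| unwrap_plain(c)),
        plain => Some(plain),
    }
}
```

With the real crate, converting through `jsonish_to_serde` is likely the simpler route when all you need is a `serde_json::Value`; a match like this is mainly useful when you want to inspect which fixes were applied or which candidates were considered.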

---

## API Overview

- **`jsonish::parse`**  
  Main entry point for parsing a JSON (or JSON-like) string. It applies a series of strategies:
  1. Attempt standard JSON parsing.
  2. If that fails and markdown JSON is allowed, try to extract and parse markdown code blocks.
  3. If enabled, attempt to locate multiple JSON objects.
  4. Apply automatic fixes for common syntax errors.
  5. Fall back to treating the input as a raw string if all else fails.

- **`jsonish::Value`**  
  A custom enum that represents the parsed JSON data with variants for primitive types, objects, arrays, markdown code blocks, fixed JSON (with applied fixes), and a collection of multiple possible parsed values.

- **`jsonish::ParseOptions`**  
  A configurable struct that controls which parsing strategies are enabled. It allows you to adjust settings like whether to allow markdown JSON, auto-fixing, multi-object parsing, and more.

- **`jsonish::to_serde::jsonish_to_serde`**  
  Converts a `jsonish::Value` into a [`serde_json::Value`](https://docs.serde.rs/serde_json/), making it easy to work with other libraries that use serde.
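
The strategy order described for `jsonish::parse` can be pictured as a chain of fallbacks. The sketch below is a toy, standard-library-only illustration, not the crate's code. `Toggles`, `Outcome`, and every function in it are hypothetical names; `looks_strict` merely stands in for a real strict parser such as `serde_json`; and the multi-object step is left out for brevity.

```rust
/// Hypothetical toggles; the real `ParseOptions` fields may be named differently.
struct Toggles {
    markdown: bool,
    fixes: bool,
    raw_string: bool,
}

/// Records which strategy accepted the input, plus the text it produced.
#[derive(Debug, PartialEq)]
enum Outcome {
    Strict(String),
    Markdown(String),
    Fixed(String),
    Raw(String),
}

/// Stand-in for a strict parser: only checks that the text is wrapped in
/// a matching pair of braces or brackets.
fn looks_strict(text: &str) -> bool {
    let t = text.trim();
    (t.starts_with('{') && t.ends_with('}')) || (t.starts_with('[') && t.ends_with(']'))
}

/// Returns the trimmed body of the first fenced code block, if any.
fn fenced_body(text: &str) -> Option<String> {
    let fence = "`".repeat(3);
    let after_open = &text[text.find(&fence)? + fence.len()..];
    // Skip the rest of the opening line (the info string, e.g. "json").
    let body = &after_open[after_open.find('\n')? + 1..];
    let end = body.find(&fence)?;
    Some(body[..end].trim().to_string())
}

/// Appends the closers a truncated JSON text is missing.
fn close_unclosed(text: &str) -> String {
    let mut closers = Vec::new(); // closers still owed, innermost last
    let (mut in_string, mut escaped) = (false, false);
    for ch in text.chars() {
        if in_string {
            match ch {
                _ if escaped => escaped = false,
                '\\' => escaped = true,
                '"' => in_string = false,
                _ => {}
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => closers.push('}'),
            '[' => closers.push(']'),
            '}' | ']' if closers.last() == Some(&ch) => {
                closers.pop();
            }
            _ => {}
        }
    }
    let mut fixed = String::from(text);
    if in_string {
        fixed.push('"'); // terminate a string that was cut off
    }
    fixed.extend(closers.into_iter().rev()); // innermost structure closes first
    fixed
}

/// Tries each enabled strategy in order and returns the first success.
fn parse_with_fallbacks(input: &str, toggles: &Toggles) -> Option<Outcome> {
    let trimmed = input.trim();
    if looks_strict(trimmed) {
        return Some(Outcome::Strict(trimmed.to_string()));
    }
    if toggles.markdown {
        if let Some(body) = fenced_body(input).filter(|b| looks_strict(b)) {
            return Some(Outcome::Markdown(body));
        }
    }
    if toggles.fixes && (trimmed.starts_with('{') || trimmed.starts_with('[')) {
        return Some(Outcome::Fixed(close_unclosed(trimmed)));
    }
    if toggles.raw_string {
        return Some(Outcome::Raw(input.to_string()));
    }
    None
}
```

Putting the cheapest and strictest check first means well-formed input never pays for recovery, and disabling a toggle simply removes that step from the chain. Returning the strategy alongside the text mirrors the idea behind the `Markdown` and `FixedJson` variants: the caller can tell how the result was obtained.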

---

## Testing

jsonish comes with a comprehensive suite of tests that verify its ability to handle:

- Valid JSON objects
- JSON with missing commas or unclosed structures
- Nested JSON structures
- JSON embedded within markdown
- Multiple JSON objects within a single input

You can run the tests with:

```bash
cargo test
```

---

## Contributing

Contributions, bug reports, and feature requests are welcome! Feel free to open issues or submit pull requests on [GitHub](https://github.com/TwistingTwists/json_partial).

---

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

---

## Thank-You Note

Much of the code is taken from the [BAML repository](https://github.com/BoundaryML/baml), specifically from [this directory](https://github.com/BoundaryML/baml/tree/03735feb5b9e70ad6a872e1c5d0837eea43034df/engine/baml-lib/jsonish/src/jsonish).

Thanks to the awesome folks at BAML!

*Happy parsing!*