json_partial 0.2.1

JSON-fixing parser for imperfect JSON produced by LLMs

jsonish

A resilient Rust library for parsing imperfect JSON with automatic error recovery

jsonish is a JSON parsing library that accepts input the strict JSON specification rejects. When a standard parser fails, jsonish attempts to recover the data by handling:

  • Common syntax errors and typos
  • JSON embedded in markdown code blocks
  • Multiple JSON objects in a single input
  • Unclosed arrays and objects
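To make the last item concrete, here is a minimal sketch of one way unclosed arrays and objects can be repaired: track the closers still owed while scanning, then append them. The helper `close_unclosed` is hypothetical and uses only the standard library; it is not part of the json_partial API and is not necessarily how jsonish does it.

```rust
/// Appends the closing quote, brackets, and braces that `input` is missing.
/// Hypothetical helper for illustration; not part of the json_partial API.
fn close_unclosed(input: &str) -> String {
    let mut stack: Vec<char> = Vec::new(); // closers still owed, innermost last
    let mut in_string = false;
    let mut escaped = false;

    for c in input.chars() {
        if in_string {
            if escaped {
                escaped = false; // this character was escaped; ignore it
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' => stack.push('}'),
            '[' => stack.push(']'),
            '}' | ']' => {
                // Pop only when the closer matches the innermost open structure.
                if stack.last() == Some(&c) {
                    stack.pop();
                }
            }
            _ => {}
        }
    }

    let mut fixed = String::from(input);
    if in_string {
        fixed.push('"'); // terminate a string that was cut off
    }
    // Close the innermost structure first.
    while let Some(closer) = stack.pop() {
        fixed.push(closer);
    }
    fixed
}

fn main() {
    println!("{}", close_unclosed("[1, 2, 3"));
    println!("{}", close_unclosed(r#"{"a": [1, {"b": "x"#));
}
```

Brackets inside string literals are skipped, so a value such as "]" does not confuse the scan. Input that is already balanced is returned unchanged.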

Key Features

  • Smart Recovery: Auto-fixes syntax errors while preserving data integrity
  • Markdown Support: Extracts and parses JSON from code blocks
  • Advanced Parsing: Handles multiple objects with configurable strategies
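As a rough illustration of multi-object handling, the sketch below scans free text for balanced top-level `{...}` or `[...]` spans. The function `find_json_spans` is a hypothetical, standard-library-only helper written for this README; the strategies jsonish actually uses may differ.

```rust
/// Returns every balanced top-level `{...}` or `[...]` span found in `text`.
/// Hypothetical helper for illustration; not part of the json_partial API.
/// Simplification: it counts nesting depth but does not check that each
/// closer matches the kind of bracket that opened it.
fn find_json_spans(text: &str) -> Vec<&str> {
    let mut spans = Vec::new();
    let mut depth = 0usize;
    let mut start = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    for (i, c) in text.char_indices() {
        if depth == 0 {
            // Outside any candidate: skip prose until a structure opens.
            if c == '{' || c == '[' {
                start = i;
                depth = 1;
            }
            continue;
        }
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' | '[' => depth += 1,
            '}' | ']' => {
                depth -= 1;
                if depth == 0 {
                    // The span runs from the opener through this closer.
                    spans.push(&text[start..i + c.len_utf8()]);
                }
            }
            _ => {}
        }
    }
    spans
}

fn main() {
    let text = r#"First: {"a": 1} and then {"b": [2, 3]} done."#;
    for span in find_json_spans(text) {
        println!("{span}");
    }
}
```

Each returned span would still need to be parsed (and possibly repaired) on its own. Bracketed prose such as "[note]" is also returned as a candidate, so a real implementation would have to filter spans that do not parse.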

Quick Start

Installation

Add this to your Cargo.toml:

[dependencies]
json_partial = { git = "https://github.com/TwistingTwists/json_partial" }

Basic Usage

use json_partial::jsonish::{parse, ParseOptions};

// Parse JSON with syntax errors
let input = r#"{ "name": "Bob" "age": 25 }"#;  // missing comma
let value = parse(input, ParseOptions::default())?;

// Handle unclosed arrays
let input = r#"[1, 2, 3"#;  // missing closing bracket
let value = parse(input, ParseOptions::default())?;

// Parse JSON from markdown
let md_input = r#"
```json
{ "key": "value" }
```

Some additional text.
"#;

let value = parse(md_input, ParseOptions::default())?;
// Returns a Markdown variant wrapping the parsed JSON

Complete Example

// `self` brings the `jsonish` module into scope for `jsonish::jsonish_to_serde` below
use json_partial::jsonish::{self, parse, ParseOptions};
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Person {
    name: String,
    age: u8,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let input = r#"
    Here is your text 
    { "name": "Alice" "age": 30 }  
    // Note: missing comma
    "#;

    // Parse and convert to serde_json::Value
    let value = parse(input, ParseOptions::default())?;
    let serde_value = jsonish::jsonish_to_serde(&value);
    
    // Deserialize into your struct
    let person: Person = serde_json::from_value(serde_value)?;
    println!("{:?}", person);  // Person { name: "Alice", age: 30 }

    Ok(())
}

Features

  • Standard JSON Parsing:
    Uses serde_json under the hood to parse valid JSON strings quickly and reliably.

  • Error-Tolerant Parsing:
    When given imperfect JSON (e.g. missing commas, unquoted keys, unclosed arrays or objects), jsonish will attempt to fix and recover the input rather than immediately failing.

  • Markdown Code Block Extraction:
    Supports extracting and parsing JSON from markdown code blocks (e.g. fenced with triple backticks). This is especially useful when working with documents or logs that embed JSON in markdown.

  • Multi-Object Handling:
    Can detect and extract multiple JSON objects from a single input, returning them as a combined result.

  • Custom Value Representation:
    The parsed output is provided as a custom Value enum that includes variants for:

    • Primitives: Strings, Numbers, Booleans, and Null.
    • Complex Structures: Arrays and Objects.
    • Special Cases:
      • Markdown: Represents a code block with a tag and its parsed inner value.
      • FixedJson: Wraps JSON that was fixed during parsing, along with a list of applied fixes.
      • AnyOf: Holds multiple possible parsed values (useful when multiple parsing strategies succeed).

  • Serde Conversion:
    Easily convert jsonish’s custom Value to a standard serde_json::Value using the provided jsonish_to_serde function.

  • Configurable Parsing Options:
    Fine-tune the parsing behavior via the ParseOptions struct, allowing you to enable or disable specific parsing strategies (e.g. markdown parsing, fixing errors, or treating input as a plain string).
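The wrapper variants described above can be pictured with a small stand-in enum. Only the variant names Markdown, FixedJson, and AnyOf come from this README; the payload shapes, the primitive variant names, and the helper `strip_wrappers` are assumptions made for this sketch, and picking the first AnyOf candidate is an arbitrary choice, not necessarily what jsonish_to_serde does.

```rust
/// Stand-in for jsonish's Value enum. Payload shapes are assumed, not
/// copied from the crate.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
    /// Code block tag (e.g. "json") and the value parsed from its body.
    Markdown(String, Box<Value>),
    /// Repaired value plus a description of each fix applied.
    FixedJson(Box<Value>, Vec<String>),
    /// Candidates produced by different parsing strategies.
    AnyOf(Vec<Value>),
}

/// Recursively removes the wrapper variants, leaving plain JSON-shaped data.
/// Hypothetical helper: for AnyOf it keeps the first candidate.
fn strip_wrappers(value: Value) -> Value {
    match value {
        Value::Markdown(_tag, inner) => strip_wrappers(*inner),
        Value::FixedJson(inner, _fixes) => strip_wrappers(*inner),
        Value::AnyOf(candidates) => candidates
            .into_iter()
            .next()
            .map(strip_wrappers)
            .unwrap_or(Value::Null),
        Value::Array(items) => {
            Value::Array(items.into_iter().map(strip_wrappers).collect())
        }
        Value::Object(fields) => Value::Object(
            fields
                .into_iter()
                .map(|(key, val)| (key, strip_wrappers(val)))
                .collect(),
        ),
        other => other, // primitives pass through unchanged
    }
}

fn main() {
    let parsed = Value::Markdown(
        "json".to_string(),
        Box::new(Value::FixedJson(
            Box::new(Value::Array(vec![Value::Number(1.0), Value::Number(2.0)])),
            vec!["closed array".to_string()],
        )),
    );
    println!("{:?}", strip_wrappers(parsed));
}
```

Keeping the wrappers in the parsed result lets a caller see how the value was obtained (from a code block, after a repair, or as one of several candidates) before flattening it for use with other libraries.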


API Overview

  • jsonish::parse
    Main entry point for parsing a JSON (or JSON‐like) string. It applies a series of strategies:

    1. Attempt standard JSON parsing.
    2. If that fails and markdown JSON is allowed, try to extract and parse markdown code blocks.
    3. If enabled, attempt to locate multiple JSON objects.
    4. Apply automatic fixes for common syntax errors.
    5. Fall back to treating the input as a raw string if all else fails.

  • jsonish::Value
    A custom enum that represents the parsed JSON data with variants for primitive types, objects, arrays, markdown code blocks, fixed JSON (with applied fixes), and a collection of multiple possible parsed values.

  • jsonish::ParseOptions
    A configurable struct that controls which parsing strategies are enabled. It allows you to adjust settings like whether to allow markdown JSON, auto-fixing, multi-object parsing, and more.

  • jsonish::to_serde::jsonish_to_serde
    Converts a jsonish::Value into a serde_json::Value, making it easy to work with other libraries that use serde.
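The ordered strategies listed under jsonish::parse amount to a fallback chain: try each strategy in turn and return the first success. The sketch below shows that pattern with the standard library only. All names (`Strategy`, `strict_stub`, `markdown_block`, `parse_with_fallbacks`) are hypothetical, `strict_stub` is a crude stand-in for a real strict parser, and only steps 1, 2, and 5 are modelled.

```rust
/// A strategy returns `Some(json_text)` when it recognises the input.
type Strategy = fn(&str) -> Option<String>;

/// Crude stand-in for a strict JSON parser: accepts only text that is
/// wrapped in `{...}` or `[...]`. A real pipeline would parse properly here.
fn strict_stub(input: &str) -> Option<String> {
    let t = input.trim();
    let wrapped = (t.starts_with('{') && t.ends_with('}'))
        || (t.starts_with('[') && t.ends_with(']'));
    if wrapped { Some(t.to_string()) } else { None }
}

/// Extracts the body of the first fenced code block and checks it with
/// `strict_stub`.
fn markdown_block(input: &str) -> Option<String> {
    let fence = "`".repeat(3); // three backticks
    let open = input.find(fence.as_str())?;
    let after_fence = &input[open + fence.len()..];
    let body_start = after_fence.find('\n')? + 1; // skip the tag, e.g. "json"
    let body = &after_fence[body_start..];
    let close = body.find(fence.as_str())?;
    strict_stub(&body[..close])
}

/// Runs the strategies in order and returns the name of the one that
/// succeeded together with its output. Falls back to the raw input.
fn parse_with_fallbacks(
    input: &str,
    strategies: &[(&'static str, Strategy)],
) -> (&'static str, String) {
    for (name, strategy) in strategies {
        if let Some(json) = strategy(input) {
            return (*name, json);
        }
    }
    ("raw_string", input.to_string())
}

fn main() {
    // Omitting an entry from this list disables that strategy.
    let strategies = [
        ("strict", strict_stub as Strategy),
        ("markdown", markdown_block as Strategy),
    ];
    let fence = "`".repeat(3);
    let md = format!("Intro\n{f}json\n{{ \"key\": \"value\" }}\n{f}\nMore text.", f = fence);

    for input in [r#"{"a": 1}"#, md.as_str(), "plain words"] {
        let (name, json) = parse_with_fallbacks(input, &strategies);
        println!("{name}: {json}");
    }
}
```

Passing the strategies as a list mirrors the idea behind ParseOptions, where individual strategies can be switched on or off, though the real struct's fields are not shown here.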


Testing

jsonish comes with a comprehensive suite of tests that verify its ability to handle:

  • Valid JSON objects
  • JSON with missing commas or unclosed structures
  • Nested JSON structures
  • JSON embedded within markdown
  • Multiple JSON objects within a single input

You can run the tests with:

cargo test

Contributing

Contributions, bug reports, and feature requests are welcome! Feel free to open issues or submit pull requests on GitHub.


License

This project is licensed under the MIT License. See the LICENSE file for details.


Thank You Note

Much of the code is taken from the BAML repository.

Thanks to the awesome folks at BAML!

Happy parsing!