Module sanitize

JSON sanitization for malformed LLM output

This module provides pre-processing functions to fix common JSON syntax errors that LLMs often produce, making the JSON parseable before fuzzy repair.

§Supported Fixes

Trailing commas: {"a": 1,} → {"a": 1}
Missing closing braces: {"a": 1 → {"a": 1}
Missing closing brackets: ["a" → ["a"]
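
To make the three fixes above concrete, here is a minimal sketch of how a single-pass sanitizer of this kind could work. It is an illustration only, not the crate's actual implementation: `sanitize_sketch` and `strip_trailing_comma` are hypothetical names, and the real `sanitize_json` may behave differently on edge cases. The sketch tracks string state so that commas and brackets inside string literals are left alone, and it returns the input unchanged if the text ends inside an unterminated string.

```rust
/// Hypothetical single-pass sanitizer (illustration only, not the crate's code).
/// Removes trailing commas before `}` / `]` and appends missing closers.
fn sanitize_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 4);
    let mut closers: Vec<char> = Vec::new(); // closers still owed, innermost last
    let mut in_string = false;
    let mut escaped = false;

    for ch in input.chars() {
        if in_string {
            // Copy string contents verbatim, honoring backslash escapes.
            out.push(ch);
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => {
                in_string = true;
                out.push(ch);
            }
            '{' => {
                closers.push('}');
                out.push(ch);
            }
            '[' => {
                closers.push(']');
                out.push(ch);
            }
            '}' | ']' => {
                // A comma directly before a closer is a trailing comma.
                strip_trailing_comma(&mut out);
                if closers.last() == Some(&ch) {
                    closers.pop();
                }
                out.push(ch);
            }
            _ => out.push(ch),
        }
    }

    // Ending inside a string is outside the supported fixes: leave input as-is.
    if in_string {
        return input.to_string();
    }

    // Input may stop right after a comma; drop it, then close what is open.
    strip_trailing_comma(&mut out);
    while let Some(closer) = closers.pop() {
        out.push(closer);
    }
    out
}

/// Drops a structural trailing comma (and any whitespace after it) from `out`.
fn strip_trailing_comma(out: &mut String) {
    let trimmed_len = out.trim_end().len();
    if out[..trimmed_len].ends_with(',') {
        out.truncate(trimmed_len - 1);
    }
}
```

Under these assumptions, `sanitize_sketch(r#"{"a": 1,}"#)` yields `{"a": 1}`, and input that is already well-formed passes through unchanged.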

§Example

use fuzzy_parser::sanitize_json;

// Fix trailing comma
let input = r#"{"name": "test",}"#;
let fixed = sanitize_json(input);
assert_eq!(fixed, r#"{"name": "test"}"#);

// Fix missing closing brace
let input = r#"{"name": "test""#;
let fixed = sanitize_json(input);
assert_eq!(fixed, r#"{"name": "test"}"#);

// Combined with fuzzy repair
use fuzzy_parser::{repair_tagged_enum_json, TaggedEnumSchema, FuzzyOptions};

let schema = TaggedEnumSchema::new("type", &["Action"], |_| Some(&["name"][..]));
let malformed = r#"{"type": "Action", "name": "test",}"#;

let sanitized = sanitize_json(malformed);
let result = repair_tagged_enum_json(&sanitized, &schema, &FuzzyOptions::default()).unwrap();
assert_eq!(result.repaired["name"], "test");

§Design Notes

sanitize_json performs best-effort sanitization. It handles the common cases listed under Supported Fixes but does not attempt to fix every possible JSON error. For severely malformed input, the result may still fail to parse, so callers should still handle parse errors after sanitizing.

The function is designed to be:

Safe: Never produces worse output than input
Fast: Single-pass processing where possible
Predictable: Only fixes well-defined error patterns
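
One way a caller could enforce the "never worse than input" property at the call site is to accept the sanitized text only when it validates. The sketch below is an assumption-laden illustration, not part of the crate: `sanitize_guarded` is a hypothetical helper, and both the validator and the sanitizer are passed in as closures so the example stays self-contained. In practice the sanitizer would be `sanitize_json` and the validator would be a real JSON parser.

```rust
use std::borrow::Cow;

/// Hypothetical guard (not part of the crate): returns the sanitized text only
/// if it passes `is_valid`; otherwise the original input is returned untouched.
fn sanitize_guarded<'a, V, S>(input: &'a str, is_valid: V, sanitize: S) -> Cow<'a, str>
where
    V: Fn(&str) -> bool,
    S: Fn(&str) -> String,
{
    // Already valid: skip sanitization entirely and avoid allocating.
    if is_valid(input) {
        return Cow::Borrowed(input);
    }
    // Keep the candidate only when it actually fixes the problem.
    let candidate = sanitize(input);
    if is_valid(&candidate) {
        Cow::Owned(candidate)
    } else {
        Cow::Borrowed(input)
    }
}
```

Returning `Cow` keeps the common case (input already valid) allocation-free, which fits the "Fast" goal as well.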

Functions§

sanitize_json: Sanitize a malformed JSON string
