JSON sanitization for malformed LLM output
This module provides pre-processing functions to fix common JSON syntax errors that LLMs often produce, making the JSON parseable before fuzzy repair.
§Supported Fixes
- Trailing commas: {"a": 1,} → {"a": 1}
- Missing closing braces: {"a": 1 → {"a": 1}
- Missing closing brackets: ["a" → ["a"]
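To illustrate how the three fixes listed above could be applied in one pass, here is a minimal sketch. The names `sanitize_sketch` and `drop_trailing_comma` are hypothetical and this is not the crate's actual implementation; it assumes that unterminated strings and mismatched closers are out of scope.

```rust
/// Hypothetical sketch of a single-pass sanitizer (not the crate's code).
/// It tracks string state so that commas and brackets inside string
/// literals are never touched.
fn sanitize_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 8);
    // Closers we still expect, innermost last.
    let mut stack: Vec<char> = Vec::new();
    let mut in_string = false;
    let mut escaped = false;

    for c in input.chars() {
        if in_string {
            // Copy string contents verbatim, honoring backslash escapes.
            out.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => {
                in_string = true;
                out.push(c);
            }
            '{' => {
                stack.push('}');
                out.push(c);
            }
            '[' => {
                stack.push(']');
                out.push(c);
            }
            '}' | ']' => {
                // A comma directly before a closer is a trailing comma.
                drop_trailing_comma(&mut out);
                if stack.last() == Some(&c) {
                    stack.pop();
                }
                out.push(c);
            }
            _ => out.push(c),
        }
    }

    // Input ended early: remove a dangling comma, then append the
    // closers that are still open, innermost first.
    // Assumption: an unterminated string is left as-is (not repaired).
    if !in_string {
        drop_trailing_comma(&mut out);
    }
    while let Some(closer) = stack.pop() {
        out.push(closer);
    }
    out
}

/// Removes a final comma (and any whitespace after it) from `out`.
fn drop_trailing_comma(out: &mut String) {
    let trimmed_len = out.trim_end().len();
    if out[..trimmed_len].ends_with(',') {
        out.truncate(trimmed_len - 1);
    }
}
```

Because the sketch only acts outside string literals, an input such as {"a": ",}"} passes through unchanged, while truncated nested input such as {"a": [1, 2, is closed from the innermost container outward.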
§Example
use fuzzy_parser::sanitize_json;
// Fix trailing comma
let input = r#"{"name": "test",}"#;
let fixed = sanitize_json(input);
assert_eq!(fixed, r#"{"name": "test"}"#);
// Fix missing closing brace
let input = r#"{"name": "test""#;
let fixed = sanitize_json(input);
assert_eq!(fixed, r#"{"name": "test"}"#);
// Combined with fuzzy repair
use fuzzy_parser::{repair_tagged_enum_json, TaggedEnumSchema, FuzzyOptions};
let schema = TaggedEnumSchema::new("type", &["Action"], |_| Some(&["name"][..]));
let malformed = r#"{"type": "Action", "name": "test",}"#;
let sanitized = sanitize_json(malformed);
let result = repair_tagged_enum_json(&sanitized, &schema, &FuzzyOptions::default()).unwrap();
assert_eq!(result.repaired["name"], "test");
§Design Notes
This function performs best-effort sanitization. It handles common cases but does not attempt to fix all possible JSON errors. For severely malformed input, the result may still fail to parse.
The function is designed to be:
- Safe: Never produces worse output than input
- Fast: Single-pass processing where possible
- Predictable: Only fixes well-defined error patterns
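Since sanitization is best-effort, a caller may want to enforce the "never worse than input" property on its own side. The following is a minimal sketch of one way to do that; `sanitize_or_keep` is a hypothetical helper that is not part of this crate, and both the sanitizer and the validity check are passed in as closures so the example stays self-contained.

```rust
use std::borrow::Cow;

/// Hypothetical caller-side guard (not part of the crate).
/// Returns the input untouched when it is already valid, the sanitized
/// text when sanitizing makes it valid, and the original input otherwise.
fn sanitize_or_keep<'a>(
    input: &'a str,
    sanitize: impl Fn(&str) -> String,
    is_valid: impl Fn(&str) -> bool,
) -> Cow<'a, str> {
    if is_valid(input) {
        // Already parseable: do not allocate or modify anything.
        return Cow::Borrowed(input);
    }
    let candidate = sanitize(input);
    if is_valid(&candidate) {
        Cow::Owned(candidate)
    } else {
        // Sanitizing did not help; fall back to the original text.
        Cow::Borrowed(input)
    }
}
```

In practice, `sanitize` could be `|s| sanitize_json(s)` and `is_valid` could be a real parse attempt, for example `serde_json::from_str::<serde_json::Value>(s).is_ok()` if `serde_json` is a dependency (an assumption here).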
Functions§
- sanitize_json - Sanitize malformed JSON string