llm-json-repair
Clean and parse JSON emitted by LLMs.
Models still emit invalid JSON. They wrap it in markdown fences, leave trailing commas, write prose around it, or only emit a half-fence. This crate does three local repair passes before you reach for an LLM retry:
- Strip markdown code fences (
```json … ```,~~~ … ~~~). - Extract the first balanced
{…}or[…]from surrounding prose. - Remove trailing commas before
}or].
Single small dependency on thiserror. Optional serde feature for
typed parsing.
Install
[]
= "0.1"
Use
use repair;
let raw = "Sure, here you go:\n```json\n{\"answer\": \"Paris\", \"confidence\": 0.95,}\n```";
let cleaned = repair;
assert_eq!;
Typed parse with the default serde feature:
use Deserialize;
let raw = "```json\n{\"text\": \"hi\", \"confidence\": 0.5,}\n```";
let a: Answer = parse?;
What it does NOT do
- No LLM retry loop. This is local repair only — wire your own retry path.
- No "fix anything" magic. We do exactly the three passes above. Nothing guesses missing fields, fixes spelling, or rewrites types.
- No regex. Single byte scan for the balanced extractor and the comma stripper.
- No JSON5/HJSON support. If the input has unquoted keys or single-quoted strings, repair won't fix that — by design.
Why a separate crate
Most Rust LLM clients embed their own ad-hoc cleanup or punt to the user. This crate exists so you can pull in one focused dependency instead of copying a regex into every project.
Sibling to bedrock-kit
on the Python side, which does the same job for AWS Bedrock callers.
License
MIT OR Apache-2.0