LiteDoc
Deterministic document format for AI agents and LLM output. Explicit block fencing, zero-copy parsing, and error recovery.
Why LiteDoc?
Markdown is ambiguous. Indentation rules vary, edge cases abound, and parsers disagree.[1] LiteDoc uses explicit ::block fencing so that parsing is deterministic and recovers gracefully from malformed input. That makes it a good fit for machine-generated content in LLM pipelines, where output parsing can fail when formatting is off.[2]
Status
v0.1.0 - Initial release with Rust parser and CLI. APIs are intended to be stable within v0.1 but may evolve.
Stability & Compatibility
LiteDoc follows semantic versioning. For the v0.1 line, we will not make breaking changes to the core format or public APIs without a version bump and a migration note in the changelog.
| Document | Description |
|---|---|
| LITEDOC_SPEC.md | Language specification |
| LITEDOC_AST.md | AST reference |
Performance
| Metric | LiteDoc | Markdown | Improvement |
|---|---|---|---|
| Parse speed | 5.451 µs | 5.660 µs | 3.7% faster |
| Inline parsing | 490 ns | 1.313 µs | 63% faster |
| Error recovery | 0.89 | 0.67 | 33% better |
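The Improvement column appears to use two conventions: the timing rows match the reduction in time relative to the Markdown figure, and the error-recovery row matches the relative gain in score. The README does not state these formulas, so the sketch below is only a consistency check under that assumption; the helper names are illustrative.

```python
def time_reduction_pct(new: float, baseline: float) -> float:
    """Percent less time than the baseline (lower is better)."""
    return (baseline - new) / baseline * 100


def score_gain_pct(new: float, baseline: float) -> float:
    """Percent higher score than the baseline (higher is better)."""
    return (new - baseline) / baseline * 100


# Figures copied from the table; 1.313 µs is written as 1313 ns so that
# both inline-parsing values share a unit.
parse_speed = time_reduction_pct(5.451, 5.660)
inline_parsing = time_reduction_pct(490, 1313)
error_recovery = score_gain_pct(0.89, 0.67)

print(f"{parse_speed:.1f}% {inline_parsing:.0f}% {error_recovery:.0f}%")
# prints: 3.7% 63% 33%
```

Under these formulas all three rows reproduce the published percentages, which suggests the timing rows are measured against the Markdown baseline rather than against LiteDoc's own time.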
Install
Usage
Rust
use ;
let mut parser = new;
let result = parser.parse_with_recovery;
for block in &result.document.blocks
Python
CLI
Format
::list
- First item
- Second item
::
::quote
Quoted text
::
::table
| A | B |
|---|---|
| 1 | 2 |
::
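To illustrate why explicit fences make block boundaries easy to find, here is a minimal scanner sketch in Python. It is not the LiteDoc parser or its API: `Block`, `scan_blocks`, and the recovery rule (end of input closes an unterminated block) are assumptions made for illustration, and LITEDOC_SPEC.md remains the authority on the actual grammar.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    kind: str                                   # e.g. "list", "quote", "table"
    lines: List[str] = field(default_factory=list)
    recovered: bool = False                     # True if the closing "::" was missing


def scan_blocks(text: str) -> List[Block]:
    """Split text into fenced blocks (hypothetical helper, not the LiteDoc API)."""
    blocks: List[Block] = []
    current: Optional[Block] = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "::":
            # Closing fence: finish the open block; a stray closer is ignored.
            if current is not None:
                blocks.append(current)
                current = None
        elif stripped.startswith("::") and current is None:
            # Opening fence: the block kind is whatever follows "::".
            current = Block(kind=stripped[2:].strip())
        elif current is not None:
            # Any other line inside an open block is kept as content.
            current.lines.append(line)
        # Text outside any fence is skipped in this sketch.
    if current is not None:
        # Assumed recovery rule: end of input closes an unterminated block.
        current.recovered = True
        blocks.append(current)
    return blocks


sample = "::list\n- First item\n- Second item\n::\n::quote\nQuoted text"
for block in scan_blocks(sample):
    print(block.kind, block.lines, block.recovered)
```

The sample deliberately omits the final `::`, so the quote block comes back with `recovered=True` instead of causing a failure. Each line is classified by looking at that line alone plus one piece of state, which is the property that makes this style of fencing deterministic.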
Metadata:
agent: summarizer-v2
task_id: abc123
timestamp: 1704067200
confidence: 0.92
tags: [summary, final]
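The metadata lines above read as simple `key: value` pairs with string, numeric, and bracketed-list values. As a rough sketch of how a consumer might turn them into typed values, assuming exactly that shape (the function names are hypothetical and the real rules are defined by the specification, not by this snippet):

```python
def parse_value(raw):
    """Convert one raw value to a list, int, float, or str (assumed typing rules)."""
    if raw.startswith("[") and raw.endswith("]"):
        inner = raw[1:-1].strip()
        return [item.strip() for item in inner.split(",")] if inner else []
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # fall back to the plain string


def parse_metadata(text):
    """Parse `key: value` lines into a dict; illustrative only."""
    meta = {}
    for line in text.splitlines():
        key, sep, raw = line.partition(":")
        if not sep or not key.strip():
            continue  # skip lines that are not key: value pairs
        meta[key.strip()] = parse_value(raw.strip())
    return meta


example = """agent: summarizer-v2
task_id: abc123
timestamp: 1704067200
confidence: 0.92
tags: [summary, final]"""

meta = parse_metadata(example)
print(meta["timestamp"], meta["confidence"], meta["tags"])
```

Lines that do not fit the `key: value` shape are skipped rather than raising, in keeping with the recovery-first approach the format describes.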
Benchmarks
CSV output:
ROBUSTNESS_CSV=1
ROBUSTNESS_BENCH_CSV=1
References
[1] CommonMark Spec, “Why is a spec needed?” (notes that the original Markdown syntax description is ambiguous and that implementations have diverged). https://spec.commonmark.org/0.31.2/
[2] LangChain docs, “OUTPUT_PARSING_FAILURE” (example of JSON-in-Markdown parsing failures). https://docs.langchain.com/oss/python/langchain/errors/OUTPUT_PARSING_FAILURE
License
Apache-2.0