§llm-transpiler
A high-performance Rust library that converts raw documents (Markdown, HTML, Plain Text, Tables, etc.) into a structured bridge format so LLM agents can receive maximum information with minimum tokens.
§Quick Start
use llm_transpile::{transpile, FidelityLevel, InputFormat};
let md = "# Contract\n\nThis agreement was concluded in 2024.";
let result = transpile(md, InputFormat::Markdown, FidelityLevel::Semantic, Some(4096))
    .expect("transpile failed");
println!("{}", result);
§Streaming Usage
use llm_transpile::{transpile_stream, FidelityLevel, InputFormat};
use futures::StreamExt;
async fn example() {
    let md = "# Document\n\nThis is a paragraph.";
    let mut stream = transpile_stream(md, InputFormat::Markdown, FidelityLevel::Semantic, 4096).await;
    while let Some(chunk) = stream.next().await {
        let chunk = chunk.expect("stream error");
        print!("{}", chunk.content);
        if chunk.is_final { break; }
    }
}
Structs§
- AdaptiveCompressor - Budget-based adaptive document compressor.
- CompressionConfig - Context provided when running the compressor.
- IRDocument - The complete IR representation of a parsed document.
- StreamingTranspiler - Tokio channel-based streaming transpiler.
- SymbolDict - Bidirectional mapping table between technical terms and PUA symbols.
- TranspileChunk - A single output unit produced by the streaming transpiler.
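To illustrate the idea behind a bidirectional term-to-symbol table, here is a minimal sketch that is not the crate's `SymbolDict`. The type `SketchDict` and its methods `intern` and `resolve` are hypothetical names. The sketch assumes "PUA" means the Unicode Private Use Area in the Basic Multilingual Plane (U+E000 to U+F8FF), and that each term is replaced by one such code point.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a term <-> symbol table.
/// This is an illustration only, not the crate's `SymbolDict`.
struct SketchDict {
    term_to_symbol: HashMap<String, char>,
    symbol_to_term: HashMap<char, String>,
    next: u32, // next unassigned code point
}

impl SketchDict {
    // Assumed range: the BMP Private Use Area.
    const PUA_START: u32 = 0xE000;
    const PUA_END: u32 = 0xF8FF;

    fn new() -> Self {
        Self {
            term_to_symbol: HashMap::new(),
            symbol_to_term: HashMap::new(),
            next: Self::PUA_START,
        }
    }

    /// Returns the symbol for `term`, allocating a new one on first use.
    /// Returns `None` once the assumed PUA range is exhausted.
    fn intern(&mut self, term: &str) -> Option<char> {
        if let Some(&symbol) = self.term_to_symbol.get(term) {
            return Some(symbol);
        }
        if self.next > Self::PUA_END {
            return None;
        }
        let symbol = char::from_u32(self.next)?;
        self.next += 1;
        self.term_to_symbol.insert(term.to_string(), symbol);
        self.symbol_to_term.insert(symbol, term.to_string());
        Some(symbol)
    }

    /// Reverse lookup: maps a symbol back to its original term.
    fn resolve(&self, symbol: char) -> Option<&str> {
        self.symbol_to_term.get(&symbol).map(String::as_str)
    }
}
```

Keeping two maps costs extra memory but makes both directions a single hash lookup, which matters when the same table is used to compress text and later to expand it.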
Enums§
- CompressionStage - Compression stage enumeration.
- DocNode - A semantic unit that makes up a document.
- FidelityLevel - The degree of information loss permitted during document conversion.
- InputFormat - Input document format.
- StreamError - Streaming transpile error.
- TranspileError - Transpile error.
Constants§
- MAX_INPUT_BYTES - Maximum input size accepted by transpile and transpile_stream. Inputs larger than this limit are rejected with TranspileError::InputTooLarge to prevent resource exhaustion on unbounded documents.
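The size guard described above can be sketched as a standalone check. This is an assumption-laden illustration, not the crate's code: `SKETCH_MAX_INPUT_BYTES`, `SketchError`, and `check_input_size` are hypothetical names, and the limit of 1024 bytes is a placeholder because the real value of `MAX_INPUT_BYTES` is not stated here.

```rust
/// Placeholder limit for illustration only; the crate's actual
/// `MAX_INPUT_BYTES` value is not given in this page.
const SKETCH_MAX_INPUT_BYTES: usize = 1024;

/// Hypothetical error type mirroring the idea of `TranspileError::InputTooLarge`.
#[derive(Debug, PartialEq)]
enum SketchError {
    InputTooLarge { size: usize, limit: usize },
}

/// Rejects inputs strictly larger than the limit before any parsing happens.
fn check_input_size(input: &str) -> Result<&str, SketchError> {
    let size = input.len(); // length in bytes, not in characters
    if size > SKETCH_MAX_INPUT_BYTES {
        return Err(SketchError::InputTooLarge {
            size,
            limit: SKETCH_MAX_INPUT_BYTES,
        });
    }
    Ok(input)
}
```

Checking the byte length first is cheap and runs before any allocation that scales with the input, which is the point of bounding resource use.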
Functions§
- build_yaml_header - Builds a YAML header block from the IRDocument’s metadata.
- linearize_table - Converts a table into token-efficient text.
- render_full - Renders an entire IRDocument as a bridge-format string.
- render_node - Renders a single DocNode as bridge text.
- token_count - Returns the approximate token count for the given text.
- transpile - Converts a document synchronously into the bridge format.
- transpile_stream - Converts a document into a Tokio stream.
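As a rough illustration of what table linearization and approximate token counting can look like, the sketch below uses two hypothetical helpers, `linearize_rows` and `approx_tokens`. Neither reflects the actual behavior or signatures of `linearize_table` or `token_count`. The output layout (one pipe-joined line per row) and the four-characters-per-token heuristic are both assumptions chosen for simplicity.

```rust
/// Hypothetical linearizer: emits the header once, then one line per row,
/// with cells trimmed and joined by '|'. Padding and separator rows that a
/// Markdown table would carry are dropped.
fn linearize_rows(headers: &[&str], rows: &[Vec<&str>]) -> String {
    let mut lines = Vec::with_capacity(rows.len() + 1);
    lines.push(headers.iter().map(|h| h.trim()).collect::<Vec<_>>().join("|"));
    for row in rows {
        lines.push(row.iter().map(|c| c.trim()).collect::<Vec<_>>().join("|"));
    }
    lines.join("\n")
}

/// Crude token estimate, assuming roughly four characters per token and
/// rounding up. Real tokenizers vary by model and by language.
fn approx_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}
```

Comparing `approx_tokens` on the original table text and on the linearized text gives a quick, if coarse, sense of how much a given layout saves.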