Crate llm_transpile

§llm-transpiler

A high-performance Rust library that converts raw documents (Markdown, HTML, plain text, tables, etc.) into a structured bridge format, so that LLM agents receive as much information as possible in as few tokens as possible.

§Quick Start

use llm_transpile::{transpile, FidelityLevel, InputFormat};

let md = "# Contract\n\nThis agreement was concluded in 2024.";
let result = transpile(md, InputFormat::Markdown, FidelityLevel::Semantic, Some(4096))
    .expect("transpile failed");
println!("{}", result);

§Streaming Usage

use llm_transpile::{transpile_stream, FidelityLevel, InputFormat};
use futures::StreamExt;

async fn example() {
    let md = "# Document\n\nThis is a paragraph.";
    let mut stream = transpile_stream(md, InputFormat::Markdown, FidelityLevel::Semantic, 4096).await;
    while let Some(chunk) = stream.next().await {
        let chunk = chunk.expect("stream error");
        print!("{}", chunk.content);
        if chunk.is_final { break; }
    }
}

Structs§

AdaptiveCompressor
Budget-based adaptive document compressor.
CompressionConfig
Context provided when running the compressor.
IRDocument
The complete IR representation of a parsed document.
StreamingTranspiler
Tokio channel-based streaming transpiler.
SymbolDict
Bidirectional mapping table between technical terms and PUA symbols.
TranspileChunk
A single output unit produced by the streaming transpiler.

Enums§

CompressionStage
Compression stage enumeration.
DocNode
A semantic unit that makes up a document.
FidelityLevel
The degree of information loss permitted during document conversion.
InputFormat
Input document format.
StreamError
Streaming transpile error.
TranspileError
Transpile error.

Constants§

MAX_INPUT_BYTES
Maximum input size accepted by transpile and transpile_stream. Inputs larger than this limit are rejected with TranspileError::InputTooLarge to prevent resource exhaustion on unbounded documents.
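
A caller that wants to fail fast can apply the same kind of size check before handing a document to the library. The sketch below is self-contained and does not call into the crate. `LIMIT_BYTES`, `TooLarge`, and `check_size` are hypothetical names introduced only for illustration, and the limit value is arbitrary because the actual value of MAX_INPUT_BYTES is not stated on this page.

```rust
/// Hypothetical limit used only in this sketch. The real value of
/// `MAX_INPUT_BYTES` is defined by the crate and is not assumed here.
const LIMIT_BYTES: usize = 1024;

/// Hypothetical error type standing in for an "input too large" rejection.
#[derive(Debug, PartialEq)]
struct TooLarge {
    size: usize,
    limit: usize,
}

/// Returns the input unchanged when it fits within `limit` bytes,
/// or an error describing the overflow.
fn check_size(input: &str, limit: usize) -> Result<&str, TooLarge> {
    // `str::len` reports the length in bytes, not in characters.
    if input.len() > limit {
        Err(TooLarge { size: input.len(), limit })
    } else {
        Ok(input)
    }
}

fn main() {
    let small = "# Contract";
    let large = "x".repeat(LIMIT_BYTES + 1);
    println!("{:?}", check_size(small, LIMIT_BYTES));
    println!("{:?}", check_size(&large, LIMIT_BYTES).is_err());
}
```

The comparison is made in bytes, so a document with many multi-byte characters can exceed a limit even when its character count looks small.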

Functions§

build_yaml_header
Builds a YAML header block from the IRDocument’s metadata.
linearize_table
Converts a table into token-efficient text.
render_full
Renders an entire IRDocument as a bridge-format string.
render_node
Renders a single DocNode as bridge text.
token_count
Returns the approximate token count for the given text.
transpile
Converts a document synchronously into the bridge format.
transpile_stream
Converts a document into the bridge format, delivered as a Tokio-based stream of chunks.
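
To make the ideas behind linearize_table and token_count concrete, here is a minimal standalone sketch of one possible table linearization and one rough token estimate. It does not use the crate and is not its implementation: `linearize` and `estimate_tokens` are hypothetical helpers, the `key: value` row layout is an assumed format, and the four-bytes-per-token ratio is only a common rule of thumb.

```rust
/// Flattens a table into one line per row, pairing each header with its cell.
/// Hypothetical helper: the crate's own output format may differ.
fn linearize(headers: &[&str], rows: &[Vec<&str>]) -> String {
    rows.iter()
        .map(|row| {
            headers
                .iter()
                .zip(row.iter())
                .map(|(header, cell)| format!("{header}: {cell}"))
                .collect::<Vec<_>>()
                .join("; ")
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Rough token estimate: about one token per four bytes, rounded up.
/// Hypothetical heuristic: real tokenizers give different counts.
fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

fn main() {
    let headers = ["name", "qty"];
    let rows = vec![vec!["bolt", "40"], vec!["nut", "12"]];
    let text = linearize(&headers, &rows);
    println!("{text}");
    println!("estimated tokens: {}", estimate_tokens(&text));
}
```

Dropping the pipe and dash scaffolding of a Markdown table keeps each row self-describing, which is the kind of trade-off a linearizer makes. Whether it saves tokens for a given table depends on the tokenizer and on how long the header names are.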