Crate llmx


Utilities for working with LLM outputs.

Currently, this crate focuses on turning “fuzzy” JSON-like text (common in LLM responses) into a real serde_json::Value.

§Example

use llmx::json::parse_fuzzy_json;

let v = parse_fuzzy_json("```json\n{'a': True, b: None}\n```").unwrap();
assert_eq!(v["a"], true);
assert!(v["b"].is_null());
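
The input above is wrapped in a Markdown code fence, which is one of the first things a fuzzy parser has to deal with. As a rough illustration of that single step, here is a minimal standard-library sketch. The helper `strip_code_fence` is hypothetical and is not part of llmx; how parse_fuzzy_json actually handles fences is not described on this page.

```rust
/// Hypothetical helper (not an llmx API): returns the text inside the first
/// Markdown code fence, or the trimmed input when no opening fence is found.
fn strip_code_fence(input: &str) -> &str {
    let trimmed = input.trim();
    // Look for an opening fence such as ``` or ```json.
    let Some(start) = trimmed.find("```") else {
        return trimmed;
    };
    let after_fence = &trimmed[start + 3..];
    // Skip the optional language tag, which runs to the end of the line.
    let Some(newline) = after_fence.find('\n') else {
        return trimmed;
    };
    let body = &after_fence[newline + 1..];
    // The body ends at the closing fence. If the fence is still open
    // (for example, mid-stream), keep everything seen so far.
    match body.find("```") {
        Some(end) => body[..end].trim(),
        None => body.trim(),
    }
}
```

The sketch only removes the fence; rewriting single quotes, bare keys, and Python-style literals such as True and None would be separate steps.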

§Streaming JSON Parsing

For streaming LLM outputs, use StreamingJsonParser to parse incomplete JSON at any point in time:

use llmx::json::StreamingJsonParser;

let mut parser = StreamingJsonParser::new();

// Simulate streaming input
parser.push(r#"{"name":"#);
let v = parser.parse_partial().unwrap();
assert!(v.is_object());

parser.push(r#""Alice","age":30}"#);
let v = parser.parse_partial().unwrap();
assert_eq!(v["name"], "Alice");
assert_eq!(v["age"], 30);
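
One common way to make a truncated document parseable is to close whatever is still open. The sketch below shows that idea using only the standard library. It is an assumption about the general technique, not a description of how StreamingJsonParser works internally, and the function name `close_partial_json` is hypothetical.

```rust
/// Hypothetical helper (not an llmx API): closes any string, array, or
/// object left open in `partial` so a strict JSON parser can accept it.
fn close_partial_json(partial: &str) -> String {
    let mut closers: Vec<char> = Vec::new(); // pending `}` / `]`, innermost last
    let mut in_string = false;
    let mut escaped = false;

    for ch in partial.chars() {
        if in_string {
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => closers.push('}'),
            '[' => closers.push(']'),
            '}' | ']' => {
                closers.pop();
            }
            _ => {}
        }
    }

    let mut out = String::from(partial);
    if in_string {
        if escaped {
            out.pop(); // drop a dangling backslash
        }
        out.push('"');
    } else {
        let kept = out.trim_end().len();
        out.truncate(kept);
        if out.ends_with(':') {
            out.push_str("null"); // key without a value yet
        } else if out.ends_with(',') {
            out.pop(); // trailing comma
        }
    }
    // Close containers from the innermost outwards.
    while let Some(c) = closers.pop() {
        out.push(c);
    }
    out
}
```

This sketch does not handle a key that has no colon yet or a literal cut off mid-word (such as `tru`); a production parser needs to cover those cases as well.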

§Streaming Text Metrics

For streaming chat/completions outputs, use StreamingTextAccumulator to collect the final text and compute simple throughput metrics. To get tokens/s without supplying token counts yourself, attach a StreamingTextTokenCounter. The accumulator can also detect stalled or slow streams:

use llmx::{
    StreamingTextAccumulator, StreamingTextMonitorConfig, StreamingTextMonitorState,
    StreamingTextTokenCounter,
};
use std::time::Duration;

let acc = StreamingTextAccumulator::new()
    .with_provider("openai")
    .with_model("gpt-4o-mini");

// Approximation: tokens ≈ chars/4. For better accuracy, enable the `tiktoken` feature and use the Tiktoken* variants.
let mut acc = acc.with_token_counter(StreamingTextTokenCounter::ApproxChars {
    chars_per_token: 4.0,
});
acc.push("Hello");
acc.push(", world!");

let report = acc.monitor(
    StreamingTextMonitorConfig::new()
        .with_stall_timeout(Duration::from_secs(5))
        .with_slow_window(Duration::from_secs(10))
        .with_min_tokens_per_sec(1.0),
);
assert_ne!(report.state, StreamingTextMonitorState::Stalled);

acc.finish();

assert_eq!(acc.text(), "Hello, world!");
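
To make the arithmetic behind the character-based approximation and the monitor thresholds concrete, here is a small standard-library sketch. The function names, the rounding choice (rounding up), and the stall rule are assumptions made for illustration; the crate's actual formulas are not documented on this page.

```rust
use std::time::Duration;

/// Hypothetical helper: approximates a token count as
/// `chars / chars_per_token`, rounded up (the rounding is an assumption).
fn approx_tokens(text: &str, chars_per_token: f64) -> u64 {
    let chars = text.chars().count() as f64;
    (chars / chars_per_token).ceil() as u64
}

/// Hypothetical helper: average throughput in tokens per second.
/// Returns 0.0 when no time has elapsed, to avoid dividing by zero.
fn tokens_per_sec(tokens: u64, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        tokens as f64 / secs
    } else {
        0.0
    }
}

/// Hypothetical helper: treats a stream as stalled when no chunk has
/// arrived for at least `stall_timeout`.
fn is_stalled(since_last_chunk: Duration, stall_timeout: Duration) -> bool {
    since_last_chunk >= stall_timeout
}
```

With `chars_per_token` set to 4.0, the 13 characters of "Hello, world!" come out as 4 approximate tokens under this rounding; delivered over two seconds, that is 2 tokens/s.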

Re-exports§

pub use stream::StreamingTextAccumulator;
pub use stream::StreamingTextMonitorConfig;
pub use stream::StreamingTextMonitorReport;
pub use stream::StreamingTextMonitorState;
pub use stream::StreamingTextRate;
pub use stream::StreamingTextSummary;
pub use stream::StreamingTextTokenCounter;

Modules§

json
stream