kreuzberg 4.8.6

High-performance document intelligence library for Rust. Extract text, metadata, and structured data from PDFs, Office documents, and images across 91+ formats, and from 248 programming languages via tree-sitter code intelligence. Async and sync APIs are provided.
// Re-export from the canonical implementation in crate::utils::string_utils.
// The text module previously maintained its own duplicate encoding cache (HashMap,
// DefaultHasher, flat 1000-entry cap). All functionality now lives in
// crate::utils::string_utils which uses LRU eviction, AHasher, and env-configurable
// limits.
pub use crate::utils::string_utils::{calculate_text_confidence, fix_mojibake, safe_decode};
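The comment above contrasts a flat, hard-capped cache with one that uses LRU eviction and an environment-configurable limit. The following is a minimal, dependency-free sketch of that idea, not the code in `crate::utils::string_utils`. The names `LruCache`, `cache_capacity`, and the environment variable `EXAMPLE_ENCODING_CACHE_CAP` are hypothetical, and the standard library hasher stands in for AHasher to keep the example self-contained.

```rust
use std::collections::{HashMap, VecDeque};
use std::env;

// Hypothetical variable name for illustration; not taken from kreuzberg.
const CAP_ENV: &str = "EXAMPLE_ENCODING_CACHE_CAP";
// Fallback used when the variable is unset, unparsable, or zero.
const DEFAULT_CAP: usize = 1000;

/// Read the cache capacity from the environment, falling back to the default.
fn cache_capacity() -> usize {
    env::var(CAP_ENV)
        .ok()
        .and_then(|v| v.parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_CAP)
}

/// A small LRU cache: when full, the least recently used entry is evicted.
struct LruCache {
    cap: usize,
    map: HashMap<String, String>,
    // Keys ordered by recency: front is least recently used, back is most.
    order: VecDeque<String>,
}

impl LruCache {
    fn new(cap: usize) -> Self {
        LruCache {
            cap: cap.max(1),
            map: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Move `key` to the most-recently-used position (linear scan; fine for a sketch).
    fn touch(&mut self, key: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            if let Some(k) = self.order.remove(pos) {
                self.order.push_back(k);
            }
        }
    }

    /// Look up a key and mark it as recently used on a hit.
    fn get(&mut self, key: &str) -> Option<&String> {
        if self.map.contains_key(key) {
            self.touch(key);
        }
        self.map.get(key)
    }

    /// Insert or update an entry, evicting the least recently used one if over capacity.
    fn put(&mut self, key: String, value: String) {
        if self.map.insert(key.clone(), value).is_some() {
            // Existing key: only its recency changes.
            self.touch(&key);
            return;
        }
        self.order.push_back(key);
        if self.map.len() > self.cap {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
    }
}
```

A caller would construct the cache with `LruCache::new(cache_capacity())`. Unlike a flat cap that stops accepting entries or clears everything once full, this keeps recently used entries and drops only the oldest. The `VecDeque` scan in `touch` is linear in the number of entries, so a production cache would more likely use a linked structure or an existing LRU crate.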