§LangExtract
A Rust library for extracting structured and grounded information from text using LLMs.
This library provides a clean, async API for working with various language model providers to extract structured data from unstructured text.
§Features
- Support for multiple LLM providers (Gemini, OpenAI, Ollama)
- Async/await API for concurrent processing
- Schema-driven extraction with validation
- Text chunking and tokenization
- Flexible output formats (JSON, YAML)
- Built-in visualization and progress tracking
§Quick Start
use langextract_rust::{extract, ExampleData, Extraction, FormatType};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Few-shot examples: a sample text paired with the extractions expected from it.
    let examples = vec![
        ExampleData {
            text: "John Doe is 30 years old".to_string(),
            extractions: vec![
                Extraction::new("person".to_string(), "John Doe".to_string()),
                Extraction::new("age".to_string(), "30".to_string()),
            ],
        }
    ];

    // Run the extraction on new text with a task description and default settings.
    let result = extract(
        "Alice Smith is 25 years old and works as a doctor",
        Some("Extract person names and ages from the text"),
        &examples,
        Default::default(),
    ).await?;

    println!("{:?}", result);
    Ok(())
}
Re-exports§
- pub use config::LangExtractConfig;
- pub use config::ProcessingConfig;
- pub use config::ValidationConfig as NewValidationConfig;
- pub use config::ChunkingConfig;
- pub use config::AlignmentConfig as NewAlignmentConfig;
- pub use config::MultiPassConfig as NewMultiPassConfig;
- pub use config::VisualizationConfig;
- pub use config::InferenceConfig as NewInferenceConfig;
- pub use config::ProgressConfig;
- pub use config::ChunkingStrategy;
- pub use config::ExportFormat as NewExportFormat;
- pub use data::AlignmentStatus;
- pub use data::AnnotatedDocument;
- pub use data::CharInterval;
- pub use data::Document;
- pub use data::ExampleData;
- pub use data::Extraction;
- pub use data::FormatType;
- pub use exceptions::LangExtractError;
- pub use exceptions::LangExtractResult;
- pub use inference::BaseLanguageModel;
- pub use inference::ScoredOutput;
- pub use logging::ProgressHandler;
- pub use logging::ProgressEvent;
- pub use logging::ConsoleProgressHandler;
- pub use logging::SilentProgressHandler;
- pub use logging::LogProgressHandler;
- pub use providers::ProviderConfig;
- pub use providers::ProviderType;
- pub use providers::UniversalProvider;
- pub use resolver::ValidationConfig;
- pub use resolver::ValidationResult;
- pub use resolver::ValidationError;
- pub use resolver::ValidationWarning;
- pub use resolver::CoercionSummary;
- pub use resolver::CoercionDetail;
- pub use resolver::CoercionTargetType;
- pub use visualization::ExportFormat;
- pub use visualization::ExportConfig;
- pub use visualization::export_document;
- pub use pipeline::PipelineConfig;
- pub use pipeline::PipelineStep;
- pub use pipeline::PipelineResult;
- pub use pipeline::PipelineExecutor;
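The re-exported `CharInterval` and `AlignmentStatus` types suggest that extractions are grounded by character positions in the source text. The following is a minimal, standard-library-only sketch of that idea, not the crate's implementation: `CharSpan` and `align_exact` are hypothetical names introduced here for illustration, and only exact matching is assumed.

```rust
/// Hypothetical stand-in for a character interval.
/// This is NOT the crate's `CharInterval` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CharSpan {
    start: usize, // inclusive, counted in characters
    end: usize,   // exclusive, counted in characters
}

/// Find the first exact occurrence of `needle` in `source` and report it
/// as character (not byte) offsets. Returns `None` when there is no match.
fn align_exact(source: &str, needle: &str) -> Option<CharSpan> {
    if needle.is_empty() {
        return None;
    }
    // `str::find` returns a byte offset that lies on a character boundary.
    let byte_start = source.find(needle)?;
    // Convert to character offsets so multi-byte text is handled correctly.
    let start = source[..byte_start].chars().count();
    let end = start + needle.chars().count();
    Some(CharSpan { start, end })
}

fn main() {
    let source = "Alice Smith is 25 years old and works as a doctor";
    for value in ["Alice Smith", "25", "nurse"] {
        match align_exact(source, value) {
            Some(span) => println!("{value:?} -> chars {}..{}", span.start, span.end),
            None => println!("{value:?} -> not found in source"),
        }
    }
}
```

An extraction that cannot be located in the source, such as `"nurse"` here, is a candidate for being flagged as ungrounded rather than silently accepted.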
Modules§
- alignment
- Text alignment functionality for mapping extractions to source text positions.
- annotation
- Text annotation functionality.
- chunking
- Text chunking functionality for processing large documents.
- config
- Unified configuration system for LangExtract.
- data
- Core data types for the annotation pipeline.
- exceptions
- Error types and result definitions for LangExtract.
- factory
- Factory for creating language model instances.
- inference
- Language model inference abstractions and implementations.
- io
- I/O utilities for loading text from various sources.
- logging
- Logging and progress reporting system for LangExtract.
- multipass
- Multi-pass extraction system for improved recall and quality.
- pipeline
- Pipeline processing for multi-step information extraction.
- progress
- Progress tracking functionality.
- prompting
- Advanced prompt template system with dynamic variables and provider adaptation.
- providers
- Language model provider implementations.
- resolver
- Output resolution and parsing functionality.
- schema
- Schema definitions and abstractions for structured prompt outputs.
- templates
- Template engine and utilities for LangExtract.
- tokenizer
- Text tokenization functionality.
- visualization
- Visualization utilities for annotated documents.
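As a rough illustration of what the chunking module's description refers to, large documents are typically split into bounded pieces before being sent to a model. The sketch below uses only the standard library and is an assumption about the general technique, not the crate's chunking strategy: `chunk_text` is a hypothetical helper that breaks on whitespace and targets a maximum chunk length in characters.

```rust
/// Hypothetical helper: split `text` into chunks that aim to stay within
/// `max_chars` characters, breaking only at whitespace.
/// Runs of whitespace are normalized to single spaces, and a single word
/// longer than `max_chars` is emitted as its own oversized chunk.
fn chunk_text(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut current_len = 0; // length of `current` in characters

    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Close the current chunk if adding a space plus this word would overflow.
        if !current.is_empty() && current_len + 1 + word_len > max_chars {
            chunks.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    let text = "Alice Smith is 25 years old and works as a doctor";
    for (i, chunk) in chunk_text(text, 20).iter().enumerate() {
        println!("chunk {i}: {chunk:?}");
    }
}
```

Breaking at word boundaries keeps entity mentions intact within a chunk; a production chunker would more likely respect sentence or token limits as well.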
Macros§
- progress_debug
- progress_error
- progress_info
- Convenience macros for common progress events
Structs§
- ExtractConfig
- Configuration for the extract function
Functions§
- extract
- Main extraction function that mirrors the Python API
- extract_with_config
- Convenient extraction function using the new unified configuration
- visualize
- Visualize function that mirrors the Python API