Crate langextract_rust

§LangExtract

A Rust library for extracting structured and grounded information from text using LLMs.

This library provides an async API for extracting structured data from unstructured text through multiple language model providers.

§Features

  • Support for multiple LLM providers (Gemini, OpenAI, Ollama)
  • Async/await API for concurrent processing
  • Schema-driven extraction with validation
  • Text chunking and tokenization
  • Flexible output formats (JSON, YAML)
  • Built-in visualization and progress tracking
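
To illustrate why chunking appears in the feature list: long documents usually have to be split into pieces that fit a model's context window before extraction. The following is a minimal, standard-library-only sketch of word-boundary chunking. The function name `chunk_by_words` and its `max_chars` parameter are hypothetical and are not part of this crate's `chunking` module, whose actual strategy may differ.

```rust
/// Hypothetical helper (NOT the crate's API): split `text` into chunks of at
/// most `max_chars` characters, breaking only on whitespace.
/// A single word longer than `max_chars` is kept whole in its own chunk.
fn chunk_by_words(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut current_len = 0usize; // length in chars, not bytes

    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Length of the chunk if this word were appended (+1 for the joining space).
        let needed = if current.is_empty() {
            word_len
        } else {
            current_len + 1 + word_len
        };
        // Close the current chunk when the next word would overflow it.
        if needed > max_chars && !current.is_empty() {
            chunks.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Calling `chunk_by_words("Alice Smith is 25 years old", 12)` yields the three chunks `"Alice Smith"`, `"is 25 years"`, and `"old"`. Note that this sketch normalizes runs of whitespace to single spaces, so it does not preserve original character offsets.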

§Quick Start

use langextract_rust::{extract, ExampleData, Extraction};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Few-shot example: a sample text paired with the extractions expected from it.
    let examples = vec![
        ExampleData {
            text: "John Doe is 30 years old".to_string(),
            extractions: vec![
                Extraction::new("person".to_string(), "John Doe".to_string()),
                Extraction::new("age".to_string(), "30".to_string()),
            ],
        }
    ];

    // Run extraction on new text, guided by the prompt description and the
    // examples above, using the default configuration.
    let result = extract(
        "Alice Smith is 25 years old and works as a doctor",
        Some("Extract person names and ages from the text"),
        &examples,
        Default::default(),
    ).await?;

    println!("{:?}", result);
    Ok(())
}

§Re-exports

pub use data::AlignmentStatus;
pub use data::AnnotatedDocument;
pub use data::CharInterval;
pub use data::Document;
pub use data::ExampleData;
pub use data::Extraction;
pub use data::FormatType;
pub use exceptions::LangExtractError;
pub use exceptions::LangExtractResult;
pub use inference::BaseLanguageModel;
pub use inference::ScoredOutput;
pub use providers::ProviderConfig;
pub use providers::ProviderType;
pub use providers::UniversalProvider;
pub use resolver::ValidationConfig;
pub use resolver::ValidationResult;
pub use resolver::ValidationError;
pub use resolver::ValidationWarning;
pub use resolver::CoercionSummary;
pub use resolver::CoercionDetail;
pub use resolver::CoercionTargetType;
pub use visualization::ExportFormat;
pub use visualization::ExportConfig;
pub use visualization::export_document;

§Modules

alignment
Text alignment functionality for mapping extractions to source text positions.
annotation
Text annotation functionality.
chunking
Text chunking functionality for processing large documents.
data
Core data types for the annotation pipeline.
exceptions
Error types and result definitions for LangExtract.
factory
Factory for creating language model instances.
inference
Language model inference abstractions and implementations.
io
I/O utilities for loading text from various sources.
multipass
Multi-pass extraction system for improved recall and quality.
progress
Progress tracking functionality.
prompting
Advanced prompt template system with dynamic variables and provider adaptation.
providers
Language model provider implementations.
resolver
Output resolution and parsing functionality.
schema
Schema definitions and abstractions for structured prompt outputs.
tokenizer
Text tokenization functionality.
visualization
Visualization utilities for annotated documents.
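
As a rough illustration of what "mapping extractions to source text positions" involves, the sketch below locates an extracted string in the source text and reports its position in character (not byte) offsets. It uses only the standard library and exact matching. The `CharSpan` type and `locate` function are hypothetical names chosen for this example; they are not the crate's `CharInterval` or `alignment` API, which may use a different representation or fuzzy matching.

```rust
/// Hypothetical span type for this sketch (NOT the crate's `CharInterval`).
/// Offsets count Unicode scalar values; `end` is exclusive.
#[derive(Debug, PartialEq)]
struct CharSpan {
    start: usize,
    end: usize,
}

/// Find the first exact occurrence of `needle` in `source` and return its
/// character span, or `None` if it is absent or `needle` is empty.
fn locate(source: &str, needle: &str) -> Option<CharSpan> {
    if needle.is_empty() {
        return None;
    }
    // `str::find` returns a byte offset that lies on a char boundary.
    let byte_start = source.find(needle)?;
    // Convert the byte offset to a character offset.
    let start = source[..byte_start].chars().count();
    let end = start + needle.chars().count();
    Some(CharSpan { start, end })
}
```

For instance, `locate("Alice Smith is 25 years old", "25")` returns `Some(CharSpan { start: 15, end: 17 })`. Counting characters rather than bytes matters for non-ASCII input: in `"Zoë is 30"` the substring `"30"` starts at character 7 but at byte 8.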

§Structs

ExtractConfig
Configuration for the extract function

§Functions

extract
Main extraction function that mirrors the Python API
visualize
Visualization function that mirrors the Python API