Module data

Core data types for the annotation pipeline.

This module defines the fundamental data structures used throughout the langextract library, including documents, extractions, and configuration types.

Structs

AnnotatedDocument
Annotated document with extractions
CharInterval
Represents a character interval in text
Document
Document type for input text
ExampleData
Example data for training/prompting
Extraction
Represents a single extraction from text
TokenInterval
Token interval information (placeholder for future tokenizer integration)

Enums

AlignmentStatus
Status indicating how well an extraction aligns with the source text
FormatType
Enumeration of supported output formats
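
To illustrate how a character interval and an alignment status might relate to an extraction, here is a minimal sketch. It does not reproduce this module's actual definitions: `CharSpan`, `SpanStatus`, `SpanExtraction`, and `locate` are hypothetical stand-ins, and their fields and variants are assumptions made for the example only. The sketch assumes intervals are half-open and counted in characters rather than bytes.

```rust
/// Hypothetical stand-in for a character interval: a half-open range
/// [start, end) counted in Unicode scalar values, not bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CharSpan {
    pub start: usize,
    pub end: usize,
}

/// Hypothetical stand-in for an alignment status. The variants here are
/// assumptions for this example, not the crate's actual variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpanStatus {
    /// The extracted text occurs verbatim in the source.
    Exact,
    /// The extracted text could not be located in the source.
    NotFound,
}

/// Hypothetical stand-in for an extraction: a labelled piece of text plus
/// where (if anywhere) it was found in the source.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpanExtraction {
    pub class: String,
    pub text: String,
    pub span: Option<CharSpan>,
    pub status: SpanStatus,
}

/// Locate the first verbatim occurrence of `text` in `source` and record
/// its position as a character interval.
pub fn locate(source: &str, class: &str, text: &str) -> SpanExtraction {
    let span = source.find(text).map(|byte_start| {
        // `find` returns a byte offset; convert it to a character offset.
        let start = source[..byte_start].chars().count();
        CharSpan {
            start,
            end: start + text.chars().count(),
        }
    });
    let status = if span.is_some() {
        SpanStatus::Exact
    } else {
        SpanStatus::NotFound
    };
    SpanExtraction {
        class: class.to_string(),
        text: text.to_string(),
        span,
        status,
    }
}
```

Counting in characters matters for non-ASCII input: in "café in Paris", the word "Paris" starts at byte offset 9 but at character offset 8, so this sketch would report the interval 8..13.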