Module text

Text processing and chunking

Re-exports

pub use semantic_chunking::SemanticChunk;
pub use semantic_chunking::SemanticChunker;
pub use semantic_chunking::SemanticChunkerConfig;
pub use semantic_chunking::BreakpointStrategy;
pub use document_structure::DocumentStructure;
pub use document_structure::Heading;
pub use document_structure::Section;
pub use document_structure::HeadingHierarchy;
pub use document_structure::SectionNumber;
pub use document_structure::SectionNumberFormat;
pub use document_structure::StructureStatistics;
pub use analysis::TextAnalyzer;
pub use analysis::TextStats;
pub use keyword_extraction::TfIdfKeywordExtractor;
pub use extractive_summarizer::ExtractiveSummarizer;
pub use layout_parser::LayoutParser;
pub use layout_parser::LayoutParserFactory;
pub use chunk_enricher::ChunkEnricher;
pub use chunk_enricher::EnrichmentStatistics;
pub use chunking_strategies::HierarchicalChunkingStrategy;
pub use chunking_strategies::SemanticChunkingStrategy;

Modules

analysis
Text analysis utilities for document structure detection
chunk_enricher
Chunk enrichment pipeline
chunking
Text chunking utilities module
chunking_strategies
Trait-based chunking strategy implementations
document_structure
Document structure representation for hierarchical parsing
extractive_summarizer
Extractive summarization with sentence ranking
keyword_extraction
TF-IDF keyword extraction
layout_parser
Layout parser trait and factory for document structure detection
parsers
Document layout parsers
semantic_chunking
Semantic chunking for RAG, based on embedding similarity
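
The idea behind similarity-based semantic chunking can be illustrated with a minimal, standalone sketch. It is not this crate's API: the function names (cosine_similarity, find_breakpoints, chunk_sentences), the fixed threshold, and the precomputed embeddings are all assumptions made for illustration. A new chunk starts wherever the cosine similarity between two adjacent sentence embeddings falls below the threshold.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns every index `i` at which a new chunk should start, i.e. where
/// the similarity between sentence `i - 1` and sentence `i` is below
/// `threshold` (a hypothetical, fixed-threshold breakpoint rule).
fn find_breakpoints(embeddings: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    embeddings
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| cosine_similarity(&pair[0], &pair[1]) < threshold)
        .map(|(i, _)| i + 1)
        .collect()
}

/// Groups sentences into chunks by splitting at the detected breakpoints.
/// `embeddings[i]` is assumed to be the embedding of `sentences[i]`.
fn chunk_sentences(
    sentences: &[&str],
    embeddings: &[Vec<f32>],
    threshold: f32,
) -> Vec<String> {
    assert_eq!(sentences.len(), embeddings.len());
    let mut chunks = Vec::new();
    let mut start = 0;
    for breakpoint in find_breakpoints(embeddings, threshold) {
        chunks.push(sentences[start..breakpoint].join(" "));
        start = breakpoint;
    }
    if start < sentences.len() {
        chunks.push(sentences[start..].join(" "));
    }
    chunks
}
```

In practice the embeddings would come from an embedding model, and the breakpoint rule may be something other than a fixed threshold (for example, a percentile of the observed similarities). The sketch only shows the core splitting step.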

Structs

LanguageDetector
Language detection utilities
TextProcessor
Text processing utilities for chunking and preprocessing