Text processing and chunking
Re-exports§
pub use semantic_chunking::SemanticChunk;
pub use semantic_chunking::SemanticChunker;
pub use semantic_chunking::SemanticChunkerConfig;
pub use semantic_chunking::BreakpointStrategy;
pub use document_structure::DocumentStructure;
pub use document_structure::Heading;
pub use document_structure::Section;
pub use document_structure::HeadingHierarchy;
pub use document_structure::SectionNumber;
pub use document_structure::SectionNumberFormat;
pub use document_structure::StructureStatistics;
pub use analysis::TextAnalyzer;
pub use analysis::TextStats;
pub use keyword_extraction::TfIdfKeywordExtractor;
pub use extractive_summarizer::ExtractiveSummarizer;
pub use layout_parser::LayoutParser;
pub use layout_parser::LayoutParserFactory;
pub use chunk_enricher::ChunkEnricher;
pub use chunk_enricher::EnrichmentStatistics;
pub use chunking_strategies::HierarchicalChunkingStrategy;
pub use chunking_strategies::SemanticChunkingStrategy;
Modules§
- analysis: Text analysis utilities for document structure detection
- chunk_enricher: Chunk enrichment pipeline
- chunking: Text chunking utilities
- chunking_strategies: Trait-based chunking strategy implementations
- document_structure: Document structure representation for hierarchical parsing
- extractive_summarizer: Extractive summarization with sentence ranking
- keyword_extraction: TF-IDF keyword extraction
- layout_parser: Layout parser trait and factory for document structure detection
- parsers: Document layout parsers
- semantic_chunking: Semantic chunking based on embedding similarity, for RAG
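The semantic_chunking module is described as chunking based on embedding similarity. The sketch below illustrates that general idea only and is not this crate's implementation: it does not use SemanticChunker, SemanticChunkerConfig, or BreakpointStrategy, whose signatures are not shown on this page. The helper names cosine_similarity and split_by_similarity, the fixed-threshold breakpoint rule, and the assumption that sentence embeddings are already computed are all hypothetical choices made for illustration.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
/// (Hypothetical helper, not part of the documented crate.)
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Groups consecutive sentence indices into chunks, starting a new chunk
/// wherever the similarity between neighbouring sentence embeddings falls
/// below `threshold`. (Hypothetical helper; a fixed threshold is only one
/// possible breakpoint rule.)
fn split_by_similarity(embeddings: &[Vec<f32>], threshold: f32) -> Vec<Vec<usize>> {
    let mut chunks: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    for i in 0..embeddings.len() {
        // A drop in similarity to the previous sentence marks a breakpoint.
        if i > 0 && cosine_similarity(&embeddings[i - 1], &embeddings[i]) < threshold {
            chunks.push(std::mem::take(&mut current));
        }
        current.push(i);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

For example, given the four toy embeddings [1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9] and a threshold of 0.5, split_by_similarity groups sentences 0 and 1 into one chunk and sentences 2 and 3 into another, because only the middle pair of neighbours is dissimilar.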
Structs§
- LanguageDetector: Language detection utilities
- TextProcessor: Text processing utilities for chunking and preprocessing