Module chunking


Text chunking functionality for processing large documents.

This module splits documents that exceed the language model’s context window into smaller chunks that can be processed one at a time. It supports multiple chunking strategies and configurable overlap between adjacent chunks, so that content spanning a chunk boundary is not lost during processing.
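To make the idea of overlap concrete, here is a minimal sketch of fixed-size chunking in which each chunk repeats the tail of the previous one. It is an illustration only, not this module's implementation: `SketchChunk`, `chunk_with_overlap`, and the parameters `max_chars` and `overlap` are hypothetical names, and the sketch counts characters rather than tokens.

```rust
/// Hypothetical chunk type for illustration; not this module's `TextChunk`.
#[derive(Debug, PartialEq)]
struct SketchChunk {
    /// Character offset of the chunk's first character in the source text.
    char_offset: usize,
    /// The chunk's text.
    text: String,
}

/// Split `text` into chunks of at most `max_chars` characters. Every chunk
/// after the first starts `overlap` characters before the previous chunk ends.
fn chunk_with_overlap(text: &str, max_chars: usize, overlap: usize) -> Vec<SketchChunk> {
    assert!(max_chars > 0, "max_chars must be positive");
    assert!(overlap < max_chars, "overlap must be smaller than max_chars");

    // Index by `char` so a multi-byte UTF-8 character is never split.
    let chars: Vec<char> = text.chars().collect();
    // Each new chunk advances by this many characters.
    let step = max_chars - overlap;

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + max_chars).min(chars.len());
        chunks.push(SketchChunk {
            char_offset: start,
            text: chars[start..end].iter().collect(),
        });
        // Stop once the final character has been emitted.
        if end == chars.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

Under these assumptions, `chunk_with_overlap("abcdefghij", 4, 1)` yields three chunks, `abcd`, `defg`, and `ghij`, at character offsets 0, 3, and 6. Keeping the offset with each chunk is one way to map results from a chunk back to positions in the original document.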

Structs§

ChunkIterator
Token-based chunk iterator that mimics Python’s ChunkIterator behavior
ChunkResult
Result from processing a single chunk
ChunkingConfig
Configuration for text chunking
ResultAggregator
Result aggregator for combining extractions from multiple chunks
TextChunk
A chunk of text with metadata
TextChunker
Text chunker for processing large documents
TokenChunk
A token-based chunk with sophisticated linguistic boundaries

Enums§

ChunkingStrategy
Different strategies for chunking text