Available on crate feature rag only.
Document chunking strategies.
This module provides the Chunker trait and three implementations:
- FixedSizeChunker: splits by character count with configurable overlap
- RecursiveChunker: splits hierarchically by paragraphs, sentences, then words
- MarkdownChunker: splits by markdown headers, preserving header context
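To make the overlap idea concrete, here is a minimal sketch of fixed-size chunking by character count. The function `chunk_fixed` and its parameters `chunk_size` and `overlap` are hypothetical names chosen for illustration; they are not taken from this crate, whose actual constructor and method signatures may differ.

```rust
/// Hypothetical helper (not this crate's API): splits `text` into chunks of at
/// most `chunk_size` characters, where consecutive chunks share `overlap`
/// characters.
fn chunk_fixed(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    // Collect into chars so that sizes count characters, not bytes.
    let chars: Vec<char> = text.chars().collect();
    // Each new chunk starts `step` characters after the previous one.
    let step = chunk_size - overlap;

    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With these assumptions, `chunk_fixed("abcdefghij", 4, 1)` yields `["abcd", "defg", "ghij"]`: each chunk repeats the final character of the one before it.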
Structs
- FixedSizeChunker - Splits text into fixed-size chunks by character count with configurable overlap.
- MarkdownChunker - Splits text by markdown headers, keeping each section as a chunk.
- RecursiveChunker - Splits text hierarchically: paragraphs → sentences → words.
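The hierarchical strategy can be sketched as a recursion over a list of separators, ordered from coarsest to finest. This is a simplified illustration under stated assumptions, not the crate's implementation: the function `split_recursive`, the `max_chars` limit, and the separator list are all hypothetical. The sketch drops the separators it splits on and does not merge small neighbouring pieces back together, which a production chunker would likely do.

```rust
/// Hypothetical sketch (not this crate's API): returns pieces of `text`, each at
/// most `max_chars` characters where possible, by trying `separators` in order
/// from coarsest to finest.
fn split_recursive(text: &str, max_chars: usize, separators: &[&str]) -> Vec<String> {
    let text = text.trim();
    if text.is_empty() {
        return Vec::new();
    }
    // Small enough already: keep the piece whole.
    if text.chars().count() <= max_chars {
        return vec![text.to_string()];
    }
    match separators.split_first() {
        // Split on the current separator, then recurse with the finer ones.
        Some((sep, finer)) => text
            .split(*sep)
            .flat_map(|part| split_recursive(part, max_chars, finer))
            .collect(),
        // No finer separator left: emit the oversized piece unchanged.
        None => vec![text.to_string()],
    }
}
```

For example, calling it with the separators `["\n\n", ". ", " "]` (paragraphs, then sentences, then words) and `max_chars = 10` on `"One two three. Four five.\n\nSix."` produces `["One", "two", "three", "Four five.", "Six."]`. The first sentence is too long and falls through to word level, while the second fits and stays intact.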
Traits
- Chunker - A strategy for splitting documents into chunks.
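A strategy trait of this kind lets callers swap chunking behaviour behind one interface. The sketch below assumes a single method, `chunk(&self, text: &str) -> Vec<String>`; that signature is a guess for illustration, and the real `Chunker` trait may take different arguments or return a richer chunk type. `HeaderSplitter` and `chunk_with` are likewise hypothetical. The implementation shown starts a new chunk at each markdown header line and keeps the header with its section; it does not handle `#` characters inside fenced code blocks.

```rust
/// Assumed shape of the trait; the real `Chunker` signature may differ.
trait Chunker {
    fn chunk(&self, text: &str) -> Vec<String>;
}

/// Hypothetical implementation: begins a new chunk at every markdown header.
struct HeaderSplitter;

impl Chunker for HeaderSplitter {
    fn chunk(&self, text: &str) -> Vec<String> {
        let mut chunks: Vec<String> = Vec::new();
        let mut current = String::new();
        for line in text.lines() {
            // A header line starts with '#': flush the section collected so far.
            if line.starts_with('#') && !current.trim().is_empty() {
                chunks.push(current.trim_end().to_string());
                current.clear();
            }
            current.push_str(line);
            current.push('\n');
        }
        // Flush the final section, if it has any content.
        if !current.trim().is_empty() {
            chunks.push(current.trim_end().to_string());
        }
        chunks
    }
}

/// Callers can depend on the trait object rather than a concrete strategy.
fn chunk_with(chunker: &dyn Chunker, text: &str) -> Vec<String> {
    chunker.chunk(text)
}
```

Under these assumptions, `chunk_with(&HeaderSplitter, "# A\nalpha\n\n## B\nbeta\n")` returns two chunks, `"# A\nalpha"` and `"## B\nbeta"`, each beginning with its own header.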