Chunking strategies for RLM-RS.
This module provides a trait-based system for chunking text content into processable segments. Multiple strategies are available:
- Fixed: Simple character-based chunking with configurable size and overlap
- Semantic: Unicode-aware chunking respecting sentence/paragraph boundaries
- Code: Language-aware chunking at function/class boundaries
- Parallel: Orchestrator for parallel chunk processing
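As an illustration of the Fixed strategy, here is a minimal sketch of character-based chunking with a configurable size and overlap. It is not the crate's `FixedChunker` implementation: the function name `fixed_chunks` and its signature are hypothetical, chosen only to show how size and overlap interact.

```rust
/// Hypothetical helper (not part of the crate): splits `text` into chunks of
/// at most `size` characters. Each chunk after the first starts
/// `size - overlap` characters after the previous one, so consecutive chunks
/// share `overlap` characters.
fn fixed_chunks(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0, "chunk size must be positive");
    assert!(overlap < size, "overlap must be smaller than chunk size");

    // Index by `char` rather than by byte so multi-byte UTF-8 sequences are
    // never split in the middle.
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;

    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break; // the final chunk reached the end of the input
        }
        start += step;
    }
    chunks
}
```

With a size of 4 and an overlap of 1, the window advances 3 characters at a time, so the last character of each chunk reappears as the first character of the next.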
Re-exports§
pub use code::CodeChunker;
pub use fixed::FixedChunker;
pub use parallel::ParallelChunker;
pub use semantic::SemanticChunker;
pub use traits::ChunkMetadata as ChunkerMetadata;
pub use traits::Chunker;
Modules§
- code: Code-aware chunking strategy.
- fixed: Fixed-size chunking strategy.
- parallel: Parallel chunking orchestrator.
- semantic: Semantic chunking strategy.
- traits: Chunker trait definition.
Constants§
- DEFAULT_CHUNK_SIZE: Default chunk size in characters (~750 tokens at 4 chars/token). Sized for granular semantic search with embeddings.
- DEFAULT_OVERLAP: Default overlap size in characters (for context continuity).
- MAX_CHUNK_SIZE: Maximum allowed chunk size (50k chars, ~12.5k tokens).
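The token figures quoted for these constants follow from the 4 chars/token ratio, and the maximum suggests that chunk sizes are validated against an upper bound. The sketch below works through that arithmetic. Only the 50k maximum and the 4 chars/token ratio come from the descriptions above; `CHARS_PER_TOKEN`, `estimate_tokens`, and `validate_chunk_config` are hypothetical names, and the validation rules are an assumption about what such a check might enforce.

```rust
/// Assumed ratio taken from the constant descriptions: about 4 chars per token.
const CHARS_PER_TOKEN: usize = 4;
/// Upper bound stated in the docs (50k characters). Redeclared here so the
/// sketch is self-contained; the crate exports its own constant.
const MAX_CHUNK_SIZE: usize = 50_000;

/// Rough token estimate for a chunk of `chars` characters.
fn estimate_tokens(chars: usize) -> usize {
    chars / CHARS_PER_TOKEN
}

/// Hypothetical validation: the size must be non-zero and within the maximum,
/// and the overlap must be strictly smaller than the size.
fn validate_chunk_config(size: usize, overlap: usize) -> Result<(), String> {
    if size == 0 {
        return Err("chunk size must be greater than zero".to_string());
    }
    if size > MAX_CHUNK_SIZE {
        return Err(format!("chunk size {size} exceeds maximum {MAX_CHUNK_SIZE}"));
    }
    if overlap >= size {
        return Err(format!("overlap {overlap} must be smaller than chunk size {size}"));
    }
    Ok(())
}
```

At this ratio, 50,000 characters correspond to 12,500 tokens, and a chunk of about 750 tokens corresponds to roughly 3,000 characters.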
Functions§
- available_strategies: Lists available chunking strategy names.
- create_chunker: Creates a chunker by name.
- default_chunker: Creates the default chunker (semantic).
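The three functions above describe a name-based factory over a common trait. The sketch below shows one way such a factory could be structured. The actual signatures of `available_strategies`, `create_chunker`, and `default_chunker` are not shown on this page, so every name here (`SketchChunker`, `FixedSketch`, `ParagraphSketch`, `sketch_strategies`, `make_chunker`, `make_default_chunker`) is a hypothetical stand-in, and the strategy strings `"fixed"` and `"semantic"` are assumptions.

```rust
/// Hypothetical stand-in for the crate's `Chunker` trait.
trait SketchChunker {
    fn name(&self) -> &'static str;
    fn chunk(&self, text: &str) -> Vec<String>;
}

/// Fixed-size stand-in: splits every `size` characters, with no overlap.
struct FixedSketch {
    size: usize,
}

/// Simplified "semantic" stand-in: splits on blank lines (paragraphs) only.
struct ParagraphSketch;

impl SketchChunker for FixedSketch {
    fn name(&self) -> &'static str {
        "fixed"
    }
    fn chunk(&self, text: &str) -> Vec<String> {
        let chars: Vec<char> = text.chars().collect();
        chars
            .chunks(self.size)
            .map(|c| c.iter().collect::<String>())
            .collect()
    }
}

impl SketchChunker for ParagraphSketch {
    fn name(&self) -> &'static str {
        "semantic"
    }
    fn chunk(&self, text: &str) -> Vec<String> {
        text.split("\n\n")
            .map(str::trim)
            .filter(|p| !p.is_empty())
            .map(String::from)
            .collect()
    }
}

/// Lists the strategy names this sketch understands.
fn sketch_strategies() -> Vec<&'static str> {
    vec!["fixed", "semantic"]
}

/// Looks up a strategy by (case-insensitive) name and returns it boxed.
fn make_chunker(name: &str) -> Result<Box<dyn SketchChunker>, String> {
    match name.to_ascii_lowercase().as_str() {
        "fixed" => Ok(Box::new(FixedSketch { size: 8 })),
        "semantic" => Ok(Box::new(ParagraphSketch)),
        other => Err(format!("unknown chunking strategy: {other}")),
    }
}

/// Mirrors the documented default: the semantic strategy.
fn make_default_chunker() -> Box<dyn SketchChunker> {
    Box::new(ParagraphSketch)
}
```

Returning a boxed trait object lets callers choose a strategy at runtime (for example, from a CLI flag or configuration value) without being generic over the concrete chunker type.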