pub trait Chunker: Send + Sync {
// Required methods
fn chunk(
&self,
buffer_id: i64,
text: &str,
metadata: Option<&ChunkMetadata>,
) -> Result<Vec<Chunk>>;
fn name(&self) -> &'static str;
// Provided methods
fn supports_parallel(&self) -> bool { ... }
fn description(&self) -> &'static str { ... }
fn validate(&self, metadata: Option<&ChunkMetadata>) -> Result<()> { ... }
}
Trait for chunking text into processable segments.
Implementations must be Send + Sync to support parallel processing.
Each chunker should produce consistent, deterministic output for the
same input.
§Examples
use rlm_rs::chunking::{Chunker, FixedChunker};
let chunker = FixedChunker::with_size(100);
let text = "Hello, world! ".repeat(20);
let chunks = chunker.chunk(1, &text, None).unwrap();
assert!(!chunks.is_empty());
Required Methods§
fn chunk(
&self,
buffer_id: i64,
text: &str,
metadata: Option<&ChunkMetadata>,
) -> Result<Vec<Chunk>>
Chunks the input text into segments.
§Arguments
buffer_id - ID of the source buffer.
text - The input text to chunk.
metadata - Optional metadata for context-aware chunking.
§Returns
A vector of chunks with byte offsets and metadata.
§Errors
Returns an error if chunking fails (e.g., invalid configuration).
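As an illustration of the contract described above, the following is a minimal sketch of a custom implementation. The `Chunk`, `ChunkMetadata`, and `Result` definitions here are local stand-ins, since their real definitions are not shown on this page, and `ByteChunker` is a hypothetical type, not part of rlm_rs. The trait is reduced to its two required methods.

```rust
// Stand-in types (assumptions): the real `Chunk`, `ChunkMetadata`, and
// `Result` come from rlm_rs and may have different fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub buffer_id: i64,
    pub start: usize, // byte offset, inclusive
    pub end: usize,   // byte offset, exclusive
    pub text: String,
}

pub struct ChunkMetadata;

pub type Result<T> = std::result::Result<T, String>;

// Reduced copy of the trait with only the required methods.
pub trait Chunker: Send + Sync {
    fn chunk(
        &self,
        buffer_id: i64,
        text: &str,
        metadata: Option<&ChunkMetadata>,
    ) -> Result<Vec<Chunk>>;
    fn name(&self) -> &'static str;
}

/// Hypothetical chunker that cuts text into pieces of at most `max_bytes`
/// bytes without splitting a UTF-8 character.
pub struct ByteChunker {
    pub max_bytes: usize,
}

impl Chunker for ByteChunker {
    fn chunk(
        &self,
        buffer_id: i64,
        text: &str,
        _metadata: Option<&ChunkMetadata>,
    ) -> Result<Vec<Chunk>> {
        // Invalid configuration is reported as an error, not a panic.
        if self.max_bytes == 0 {
            return Err("max_bytes must be greater than zero".to_string());
        }
        let mut chunks = Vec::new();
        let mut start = 0;
        while start < text.len() {
            let mut end = (start + self.max_bytes).min(text.len());
            // Back off to the nearest character boundary.
            while end > start && !text.is_char_boundary(end) {
                end -= 1;
            }
            if end == start {
                // The next character is wider than `max_bytes`: take it whole.
                end = start + 1;
                while !text.is_char_boundary(end) {
                    end += 1;
                }
            }
            chunks.push(Chunk {
                buffer_id,
                start,
                end,
                text: text[start..end].to_string(),
            });
            start = end;
        }
        Ok(chunks)
    }

    fn name(&self) -> &'static str {
        "byte"
    }
}
```

Because the output depends only on `text` and `max_bytes`, repeated calls with the same input yield identical chunks, which matches the determinism expectation stated for implementations.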
Provided Methods§
fn supports_parallel(&self) -> bool
Returns whether this chunker supports parallel processing.
Default is false. Chunkers that benefit from parallelization
should override this to return true.
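One way a caller might act on this flag is sketched below. The trait here is a reduced stand-in (its `chunk` returns plain strings instead of `Result<Vec<Chunk>>`), and `LineChunker`, `SequentialChunker`, and `chunk_all` are hypothetical names chosen for illustration. How rlm_rs itself dispatches on `supports_parallel` is not shown on this page.

```rust
use std::thread;

// Reduced stand-in for the trait (assumption): simplified `chunk` signature.
pub trait Chunker: Send + Sync {
    fn chunk(&self, buffer_id: i64, text: &str) -> Vec<String>;

    // Mirrors the documented default.
    fn supports_parallel(&self) -> bool {
        false
    }
}

/// Hypothetical chunker that opts in: each call is independent of the others.
pub struct LineChunker;

impl Chunker for LineChunker {
    fn chunk(&self, _buffer_id: i64, text: &str) -> Vec<String> {
        text.lines().map(str::to_string).collect()
    }

    fn supports_parallel(&self) -> bool {
        true
    }
}

/// Hypothetical chunker that keeps the default (`false`).
pub struct SequentialChunker;

impl Chunker for SequentialChunker {
    fn chunk(&self, _buffer_id: i64, text: &str) -> Vec<String> {
        vec![text.to_string()]
    }
}

/// Chunks several buffers, using one scoped thread per buffer only when the
/// chunker reports that it supports parallel processing.
pub fn chunk_all(chunker: &dyn Chunker, buffers: &[(i64, &str)]) -> Vec<Vec<String>> {
    if !chunker.supports_parallel() {
        return buffers
            .iter()
            .map(|&(id, text)| chunker.chunk(id, text))
            .collect();
    }
    thread::scope(|scope| {
        // `&dyn Chunker` can cross threads because the trait requires Sync.
        let handles: Vec<_> = buffers
            .iter()
            .map(|&(id, text)| scope.spawn(move || chunker.chunk(id, text)))
            .collect();
        // Joining in spawn order keeps results aligned with `buffers`.
        handles
            .into_iter()
            .map(|h| h.join().expect("chunker thread panicked"))
            .collect()
    })
}
```

The `Send + Sync` bound on the trait is what lets a shared reference to the chunker be moved into the spawned threads. A real caller would more likely use a thread pool than one thread per buffer.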
fn description(&self) -> &'static str
Returns a description of the chunking strategy.