pub trait ChunkSizer: Send + Sync {
    // Required method
    fn size(&self, text: &str) -> usize;
}
Measures the size of a chunk for size-budget comparisons.
CodeChunker uses a ChunkSizer to decide whether
a node fits within max_chunk_size and whether to merge atomic chunks.
The default is ByteSizer, which measures byte length. Plug in a tokenizer-backed
sizer to size chunks in tokens, so that the budget matches your embedding model’s
actual context limit instead of approximating it with bytes.
max_chunk_size is interpreted in whatever unit the sizer returns —
bytes for the default ByteSizer, tokens for a tokenizer-backed sizer.
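
As a minimal sketch of how the unit follows the sizer, the example below repeats the trait declaration so that it compiles on its own, then implements it twice. `ByteLenSizer`, `WordCountSizer`, and `fits` are hypothetical names introduced for illustration and are not part of the crate. Counting whitespace-separated words is only a stand-in for a real tokenizer, and the `<=` comparison in `fits` is an assumption about how a budget check might look, not the actual CodeChunker logic.

```rust
/// The trait as declared above, repeated so the sketch is self-contained.
pub trait ChunkSizer: Send + Sync {
    fn size(&self, text: &str) -> usize;
}

/// Hypothetical byte-based sizer: returns the UTF-8 byte length.
pub struct ByteLenSizer;

impl ChunkSizer for ByteLenSizer {
    fn size(&self, text: &str) -> usize {
        text.len()
    }
}

/// Hypothetical token-like sizer: counts whitespace-separated words.
/// A tokenizer-backed sizer would call its tokenizer here instead.
pub struct WordCountSizer;

impl ChunkSizer for WordCountSizer {
    fn size(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

/// Illustrative budget check (assumed semantics): `max_chunk_size` is
/// compared in whatever unit `sizer` returns.
pub fn fits(sizer: &dyn ChunkSizer, text: &str, max_chunk_size: usize) -> bool {
    sizer.size(text) <= max_chunk_size
}
```

With a budget of 8, the text `fn main() {}` fits under `WordCountSizer` (3 words) but not under `ByteLenSizer` (12 bytes), which is why max_chunk_size has to be chosen with the sizer's unit in mind.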