Function split_into_chunks_hierarchical 

pub fn split_into_chunks_hierarchical(body: &str) -> Vec<Chunk>

Splits body into chunks using MarkdownSplitter. Chunk boundaries respect Markdown semantic structure (H1-H6 headings, paragraphs, blocks). Plain text without Markdown markers falls back to paragraph and sentence breaks.
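The plain-text fallback can be pictured with a small standalone sketch. This is not the MarkdownSplitter implementation: split_plain_text and max_chars are hypothetical names, and the blank-line and ". ! ?" boundary rules are simplifying assumptions made only for illustration.

```rust
/// Hypothetical helper (not part of this crate): splits plain text into
/// pieces of at most `max_chars` characters, preferring paragraph breaks
/// and falling back to sentence breaks for oversized paragraphs.
fn split_plain_text(body: &str, max_chars: usize) -> Vec<String> {
    let mut pieces = Vec::new();

    // Assumption: a paragraph break is a blank line ("\n\n").
    for paragraph in body.split("\n\n") {
        let paragraph = paragraph.trim();
        if paragraph.is_empty() {
            continue;
        }

        // A paragraph that fits the budget becomes one piece.
        if paragraph.chars().count() <= max_chars {
            pieces.push(paragraph.to_string());
            continue;
        }

        // Otherwise pack whole sentences greedily into pieces.
        // Assumption: a sentence ends at '.', '!' or '?'.
        let mut current = String::new();
        for sentence in paragraph.split_inclusive(|c: char| matches!(c, '.' | '!' | '?')) {
            let sentence = sentence.trim();
            if sentence.is_empty() {
                continue;
            }
            // +1 accounts for the joining space.
            let needed = current.chars().count() + sentence.chars().count() + 1;
            if !current.is_empty() && needed > max_chars {
                pieces.push(std::mem::take(&mut current));
            }
            if !current.is_empty() {
                current.push(' ');
            }
            // A single sentence longer than `max_chars` is kept whole.
            current.push_str(sentence);
        }
        if !current.is_empty() {
            pieces.push(current);
        }
    }

    pieces
}
```

Under these assumptions, calling split_plain_text("First para.\n\nOne. Two. Three.", 11) keeps the first paragraph intact and breaks the second one between sentences, because it does not fit in 11 characters as a whole.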

v1.0.76: the tokenizer parameter was removed. The chunker now sizes chunks with a character-based heuristic (CHARS_PER_TOKEN = 2), the same heuristic the rest of the codebase uses for Chunk::token_count_approx.
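A minimal sketch of what a character-based token estimate looks like follows. Only the constant value comes from the note above; the function names approx_token_count and max_chars_for_tokens are hypothetical, and counting Unicode scalar values (rather than bytes) and rounding up are assumptions, not confirmed behavior of Chunk::token_count_approx.

```rust
/// Value taken from the documentation above.
const CHARS_PER_TOKEN: usize = 2;

/// Hypothetical stand-in for a char-based token estimate.
/// Assumptions: counts Unicode scalar values (not bytes) and rounds up.
fn approx_token_count(text: &str) -> usize {
    let chars = text.chars().count();
    // Ceiling division, so any non-empty text counts as at least one token.
    (chars + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN
}

/// Hypothetical helper: converts a token budget into a character budget
/// under the same heuristic.
fn max_chars_for_tokens(token_budget: usize) -> usize {
    token_budget * CHARS_PER_TOKEN
}
```

With this heuristic a 512-token budget corresponds to 1024 characters, and no tokenizer has to be loaded or passed in, which matches the motivation for removing the parameter.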