pub fn chunk_document(
doc: &Document,
config: &ChunkConfig,
) -> Vec<DocumentChunk>
Split a document into token-budgeted chunks.
The algorithm:
- Flatten the document’s section tree into candidates
- Greedily pack candidates into chunks without exceeding max_tokens
- Tables and lists are never split across chunks
- Breaks prefer section boundaries (headings)
- For overlap, the last section title from the previous chunk is included as context
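
The packing and overlap steps above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `Candidate`, `Chunk`, and `pack_candidates` are hypothetical stand-ins for the real `Document`, `ChunkConfig`, and `DocumentChunk` types, token counts are assumed to be precomputed, and the carried-over title is not counted against the budget. The preference for breaking at section boundaries is left out to keep the sketch short.

```rust
// Hypothetical stand-in for one flattened item from the section tree.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    /// Title of the section this candidate belongs to.
    section_title: String,
    /// Paragraph, table, or list text; always treated as an atomic unit.
    text: String,
    /// Precomputed token count (assumed; a real tokenizer would supply this).
    tokens: usize,
}

// Hypothetical stand-in for an output chunk.
#[derive(Debug, Default, PartialEq)]
struct Chunk {
    /// Last section title of the previous chunk, carried over as overlap context.
    context_title: Option<String>,
    parts: Vec<Candidate>,
    tokens: usize,
}

/// Greedily pack candidates into chunks of at most `max_tokens` tokens.
fn pack_candidates(candidates: &[Candidate], max_tokens: usize) -> Vec<Chunk> {
    let mut chunks: Vec<Chunk> = Vec::new();
    let mut current = Chunk::default();

    for cand in candidates {
        // Close the current chunk if adding this candidate would exceed the budget.
        if !current.parts.is_empty() && current.tokens + cand.tokens > max_tokens {
            let last_title = current.parts.last().map(|c| c.section_title.clone());
            chunks.push(std::mem::take(&mut current));
            // The new chunk starts with the previous chunk's last section title.
            current.context_title = last_title;
        }
        // Candidates are never split, so an oversized one ends up alone in its chunk.
        current.tokens += cand.tokens;
        current.parts.push(cand.clone());
    }

    if !current.parts.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

In this sketch a table or list is never split simply because each candidate is added whole. A candidate larger than `max_tokens` therefore produces a chunk over budget; how the real function handles that case is not stated here.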