§markdown-chunk
Heading-aware Markdown chunker for RAG ingestion.
Rules:
- A new chunk starts at every ATX heading (`#`, `##`…).
- Fenced code blocks (```) are never split mid-block.
- Headings with an empty body are merged with the section that follows.
- Chunks are soft-capped at `max_chars`; oversize sections are returned whole (a single 30k-char chapter is one chunk).
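The heading and fence rules above can be sketched as follows. This is a minimal illustration, not the crate's actual implementation: `is_atx_heading` and `split_sections` are hypothetical names, the fence tracking is a simple toggle on lines starting with three backticks or `~~~`, and the `max_chars` soft cap is left out because the rules do not say how it interacts with heading boundaries.

```rust
/// Hypothetical helper: true if `line` looks like an ATX heading, i.e.
/// at most 3 leading spaces, then 1-6 '#', then a space, tab, or end of line.
fn is_atx_heading(line: &str) -> bool {
    let t = line.trim_start_matches(' ');
    if line.len() - t.len() > 3 {
        return false; // 4+ spaces of indent is an indented code block
    }
    let hashes = t.chars().take_while(|&c| c == '#').count();
    (1..=6).contains(&hashes)
        && t[hashes..].chars().next().map_or(true, |c| c == ' ' || c == '\t')
}

/// Hypothetical sketch: split `md` at ATX headings, never inside a fenced
/// code block, and keep a heading with an empty body attached to what follows.
fn split_sections(md: &str) -> Vec<String> {
    let mut sections: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut in_fence = false;
    // True once `current` holds at least one non-blank, non-heading line.
    let mut has_body = false;

    for line in md.lines() {
        let trimmed = line.trim_start();
        // Simplified fence tracking: any fence marker toggles the state.
        if trimmed.starts_with("```") || trimmed.starts_with("~~~") {
            in_fence = !in_fence;
        }
        let heading = !in_fence && is_atx_heading(line);

        // Only flush when the current section has a body; otherwise the
        // pending heading is merged with the one that follows.
        if heading && has_body {
            sections.push(std::mem::take(&mut current));
            has_body = false;
        }
        if !heading && !line.trim().is_empty() {
            has_body = true;
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.is_empty() {
        sections.push(current);
    }
    sections
}
```

On an input where `# Title` has no body of its own, this sketch keeps `# Title` and `## Section A` in the same section, and a `#` line inside a fence does not start a new one.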
Each chunk carries its inherited heading trail so retrieval results show where the snippet came from.
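One way to track an inherited heading trail is a stack keyed by heading level. The sketch below is an assumption about how such a trail could be computed, not the crate's API: `parse_heading` and `heading_trails` are hypothetical names, and fenced code blocks are ignored for brevity.

```rust
/// Hypothetical helper: parse an ATX heading line into (level, title).
/// Returns None for any other line. Closing '#' sequences are not stripped.
fn parse_heading(line: &str) -> Option<(usize, String)> {
    let level = line.chars().take_while(|&c| c == '#').count();
    let rest = &line[level..];
    if (1..=6).contains(&level) && (rest.is_empty() || rest.starts_with(' ')) {
        Some((level, rest.trim().to_string()))
    } else {
        None
    }
}

/// Hypothetical sketch: for each heading in `md`, return the trail of
/// ancestor titles ending with the heading itself.
fn heading_trails(md: &str) -> Vec<Vec<String>> {
    // Stack of (level, title) for the currently open headings.
    let mut stack: Vec<(usize, String)> = Vec::new();
    let mut trails: Vec<Vec<String>> = Vec::new();

    for line in md.lines() {
        if let Some((level, title)) = parse_heading(line) {
            // A new heading closes every open heading at the same or a deeper level.
            while stack.last().map_or(false, |(l, _)| *l >= level) {
                stack.pop();
            }
            stack.push((level, title));
            trails.push(stack.iter().map(|(_, t)| t.clone()).collect());
        }
    }
    trails
}
```

For the document `# Title`, `## Section A`, `## Section B`, the trails are `["Title"]`, `["Title", "Section A"]`, and `["Title", "Section B"]`: the second `##` replaces its sibling instead of nesting under it.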
§Example
use markdown_chunk::chunk;
let md = "# Title\n\n## Section A\nbody A\n## Section B\nbody B\n";
// Cap below total size forces a split at the next heading.
let chunks = chunk(md, 20);
assert!(chunks.len() >= 2);
§Functions
- chunk: Split `md` into chunks at heading boundaries.