CST-aware code chunking.
Splits source files into semantically meaningful chunks using the concrete syntax tree (CST) produced by ast-grep/tree-sitter. The algorithm:
- If a CST node fits within max_chunk_size (non-whitespace chars) -> emit it as a chunk.
- If too large -> recurse into named children.
- Adjacent small siblings are merged greedily until the merged size would exceed max_chunk_size.
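The three rules above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Node`, `size`, `chunk_node`, and `merge_adjacent` are hypothetical names, `Node` is a plain stand-in for an ast-grep/tree-sitter node, and the sketch assumes that an oversized node with no named children is emitted as-is and that merged pieces are joined with a newline.

```rust
/// Hypothetical stand-in for a CST node (the real chunker walks
/// ast-grep/tree-sitter nodes instead).
#[derive(Debug, Clone)]
pub struct Node {
    /// Source text covered by this node.
    pub text: String,
    /// Named children only, in source order.
    pub children: Vec<Node>,
}

/// Size metric: number of non-whitespace characters.
pub fn size(text: &str) -> usize {
    text.chars().filter(|c| !c.is_whitespace()).count()
}

/// Chunk a node: emit it whole if it fits, otherwise recurse into its
/// named children and greedily merge adjacent small pieces.
pub fn chunk_node(node: &Node, max_chunk_size: usize) -> Vec<String> {
    // Rule 1: the node fits. Assumption: a leaf that is too large is
    // also emitted as-is, since there is nothing to recurse into.
    if size(&node.text) <= max_chunk_size || node.children.is_empty() {
        return vec![node.text.clone()];
    }
    // Rule 2: too large, so recurse into named children.
    let mut pieces = Vec::new();
    for child in &node.children {
        pieces.extend(chunk_node(child, max_chunk_size));
    }
    // Rule 3: merge adjacent small pieces.
    merge_adjacent(pieces, max_chunk_size)
}

/// Greedy merge: append a piece to the previous chunk while the
/// combined size stays within `max_chunk_size`.
fn merge_adjacent(pieces: Vec<String>, max_chunk_size: usize) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for piece in pieces {
        if let Some(last) = out.last_mut() {
            if size(last.as_str()) + size(&piece) <= max_chunk_size {
                last.push('\n');
                last.push_str(&piece);
                continue;
            }
        }
        out.push(piece);
    }
    out
}
```

Because whitespace is excluded from the size metric, joining merged pieces with a newline never changes the measured size of a chunk.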
Each chunk records its parent symbol (resolved by line-range containment).
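Line-range containment could look roughly like the sketch below. The `Symbol` type and `parent_symbol` function are hypothetical, and two details are assumptions rather than documented behavior: containment is inclusive on both ends, and when several symbols contain the chunk the innermost (shortest) one is chosen.

```rust
/// Hypothetical symbol record with an inclusive line range.
#[derive(Debug, Clone, PartialEq)]
pub struct Symbol {
    pub name: String,
    pub start_line: usize,
    pub end_line: usize,
}

/// Return the symbol whose line range contains the chunk's range.
/// Assumption: if several symbols qualify, prefer the one spanning
/// the fewest lines (the innermost enclosing symbol).
pub fn parent_symbol<'a>(
    symbols: &'a [Symbol],
    chunk_start: usize,
    chunk_end: usize,
) -> Option<&'a Symbol> {
    symbols
        .iter()
        .filter(|s| s.start_line <= chunk_start && chunk_end <= s.end_line)
        .min_by_key(|s| s.end_line - s.start_line)
}
```

A chunk that lies outside every symbol's range gets `None`, which a caller could treat as a top-level chunk.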
Structs
- ChunkConfig - Configuration for the chunker.
- CodeChunk - A code chunk produced by the CST-aware chunker.
Functions
- chunk_file - Chunk a file using its CST.