Tree-sitter based code chunking with sliding-window fallback.
Parses source files into ASTs and extracts semantic chunks at function, class, and method boundaries. Files with no recognized semantic structure, and fallback chunks that are very large, are split into overlapping sliding windows so that embedding inputs stay uniformly sized.
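The sliding-window fallback described above can be sketched roughly as follows. This is a minimal illustration, not this module's `chunk_text`: the function name `sliding_windows`, its line-based units, and the `window` and `overlap` parameters are all assumptions made for the example.

```rust
/// Hypothetical helper (not this crate's API): split `text` into windows of
/// at most `window` lines, where each window repeats the last `overlap`
/// lines of the previous one.
pub fn sliding_windows(text: &str, window: usize, overlap: usize) -> Vec<String> {
    assert!(window > 0, "window must be non-zero");
    assert!(overlap < window, "overlap must be smaller than window");

    let lines: Vec<&str> = text.lines().collect();
    // Each new window starts `step` lines after the previous one.
    let step = window - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < lines.len() {
        let end = (start + window).min(lines.len());
        chunks.push(lines[start..end].join("\n"));
        if end == lines.len() {
            break; // the last window already reached the end of the input
        }
        start += step;
    }
    chunks
}
```

With a window of 3 lines and an overlap of 1, a five-line input yields two windows that share the middle line. A real implementation might measure windows in tokens or bytes rather than lines.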
Structs§
- ArchivedCodeChunk - An archived CodeChunk
- ChunkConfig - Runtime configuration for the chunking pipeline.
- CodeChunk - A semantic chunk extracted from a source file.
- CodeChunkResolver - The resolver for an archived CodeChunk
Functions§
- build_scope_chain - Walk up the AST parent chain collecting structural container names.
- chunk_file - Extract semantic chunks from a source file.
- chunk_text - Split source text into overlapping sliding windows.
- extract_signature - Extract the function/method signature from a definition node.
- minify_whitespace - Reduce indentation waste for embedding by normalizing whitespace.
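
As an illustration of the kind of whitespace normalization that minify_whitespace describes, the sketch below strips the indentation common to all lines, trims trailing whitespace, and collapses runs of blank lines. The exact rules of the real function are not documented here, so `dedent_and_trim` and its behavior are assumptions for the example only.

```rust
/// Hypothetical stand-in (not this crate's `minify_whitespace`): remove the
/// indentation shared by all non-blank lines, trim trailing whitespace, and
/// collapse consecutive blank lines into one.
pub fn dedent_and_trim(source: &str) -> String {
    // Smallest count of leading whitespace characters across non-blank lines.
    let common = source
        .lines()
        .filter(|line| !line.trim().is_empty())
        .map(|line| line.chars().take_while(|c| c.is_whitespace()).count())
        .min()
        .unwrap_or(0);

    let mut out: Vec<String> = Vec::new();
    let mut prev_blank = false;

    for line in source.lines() {
        if line.trim().is_empty() {
            // Keep at most one blank line in a row, and none at the start.
            if !prev_blank && !out.is_empty() {
                out.push(String::new());
            }
            prev_blank = true;
            continue;
        }
        prev_blank = false;
        let dedented: String = line.chars().skip(common).collect();
        out.push(dedented.trim_end().to_string());
    }

    // Drop a trailing blank line left over from the end of the input.
    while out.last().map_or(false, |line| line.is_empty()) {
        out.pop();
    }
    out.join("\n")
}
```

Relative indentation inside the chunk is preserved, so nested blocks stay readable while the shared prefix that carries no information for an embedding model is dropped.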