Module chunk

Tree-sitter based code chunking with sliding-window fallback.

Parses source files into ASTs and extracts semantic chunks at function, class, and method boundaries. Files with no recognized semantic structure, and fallback chunks that are too large, are split into overlapping sliding windows so that embedding inputs stay uniform in size.
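
The sliding-window fallback described above can be illustrated with a minimal sketch. It is not the crate's `chunk_text`: the function name `sliding_windows`, its parameters, and the choice to measure windows in lines (rather than bytes or tokens) are all assumptions made for illustration.

```rust
/// Hypothetical helper (not the crate's `chunk_text`): split `text` into
/// windows of at most `window` lines, where each window repeats the last
/// `overlap` lines of the previous one.
pub fn sliding_windows(text: &str, window: usize, overlap: usize) -> Vec<String> {
    assert!(window > 0, "window must be non-zero");
    assert!(overlap < window, "overlap must be smaller than window");

    let lines: Vec<&str> = text.lines().collect();
    let step = window - overlap; // how far each new window advances
    let mut out = Vec::new();
    let mut start = 0;

    while start < lines.len() {
        let end = (start + window).min(lines.len());
        out.push(lines[start..end].join("\n"));
        if end == lines.len() {
            break; // this window already reached the end of the input
        }
        start += step;
    }
    out
}
```

With `window = 3` and `overlap = 1`, a five-line input yields two windows that share the third line. The overlap keeps context that straddles a window boundary visible in both neighbours, at the cost of embedding some lines twice.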

Structs§

ArchivedCodeChunk
An archived CodeChunk
ChunkConfig
Runtime configuration for the chunking pipeline.
CodeChunk
A semantic chunk extracted from a source file.
CodeChunkResolver
The resolver for an archived CodeChunk

Functions§

build_scope_chain
Walk up the AST parent chain collecting structural container names.
chunk_file
Extract semantic chunks from a source file.
chunk_rdf_text
Chunk Turtle/N-Triples/TriG/N-Quads style RDF by statement blocks.
chunk_source_for_path
Chunk a source file according to its path extension.
chunk_text
Split source text into overlapping sliding windows.
extract_signature
Extract the function/method signature from a definition node.
is_rdf_text_extension
Return true for RDF-family text formats without a stable Rust tree-sitter grammar.
minify_whitespace
Reduce indentation waste for embedding by normalizing whitespace.
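
The function list suggests that `chunk_source_for_path` chooses a chunking strategy from the file extension, with RDF-family text formats handled separately from tree-sitter languages. The sketch below shows one way such a dispatch could look. Everything in it is hypothetical: `looks_like_rdf_text`, `Strategy`, and `pick_strategy` are illustrative names, the RDF list uses the conventional extensions for Turtle, N-Triples, TriG, and N-Quads, and the set of tree-sitter languages is an arbitrary placeholder. The crate's real signatures and extension lists are not shown on this page.

```rust
use std::path::Path;

/// Hypothetical stand-in for `is_rdf_text_extension`: the conventional
/// extensions for Turtle, N-Triples, TriG, and N-Quads, matched
/// case-insensitively. The crate's actual list may differ.
pub fn looks_like_rdf_text(ext: &str) -> bool {
    matches!(
        ext.to_ascii_lowercase().as_str(),
        "ttl" | "nt" | "trig" | "nq"
    )
}

/// Hypothetical strategy tag; the crate returns chunks, not a tag.
#[derive(Debug, PartialEq, Eq)]
pub enum Strategy {
    RdfStatements,
    TreeSitter,
    SlidingWindow,
}

/// Pick a chunking strategy from the path extension alone.
pub fn pick_strategy(path: &Path) -> Strategy {
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) if looks_like_rdf_text(ext) => Strategy::RdfStatements,
        // Placeholder set of languages assumed to have a grammar available.
        Some("rs") | Some("py") | Some("go") => Strategy::TreeSitter,
        // Unknown or missing extension: fall back to sliding windows.
        _ => Strategy::SlidingWindow,
    }
}
```

Checking the RDF case first keeps those formats away from the tree-sitter path, and the final arm makes the sliding-window fallback the default for anything unrecognized.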