Module chunks


Context window construction from the structural index.

Each chunk is a rich text window centered on a symbol:

  • symbol name + signature
  • doc comment (tree-sitter extracted)
  • parent module/crate path
  • callers (top N by frequency)
  • callees
  • co-change neighbors
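
As a rough illustration of how such a window could be assembled, here is a minimal sketch. `SymbolContext`, its fields, the line labels, and `sketch_symbol_chunk` are hypothetical names invented for this example; the actual `SymbolRow` fields and the `build_symbol_chunk` signature are not shown on this page and may differ.

```rust
/// Hypothetical per-symbol inputs (not the real `SymbolRow`).
struct SymbolContext {
    name: String,
    signature: String,
    doc_comment: Option<String>,
    module_path: String,
    /// (caller name, call count) pairs, in no particular order.
    callers: Vec<(String, usize)>,
    callees: Vec<String>,
    co_change: Vec<String>,
}

/// Assemble one text window: name + signature first, then each context line.
fn sketch_symbol_chunk(ctx: &SymbolContext, top_n: usize) -> String {
    let mut out = format!("{} {}\n", ctx.name, ctx.signature);
    if let Some(doc) = &ctx.doc_comment {
        out.push_str(&format!("doc: {}\n", doc));
    }
    out.push_str(&format!("path: {}\n", ctx.module_path));

    // Keep only the top N callers by frequency; ties are broken by name
    // so the output is deterministic.
    let mut callers = ctx.callers.clone();
    callers.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    let top: Vec<&str> = callers
        .iter()
        .take(top_n)
        .map(|(name, _)| name.as_str())
        .collect();
    if !top.is_empty() {
        out.push_str(&format!("callers: {}\n", top.join(", ")));
    }
    if !ctx.callees.is_empty() {
        out.push_str(&format!("callees: {}\n", ctx.callees.join(", ")));
    }
    if !ctx.co_change.is_empty() {
        out.push_str(&format!("co-change: {}\n", ctx.co_change.join(", ")));
    }
    out
}
```

Empty sections are skipped in this sketch so that they add no noise to the embedded text; whether the real implementation does the same is an assumption.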

Additional source types:

  • doc: Markdown files chunked by heading section. Each section becomes one chunk with a breadcrumb of parent headings prepended for context.
  • commit: Git commit messages (subject + body), keyed by commit hash.
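
The two additional source types could be shaped roughly as follows. This is a sketch under stated assumptions: `SketchChunk`, both function names, the `" > "` breadcrumb separator, and the `path#breadcrumb` key for doc chunks are all hypothetical. Only the commit-hash key and the "breadcrumb prepended to the section" behavior come from the description above.

```rust
/// Hypothetical chunk shape: a lookup key plus the text to embed.
#[derive(Debug, PartialEq)]
struct SketchChunk {
    key: String,
    text: String,
}

/// One markdown section: parent headings are prepended as a breadcrumb.
/// The key format here is an assumption, not taken from the crate.
fn sketch_markdown_chunk(path: &str, breadcrumb: &[&str], body: &str) -> SketchChunk {
    let trail = breadcrumb.join(" > ");
    let body = body.trim();
    let text = if trail.is_empty() {
        body.to_string()
    } else {
        format!("{}\n\n{}", trail, body)
    };
    SketchChunk {
        key: format!("{}#{}", path, trail),
        text,
    }
}

/// One commit: subject, then body if present, keyed by the commit hash.
fn sketch_commit_chunk(hash: &str, subject: &str, body: &str) -> SketchChunk {
    let subject = subject.trim();
    let body = body.trim();
    let text = if body.is_empty() {
        subject.to_string()
    } else {
        format!("{}\n\n{}", subject, body)
    };
    SketchChunk {
        key: hash.to_string(),
        text,
    }
}
```

For example, a section "Install" nested under "Guide" would yield text that starts with `Guide > Install`, so the embedding sees where the section sits in the document.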

The quality of the embeddings is directly downstream of the quality of the index: better extraction -> better context windows -> better embeddings.

Structs

Chunk
A chunk ready for embedding. Each chunk corresponds to one row in the embeddings table.
SymbolRow
A row from the symbols table with enough data to build a chunk.

Functions

build_commit_chunk
Build a chunk for a git commit message.
build_markdown_chunk
Build a chunk for a single markdown heading section.
build_symbol_chunk
Build the chunk text for a symbol given its context.
split_markdown_sections
Parse a markdown file into (heading_breadcrumb, body) section pairs.
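
To make the (heading_breadcrumb, body) shape concrete, here is a minimal sketch of one way to split a file by heading. It is not the crate's implementation: `sketch_split_sections` and its helpers are hypothetical, only ATX (`#`) headings are recognized, the `" > "` separator is an assumption, and sections with an empty body are dropped.

```rust
/// Split markdown into (breadcrumb, body) pairs, one per heading section.
fn sketch_split_sections(markdown: &str) -> Vec<(String, String)> {
    let mut sections = Vec::new();
    let mut stack: Vec<(usize, String)> = Vec::new(); // open headings: (level, title)
    let mut body = String::new();
    let mut in_fence = false;

    for line in markdown.lines() {
        // Track fenced code blocks so a `#` comment inside code is not a heading.
        if line.trim_start().starts_with("```") {
            in_fence = !in_fence;
        }
        let heading = if in_fence { None } else { parse_atx_heading(line) };
        match heading {
            Some((level, title)) => {
                flush(&stack, &mut body, &mut sections);
                // Close headings at the same or a deeper level before opening this one.
                while stack.last().map_or(false, |(l, _)| *l >= level) {
                    stack.pop();
                }
                stack.push((level, title));
            }
            None => {
                body.push_str(line);
                body.push('\n');
            }
        }
    }
    flush(&stack, &mut body, &mut sections);
    sections
}

/// Recognize `# Title` through `###### Title`; anything else is body text.
fn parse_atx_heading(line: &str) -> Option<(usize, String)> {
    let level = line.chars().take_while(|c| *c == '#').count();
    if level == 0 || level > 6 {
        return None;
    }
    let rest = &line[level..];
    if !rest.starts_with(' ') {
        return None;
    }
    Some((level, rest.trim().to_string()))
}

/// Emit the accumulated body under the current heading stack, then reset it.
fn flush(stack: &[(usize, String)], body: &mut String, out: &mut Vec<(String, String)>) {
    let text = body.trim().to_string();
    body.clear();
    if text.is_empty() {
        return;
    }
    let breadcrumb = stack
        .iter()
        .map(|(_, title)| title.as_str())
        .collect::<Vec<_>>()
        .join(" > ");
    out.push((breadcrumb, text));
}
```

Text that appears before the first heading gets an empty breadcrumb in this sketch. A heading stack keeps the breadcrumb correct when the document returns from a deep subsection to a shallower one.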