Module content_chunk


Universal content chunk — the atomic unit of the Context Cortex.

Extends the existing CodeChunk (BM25) with a source dimension so that external data (GitHub issues, Jira tickets, DB schemas, wiki pages) flows through the same pipeline as code: BM25, embeddings, graph, knowledge.

Design principles:

  • Backward-compatible: From<ContentChunk> for CodeChunk preserves the existing BM25 pipeline without changes.
  • Source-aware: ContentSource tags where data came from.
  • Reference-carrying: the references field links chunks to code files for cross-source graph edges.
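
The three principles above can be illustrated with a minimal sketch. This page does not show the actual field layout, so everything below is an assumption made for illustration: the CodeChunk fields, the ContentSource variant names (taken from the data sources listed earlier), and the ContentChunk fields other than references. Only the type names and the From<ContentChunk> for CodeChunk conversion come from the documentation itself.

```rust
/// Hypothetical stand-in for the existing BM25 chunk type.
/// The real CodeChunk fields are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct CodeChunk {
    pub file_path: String,
    pub content: String,
}

/// Where a content chunk originated.
/// Variant names are assumptions based on the sources the module docs mention.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentSource {
    Code,
    GithubIssue,
    JiraTicket,
    DbSchema,
    WikiPage,
}

/// Sketch of a universal chunk: the same payload as a code chunk,
/// plus a source tag and a list of referenced code files.
#[derive(Debug, Clone, PartialEq)]
pub struct ContentChunk {
    /// Source-aware: tags where the data came from.
    pub source: ContentSource,
    /// Hypothetical identifier (a file path for code, an issue key otherwise).
    pub id: String,
    pub content: String,
    /// Reference-carrying: code files this chunk mentions.
    pub references: Vec<String>,
}

/// Backward-compatible: any ContentChunk can be handed to code that
/// only understands CodeChunk. The source tag and references are dropped.
impl From<ContentChunk> for CodeChunk {
    fn from(chunk: ContentChunk) -> Self {
        CodeChunk {
            file_path: chunk.id,
            content: chunk.content,
        }
    }
}
```

Under these assumptions, a chunk built from an issue converts with `let code: CodeChunk = chunk.into();`, so a consumer written against CodeChunk needs no changes, while graph-building code can still read `references` from the original ContentChunk before conversion.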

Scientific basis: Neocortical column architecture (Mountcastle) — every data source is a “column” processing different input through the same computational template.

Structs§

ContentChunk
A universal content chunk that can represent code, issues, DB schemas, wiki pages, or any other data source.

Enums§

ContentSource
Where a content chunk originated.

Functions§

extract_file_references
Extract file path references from freeform text (issue bodies, PR descriptions). Looks for patterns like src/auth.rs, lib/handler.ts, path/to/file.ext.
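
As a rough illustration of this kind of extraction, the sketch below scans whitespace-separated tokens for path-like strings using only the standard library. It is not the crate's implementation: the function name extract_file_references_sketch, its signature, and the matching rules (at least one `/`, a final segment with a short alphanumeric extension, URLs skipped, duplicates removed) are all assumptions.

```rust
/// Punctuation stripped from the start and end of each token.
/// A leading '.' is kept so that paths such as ./src/main.rs survive.
const LEADING: &str = "`'\"([{<";
const TRAILING: &str = "`'\")]}>,;:!?.";

/// Heuristic sketch: return path-like tokens (e.g. src/auth.rs) found in
/// freeform text, in order of first appearance and without duplicates.
pub fn extract_file_references_sketch(text: &str) -> Vec<String> {
    let mut found: Vec<String> = Vec::new();

    for raw in text.split_whitespace() {
        let token = raw
            .trim_start_matches(|c: char| LEADING.contains(c))
            .trim_end_matches(|c: char| TRAILING.contains(c));

        // Skip URLs and anything without a directory separator.
        if token.contains("://") || !token.contains('/') {
            continue;
        }
        // Allow only characters commonly seen in relative file paths.
        if !token
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '/' | '.' | '_' | '-'))
        {
            continue;
        }

        // The last path segment must look like name.ext.
        let file_name = token.rsplit('/').next().unwrap_or(token);
        let (stem, ext) = match file_name.rsplit_once('.') {
            Some(parts) => parts,
            None => continue,
        };
        let ext_ok = !ext.is_empty()
            && ext.len() <= 8
            && ext.chars().all(|c| c.is_ascii_alphanumeric())
            && ext.chars().next().map_or(false, |c| c.is_ascii_alphabetic());
        if stem.is_empty() || !ext_ok {
            continue;
        }

        if !found.iter().any(|f| f == token) {
            found.push(token.to_string());
        }
    }
    found
}
```

A token-based scan like this misses paths that contain spaces and can still produce false positives, so its output is best treated as candidate references to be checked against the files that actually exist in the repository.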