Universal content chunk — the atomic unit of the Context Cortex.
Extends the existing CodeChunk (BM25) with a source dimension so that
external data (GitHub issues, Jira tickets, DB schemas, wiki pages) flows
through the same pipeline as code: BM25, embeddings, graph, knowledge.
Design principles:
- Backward-compatible: From<ContentChunk> for CodeChunk preserves the existing BM25 pipeline without changes.
- Source-aware: ContentSource tags where data came from.
- Reference-carrying: references links chunks to code files for cross-source graph edges.
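The three principles can be pictured with a minimal sketch. Only the type names, the references field, and the From<ContentChunk> for CodeChunk conversion come from the description above; every other field name, the enum variants, and the mapping inside the conversion are assumptions made for illustration and may differ from the real definitions.

```rust
// Hypothetical shapes for illustration only. The real crate's fields and
// variants are not shown in this page and may differ.

/// Where a content chunk originated (variant names are assumed).
#[derive(Debug, Clone, PartialEq)]
pub enum ContentSource {
    Code,
    GitHubIssue,
    JiraTicket,
    DbSchema,
    WikiPage,
}

/// Stand-in for the existing BM25 chunk type (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct CodeChunk {
    pub file_path: String,
    pub content: String,
}

/// Universal chunk: the same payload plus a source tag and file references.
#[derive(Debug, Clone, PartialEq)]
pub struct ContentChunk {
    pub id: String,              // assumed identifier, e.g. a path or issue key
    pub content: String,         // text that BM25 and embeddings would index
    pub source: ContentSource,   // source-aware
    pub references: Vec<String>, // reference-carrying: related code file paths
}

/// Backward compatibility: anything expecting a CodeChunk can accept a
/// ContentChunk via `.into()`. The source tag and references are dropped
/// here because this sketch's CodeChunk has no place for them.
impl From<ContentChunk> for CodeChunk {
    fn from(chunk: ContentChunk) -> Self {
        CodeChunk {
            file_path: chunk.id,
            content: chunk.content,
        }
    }
}
```

Under these assumptions, a chunk built from an issue converts with `let code: CodeChunk = issue_chunk.into();`, so a BM25 indexer written against CodeChunk needs no changes.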
Scientific basis: Neocortical column architecture (Mountcastle) — every data source is a “column” processing different input through the same computational template.
Structs§
- ContentChunk - A universal content chunk that can represent code, issues, DB schemas, wiki pages, or any other data source.
Enums§
- ContentSource - Where a content chunk originated.
Functions§
- extract_file_references - Extract file path references from freeform text (issue bodies, PR descriptions). Looks for patterns like src/auth.rs, lib/handler.ts, path/to/file.ext.
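As a rough illustration of this kind of extraction, here is a standard-library-only sketch. The signature (a &str in, a Vec<String> out), the helper looks_like_path, and the matching rules are all assumptions; the real function may use a regex or different heuristics.

```rust
/// Hypothetical signature: the real function's parameters and return type
/// are not shown in this page.
pub fn extract_file_references(text: &str) -> Vec<String> {
    let mut found: Vec<String> = Vec::new();
    // Split on whitespace and commas so "a/b.rs,c/d.ts" yields two tokens.
    for token in text.split(|c: char| c.is_whitespace() || c == ',') {
        // Strip wrapping punctuation such as backticks, quotes, parentheses.
        let candidate = token.trim_matches(|c: char| {
            !(c.is_ascii_alphanumeric() || matches!(c, '_' | '/' | '.' | '-'))
        });
        // Drop a sentence-ending period: "see lib/handler.ts."
        let candidate = candidate.trim_end_matches('.');
        // Keep first occurrences only, in order of appearance.
        if looks_like_path(candidate) && !found.iter().any(|p| p.as_str() == candidate) {
            found.push(candidate.to_string());
        }
    }
    found
}

/// Assumed heuristic: path-safe characters only, at least one directory
/// separator, and a final segment of the form `name.ext`.
fn looks_like_path(s: &str) -> bool {
    let path_safe = s
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '/' | '.' | '-'));
    if !path_safe {
        return false; // rejects URLs (contain ':') and other noise
    }
    let (dir, file) = match s.rsplit_once('/') {
        Some(parts) => parts,
        None => return false, // bare words and version numbers like "1.2"
    };
    if dir.is_empty() {
        return false;
    }
    match file.rsplit_once('.') {
        Some((stem, ext)) => {
            !stem.is_empty()
                && !ext.is_empty()
                && ext.chars().all(|c| c.is_ascii_alphanumeric())
        }
        None => false, // "and/or" has a slash but no extension
    }
}
```

Requiring both a separator and an extension trades recall for precision: top-level files such as Cargo.toml would be missed by this sketch, but prose like "and/or" or "v1.2" is not mistaken for a path.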