Crate normalize_semantic

Semantic retrieval layer for normalize.

This crate provides vector embeddings over structurally-derived chunks (symbols + doc comments + caller/callee context + co-change neighbors), stored in SQLite alongside the structural index, queryable by meaning rather than by name.
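As a rough illustration of what a "structurally-derived chunk" could look like, the sketch below joins a symbol name, its doc comment, caller/callee names, and co-change neighbors into one text block ready for embedding. The `SymbolContext` struct, its fields, the `build_chunk` function, and the line-per-part layout are all hypothetical and only for illustration; the real construction lives in the chunks module and may differ.

```rust
/// Hypothetical inputs for one symbol. Field names are illustrative only
/// and are not taken from the crate.
pub struct SymbolContext<'a> {
    pub name: &'a str,
    pub doc_comment: Option<&'a str>,
    pub callers: &'a [&'a str],
    pub callees: &'a [&'a str],
    pub co_changed: &'a [&'a str],
}

/// Join the non-empty parts into a single text chunk, one part per line.
pub fn build_chunk(ctx: &SymbolContext) -> String {
    let mut parts = vec![format!("symbol: {}", ctx.name)];

    // Include the doc comment only when it has visible content.
    if let Some(doc) = ctx.doc_comment {
        let doc = doc.trim();
        if !doc.is_empty() {
            parts.push(format!("doc: {doc}"));
        }
    }

    // Append each context list that is non-empty, in a fixed order.
    for (label, items) in [
        ("callers", ctx.callers),
        ("callees", ctx.callees),
        ("co-changed", ctx.co_changed),
    ] {
        if !items.is_empty() {
            parts.push(format!("{label}: {}", items.join(", ")));
        }
    }

    parts.join("\n")
}
```

Keeping the context in the chunk text, rather than in separate columns, means a single embedding captures both what a symbol is and how it is used.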

§Architecture

  • config – EmbeddingsConfig ([embeddings] section of config.toml)
  • chunks – context window construction from index rows
  • embedder – fastembed wrapper (ONNX, no server required)
  • schema – SQLite DDL for the embeddings table
  • store – read/write embeddings to/from SQLite
  • search – ANN search + staleness re-ranking
  • populate – walk the structural index and embed symbols, docs, and commits
  • service – CLI service (normalize structure search); requires the cli feature

§Usage

After a structure rebuild, call populate::populate_embeddings with the active FileIndex connection to generate and store embeddings.

For markdown and commit embeddings, call populate::populate_markdown_docs and populate::populate_commit_messages respectively. For .normalize/context/ block embeddings, call populate::populate_context_blocks.

To search, call service::run_search (all source types) or service::run_context_search (context blocks only), both of which require the cli feature. Alternatively, use store::load_all_embeddings together with search::rerank directly.
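To make the idea of staleness re-ranking concrete, here is a minimal, self-contained sketch. It does not use this crate's API: the `Candidate` struct, the `rank_with_staleness` function, the `penalty` parameter, and the scoring formula (cosine similarity scaled down by a staleness value in [0, 1]) are all assumptions made for illustration. It also scans every candidate by brute force rather than using the sqlite-vec ANN index.

```rust
/// Hypothetical stored row: an embedding plus a staleness score in [0, 1],
/// where 0 means fresh and 1 means fully stale.
pub struct Candidate {
    pub id: u32,
    pub vector: Vec<f32>,
    pub staleness: f32,
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Score every candidate, down-weight stale ones, and return the top `k`
/// as (id, score) pairs in descending score order.
pub fn rank_with_staleness(
    query: &[f32],
    candidates: &[Candidate],
    penalty: f32, // assumed knob: 0.0 ignores staleness entirely
    k: usize,
) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = candidates
        .iter()
        .map(|c| {
            let similarity = cosine(query, &c.vector);
            (c.id, similarity * (1.0 - penalty * c.staleness))
        })
        .collect();

    // total_cmp gives a total order on f32, so NaN cannot panic the sort.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

With `penalty` set to 0.0 the ordering is pure similarity; raising it lets a fresher but slightly less similar chunk overtake a stale exact match.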

Re-exports§

pub use config::EmbeddingsConfig;
pub use populate::DEFAULT_MAX_COMMITS;
pub use populate::PopulateStats;
pub use populate::populate_commit_messages;
pub use populate::populate_context_blocks;
pub use populate::populate_embeddings;
pub use populate::populate_incremental_for_paths;
pub use populate::populate_markdown_docs;
pub use search::SearchHit;

Modules§

chunks
Context window construction from the structural index.
config
Configuration for the semantic embeddings subsystem.
embedder
Embedding generation via fastembed (ONNX-backed, no server required).
git_staleness
Git-based staleness computation for symbol embeddings.
populate
Embedding population: walk symbols from the structural index and embed them.
schema
SQLite schema for the embeddings table and sqlite-vec ANN index.
search
ANN search with staleness-based re-ranking.
store
Embedding storage: read/write embeddings from/to the structural index SQLite database.
vec_ext
sqlite-vec extension registration.

Functions§

ensure_schema
Ensure the embeddings schema exists in the given connection. Safe to call multiple times (all DDL uses IF NOT EXISTS).
open_index
Open the index and return a reference to its SQLite connection. Convenience helper used by the populate and service modules.