Module embeddings

Embedding engine for semantic code search.

Provides dense vector embeddings for code chunks using a local ONNX model (all-MiniLM-L6-v2). Gated behind the embeddings feature; search falls back gracefully to BM25-only when the feature is disabled or the model is not available.

Architecture: WordPieceTokenizer → ONNX Model (rten) → Mean Pooling → L2 Normalize → Vec<f32>
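
The last two pipeline stages can be illustrated with a minimal sketch. It assumes the model emits one hidden-state row per token and that padding is marked by an attention mask of 0s and 1s; the function names mean_pool_sketch and l2_normalize_sketch are hypothetical and are not this module's API.

```rust
/// Average the hidden states of real tokens (mask == 1), ignoring padding.
/// Hypothetical helper for illustration; not the crate's pooling API.
fn mean_pool_sketch(hidden: &[Vec<f32>], attention_mask: &[u32]) -> Vec<f32> {
    let dim = hidden.first().map_or(0, |row| row.len());
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (row, &mask) in hidden.iter().zip(attention_mask) {
        if mask == 1 {
            for (acc, &x) in sum.iter_mut().zip(row) {
                *acc += x;
            }
            count += 1.0;
        }
    }
    // Guard against an all-padding input to avoid dividing by zero.
    let denom = count.max(1.0);
    sum.iter().map(|s| s / denom).collect()
}

/// Scale a vector in place to unit Euclidean length (no-op for the zero vector).
fn l2_normalize_sketch(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    // Three token rows of dimension 2; the third row is padding.
    let hidden: Vec<Vec<f32>> = vec![vec![1.0, 2.0], vec![3.0, 4.0], vec![100.0, 100.0]];
    let mask: [u32; 3] = [1, 1, 0];

    let mut embedding = mean_pool_sketch(&hidden, &mask);
    l2_normalize_sketch(&mut embedding);
    println!("{:?}", embedding);
}
```

Masking before averaging keeps padding tokens from skewing the result, and normalizing the output lets downstream code compare embeddings with a plain dot product.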

Modules§

download: Automatic model download from HuggingFace Hub.
pooling: Pooling strategies for transformer hidden states.
tokenizer: Minimal WordPiece tokenizer for BERT-style embedding models.

Structs§

EmbeddingEngine

Functions§

cosine_similarity: Compute cosine similarity between two L2-normalized vectors. Both vectors must be pre-normalized for correct results.
cosine_similarity_raw: Compute cosine similarity without requiring pre-normalization.
shared_engine: Global singleton embedding engine. Loaded once, shared across all consumers. Returns None if the embeddings feature is disabled or the model fails to load. NOTE: This function BLOCKS on first call while loading the ONNX model (~25MB). For non-blocking access, use try_shared_engine() instead.
try_shared_engine: Non-blocking variant: returns the engine ONLY if already loaded. Never triggers model loading or download. Safe to call on hot paths.
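
The function descriptions above suggest two ideas that a short sketch may make concrete: cosine similarity reduces to a dot product once vectors are L2-normalized, and a lazily initialized singleton can offer both a blocking and a non-blocking accessor. Everything below is an assumption for illustration: EngineSketch, load_engine_sketch, the accessor names, and the use of std::sync::OnceLock are hypothetical and may differ from how this module is actually implemented.

```rust
use std::sync::OnceLock;

/// For pre-normalized vectors, cosine similarity is just the dot product.
fn cosine_normalized_sketch(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// For arbitrary vectors, divide the dot product by both norms.
fn cosine_raw_sketch(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Stand-in for the real engine; only records the embedding dimension.
struct EngineSketch {
    dim: usize,
}

// Holds Some(engine) after a successful load, or None after a failed one.
static ENGINE: OnceLock<Option<EngineSketch>> = OnceLock::new();

/// Placeholder for the expensive model load (all-MiniLM-L6-v2 emits 384 dims).
fn load_engine_sketch() -> Option<EngineSketch> {
    Some(EngineSketch { dim: 384 })
}

/// Blocking accessor: the first caller pays for the load, later callers reuse it.
fn shared_engine_sketch() -> Option<&'static EngineSketch> {
    ENGINE.get_or_init(load_engine_sketch).as_ref()
}

/// Non-blocking accessor: only peeks, never triggers the load.
fn try_shared_engine_sketch() -> Option<&'static EngineSketch> {
    ENGINE.get().and_then(|slot| slot.as_ref())
}

fn main() {
    // Before any load, the non-blocking accessor reports nothing available,
    // which is where a caller would fall back to BM25-only search.
    println!("loaded before init: {}", try_shared_engine_sketch().is_some());

    if let Some(engine) = shared_engine_sketch() {
        println!("engine dimension: {}", engine.dim);
    }
    println!("loaded after init: {}", try_shared_engine_sketch().is_some());

    let a = [0.6f32, 0.8];
    let b = [0.8f32, 0.6];
    println!("normalized: {}", cosine_normalized_sketch(&a, &b));
    println!("raw: {}", cosine_raw_sketch(&[3.0, 4.0], &[4.0, 3.0]));
}
```

Splitting the accessors this way lets latency-sensitive callers use the peek-only variant and treat None as a signal to skip semantic scoring, while background or startup code can call the blocking variant to warm the engine.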
