Embedding engine for semantic code search.
Provides dense vector embeddings for code chunks using a local ONNX model.
Supports multiple models through the EmbeddingModel registry. The model is selected with the
LEAN_CTX_EMBEDDING_MODEL env var (default: all-MiniLM-L6-v2).
Feature-gated under embeddings — falls back gracefully to BM25-only
search when the feature or model is not available.
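As an illustration of the env-var selection described above, here is a minimal sketch of resolving a model name with a default. The helper names `resolve_model_name` and `model_name_from_env` are hypothetical and not part of this crate, and treating a blank value as unset is an assumption; the real lookup goes through the model registry and may behave differently.

```rust
use std::env;

/// Default model name stated in the module description.
const DEFAULT_MODEL: &str = "all-MiniLM-L6-v2";

/// Hypothetical helper: choose a model name from an optional env value.
/// Assumption: an unset or blank value falls back to the default.
fn resolve_model_name(env_value: Option<&str>) -> String {
    match env_value.map(str::trim) {
        Some(name) if !name.is_empty() => name.to_string(),
        _ => DEFAULT_MODEL.to_string(),
    }
}

/// Hypothetical helper: read LEAN_CTX_EMBEDDING_MODEL and apply the fallback.
fn model_name_from_env() -> String {
    resolve_model_name(env::var("LEAN_CTX_EMBEDDING_MODEL").ok().as_deref())
}
```

Keeping the decision in a function that takes the value as an argument means the fallback can be exercised without mutating the process environment.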
Architecture:
Tokenizer → ONNX Model (rten) → Mean Pooling → L2 Normalize → Vec<f32>
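The last two stages of the pipeline above can be sketched as follows. This is a simplified illustration, not the crate's pooling module: it assumes the model output has already been copied into a `[tokens][dim]` layout, and the function names and the `u8` attention mask are choices made for this example.

```rust
/// Mean-pool token vectors, skipping positions whose mask entry is 0
/// (padding). `hidden` is laid out as [tokens][dim].
fn mean_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Vec<f32> {
    let dim = hidden.first().map_or(0, |row| row.len());
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (row, &m) in hidden.iter().zip(mask) {
        if m == 0 {
            continue; // padding token: does not contribute to the mean
        }
        for (acc, &x) in sum.iter_mut().zip(row) {
            *acc += x;
        }
        count += 1.0;
    }
    if count > 0.0 {
        for acc in &mut sum {
            *acc /= count;
        }
    }
    sum
}

/// Scale a vector to unit L2 norm in place. A zero vector is left unchanged.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}
```

Normalizing once at embedding time is what lets similarity between stored vectors be computed later as a plain dot product.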
Modules
- download
- Automatic model download from HuggingFace Hub.
- model_registry
- Embedding model registry: model configs, selection, and metadata.
- pooling
- Pooling strategies for transformer hidden states.
- tokenizer
- Minimal WordPiece tokenizer for BERT-style embedding models.
Structs
Functions
- cosine_similarity
- Compute cosine similarity between two L2-normalized vectors. Both vectors must be pre-normalized for correct results.
- cosine_similarity_raw
- Compute cosine similarity without requiring pre-normalization.
- shared_engine
- Global singleton embedding engine. Loaded once, shared across all consumers.
Returns None if the embeddings feature is disabled or the model fails to load.
NOTE: This function BLOCKS on first call while loading the ONNX model.
For non-blocking access, use try_shared_engine() instead.
- try_shared_engine
- Non-blocking variant: returns the engine ONLY if already loaded. Never triggers model loading or download. Safe to call on hot paths.
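The contracts described in the function list can be illustrated with a small, self-contained sketch. Everything here is hypothetical: the `_sketch` names, the stand-in `Engine` type, and `load_engine` are placeholders, and the crate's real signatures and loading logic may differ. The sketch shows why the normalized variant reduces to a dot product, and how a blocking accessor and a non-blocking accessor can share one lazily initialized value using `std::sync::OnceLock`.

```rust
use std::sync::OnceLock;

/// For unit-length inputs, cosine similarity is just the dot product.
fn cosine_similarity_sketch(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// For arbitrary inputs, divide the dot product by both norms.
/// Assumption: a zero-length vector yields 0.0 rather than NaN.
fn cosine_similarity_raw_sketch(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical stand-in for the real embedding engine.
struct Engine {
    model: &'static str,
}

/// Holds the load result; None inside means loading failed.
static ENGINE: OnceLock<Option<Engine>> = OnceLock::new();

/// Placeholder for the expensive model load (always succeeds here).
fn load_engine() -> Option<Engine> {
    Some(Engine { model: "all-MiniLM-L6-v2" })
}

/// Blocking accessor: the first caller runs the load, later callers reuse it.
fn shared_engine_sketch() -> Option<&'static Engine> {
    ENGINE.get_or_init(load_engine).as_ref()
}

/// Non-blocking accessor: never triggers a load, only observes the result.
fn try_shared_engine_sketch() -> Option<&'static Engine> {
    ENGINE.get().and_then(|slot| slot.as_ref())
}
```

With this shape, a hot path can call the non-blocking accessor and fall back to BM25-only results when it returns None, while a background task calls the blocking accessor once to warm the engine.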