Module ann_cache

Expand description

Process-wide cache for the HNSW AnnIndex used by dense semantic search.

Building an HNSW graph is O(n log n) with a wide construction beam, so doing it per query would be slower than brute force. This cache keeps one built index keyed by a content fingerprint of the embedding set: repeated queries over the same corpus reuse the graph and get sub-linear search, while a changed corpus (different fingerprint) transparently triggers a rebuild.

It is threshold-gated — corpora below ANN_MIN_VECTORS skip the cache and use exact SIMD brute-force top-k, which is both faster (no graph overhead) and exact. The threshold is deliberately high: at lean-ctx’s typical scale (a few thousand chunks) exact brute force over int8/SIMD dot products is only ~1-2 ms, so HNSW’s approximate recall is not worth trading. HNSW activates only for genuinely large corpora where exact scan would dominate latency. On any lock failure it falls back to brute force, so correctness never depends on the cache being available.

Constants§

ANN_MIN_VECTORS: Minimum corpus size before an HNSW graph is worth building and caching. Below this, exact SIMD brute force is faster and exact (no recall loss).

Functions§

topk: Returns the top-k (index, similarity) pairs for query over embeddings, sorted by descending similarity.

Module ann_cache

Module ann_cache Copy item path

Constants§

Functions§

Module ann_cache