
Module similarity


In-process vector similarity helpers, introduced in v1.0.76: cosine similarity and ranking helpers that replace the sqlite-vec KNN API with pure-Rust cosine over the BLOB-backed memory_embeddings / entity_embeddings tables.

As of v1.0.76 the sqlite-vec extension has been removed. Vector similarity is computed in pure Rust on the BLOB embeddings stored in memory_embeddings, entity_embeddings, and chunk_embeddings. Each call costs O(N × D), where N is the number of rows in the embedding table and D is the embedding dimensionality (default 384). This is acceptable at the tens-of-thousands scale that the GraphRAG memory is designed for; operators with million-scale corpora should partition by namespace and use FTS5 for coarse filtering before reaching these helpers.
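
To make the O(N × D) cost concrete, here is a minimal sketch of a brute-force scan over BLOB rows. It is not the crate's implementation: `decode_embedding`, `cosine`, and `scan` are hypothetical names, and the BLOB layout (packed little-endian `f32` values) is an assumption that this page does not confirm.

```rust
/// Hypothetical helper: decode a BLOB into f32 components.
/// ASSUMPTION: the BLOB is a packed array of little-endian f32 values.
fn decode_embedding(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None; // not a whole number of f32 values
    }
    Some(
        blob.chunks_exact(4)
            .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect(),
    )
}

/// Local stand-in for a cosine helper; returns 0.0 when either norm is zero.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b.iter()) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na.sqrt() * nb.sqrt())
    }
}

/// Linear scan: one D-length dot product per row, so N rows cost O(N × D).
/// Rows that fail to decode or have the wrong dimensionality are skipped.
fn scan(query: &[f32], rows: &[Vec<u8>]) -> Vec<(usize, f32)> {
    rows.iter()
        .enumerate()
        .filter_map(|(i, blob)| {
            let v = decode_embedding(blob)?;
            if v.len() != query.len() {
                return None;
            }
            Some((i, cosine(query, &v)))
        })
        .collect()
}
```

Because every row is visited on every call, shrinking N up front (namespace partitioning, FTS5 pre-filtering) is the main lever for keeping latency down.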

Functions§

cosine_similarity
Cosine similarity in the range [-1.0, 1.0]. Returns 0.0 when either vector has zero norm. Inputs are NOT mutated.
similarity_to_distance
Converts a cosine similarity to a “distance” in [0.0, 2.0] so the result is compatible with the previous sqlite-vec KNN API. Existing recall / hybrid-search code that interprets distance as “lower is better” can keep doing so without code changes.
top_k_by_score
Returns the top-k (index, score) pairs sorted by score descending. Stable for ties. O(N log N) via a simple sort.
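
The signatures of these functions are not shown on this page, so the following is only a sketch of how helpers matching the descriptions above could look. The parameter types, the `1.0 - similarity` distance mapping, and the use of a full stable sort are all assumptions rather than the crate's confirmed behaviour.

```rust
/// Sketch: cosine similarity in [-1.0, 1.0]; 0.0 if either vector has zero norm.
/// Inputs are borrowed immutably, so they are never mutated.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b.iter()) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0.0 || nb == 0.0 {
        return 0.0;
    }
    // Clamp to absorb floating-point rounding just outside [-1, 1].
    (dot / (na.sqrt() * nb.sqrt())).clamp(-1.0, 1.0)
}

/// Sketch: map similarity to a "lower is better" distance in [0.0, 2.0].
/// ASSUMPTION: the mapping is 1.0 - similarity.
pub fn similarity_to_distance(similarity: f32) -> f32 {
    (1.0 - similarity).clamp(0.0, 2.0)
}

/// Sketch: top-k (index, score) pairs, highest score first.
/// `sort_by` is a stable sort, so equal scores keep their original index order.
pub fn top_k_by_score(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = scores.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending by score
    indexed.truncate(k);
    indexed
}
```

Under these assumptions, identical directions give a distance of 0.0 and opposite directions give 2.0, which preserves the "lower is better" ordering that recall and hybrid-search code expects.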