pub struct BruteForceVectorIndex { /* private fields */ }
A cosine-similarity brute-force vector index.
Stores L2-normalised f32 vectors in a flat row-major buffer for
cache-friendly linear scans. search(q, k) is O(n * dim) in time
and O(n) in allocations; a min-heap optimisation is not worth the
complexity at the corpus sizes this impl targets (see module docs).
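To make the scan concrete, here is a minimal sketch of a flat row-major cosine index. It is not this crate's implementation: `FlatIndex`, `l2_normalise`, and the `u64` ids (standing in for `NodeId`) are hypothetical names chosen for illustration. The sketch scores every row with one dot product, then sorts and truncates, which matches the O(n * dim) time and O(n) scratch allocation described above.

```rust
/// Hypothetical stand-in for the index: unit vectors stored back to back.
struct FlatIndex {
    dim: usize,
    ids: Vec<u64>,  // one id per row (stand-in for NodeId)
    data: Vec<f32>, // ids.len() * dim floats, row-major
}

/// Scale `v` to unit L2 length; returns None for a zero vector.
fn l2_normalise(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(v.iter().map(|x| x / norm).collect())
}

impl FlatIndex {
    fn new(dim: usize) -> Self {
        FlatIndex { dim, ids: Vec::new(), data: Vec::new() }
    }

    /// Normalise and append one row; rejects wrong-length or zero vectors.
    fn insert(&mut self, id: u64, v: &[f32]) -> bool {
        if v.len() != self.dim {
            return false;
        }
        match l2_normalise(v) {
            Some(unit) => {
                self.ids.push(id);
                self.data.extend_from_slice(&unit);
                true
            }
            None => false,
        }
    }

    /// Linear scan: one dot product per row, then sort descending and keep k.
    fn search(&self, q: &[f32], k: usize) -> Vec<(u64, f32)> {
        if self.dim == 0 || q.len() != self.dim {
            return Vec::new();
        }
        let Some(q) = l2_normalise(q) else {
            return Vec::new();
        };
        let mut scored: Vec<(u64, f32)> = self
            .ids
            .iter()
            .copied()
            .zip(self.data.chunks_exact(self.dim))
            .map(|(id, row)| {
                // Rows and query are unit length, so the dot product is the cosine.
                let score = row.iter().zip(&q).map(|(a, b)| a * b).sum::<f32>();
                (id, score)
            })
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }
}
```

Sorting the full score vector costs O(n log n) on top of the scan; a bounded heap would reduce that to O(n log k), which is the optimisation the description above judges unnecessary at the targeted corpus sizes.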
Implementations
impl BruteForceVectorIndex
pub fn empty(model: impl Into<String>, dim: u32) -> Self
Construct an empty index for (model, dim).
Agents that want to stream inserts rather than build from a repo
can start here. The repo-scan path (Self::build_from_repo)
is the common case.
pub fn model(&self) -> &str
Model identifier this index is bound to (e.g.
"openai:text-embedding-3-small"). Exposed so downstream
consumers (e.g. the KNN-edge derivation in mnem-http’s
GraphCache) can tag their derived artefacts with the same
model string the vectors were indexed under.
pub const fn dim(&self) -> u32
Dimensionality of every stored vector. 0 iff the index was
empty()-constructed and never inserted into.
pub fn points_iter(&self) -> impl Iterator<Item = (NodeId, &[f32])> + '_
Iterate (node_id, unit_vector_slice) pairs in build order
(canonical Prolly-key order at build time). The returned slice
is borrowed from the flat row-major buffer; every row is
already L2-normalised so cosine == dot product.
Used by mnem-http’s GraphCache KNN-edge fallback to derive
a deterministic KNN-edge substrate when the authored-edges
adjacency is empty (experiment E0 wire activation). Returning a
borrowed slice avoids the per-row to_vec() clone the HNSW
variant pays.
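As an illustration of how a consumer might use (id, unit-vector) pairs like these, the sketch below derives k-nearest-neighbour edges by brute force. It is an assumption-laden example, not the GraphCache code: `knn_edges` and `dot` are hypothetical helpers, `u64` stands in for `NodeId`, and ties are broken by id only to keep the output deterministic.

```rust
/// Dot product; equals cosine similarity when both inputs are unit length.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// For each point, emit up to `k` edges (src, dst, similarity) to its most
/// similar other points. Input order is preserved for the source nodes.
fn knn_edges(points: &[(u64, &[f32])], k: usize) -> Vec<(u64, u64, f32)> {
    let mut edges = Vec::new();
    for (i, (src, v)) in points.iter().enumerate() {
        let mut nbrs: Vec<(u64, f32)> = points
            .iter()
            .enumerate()
            .filter(|(j, _)| *j != i) // skip the self-edge
            .map(|(_, (dst, w))| (*dst, dot(v, w)))
            .collect();
        // Highest similarity first; equal scores fall back to ascending id.
        nbrs.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        edges.extend(nbrs.into_iter().take(k).map(|(dst, s)| (*src, dst, s)));
    }
    edges
}
```

Because the slices are borrowed, a loop like this never copies a row; it only reads from the shared buffer.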
pub fn try_insert(&mut self, node_id: NodeId, embed: &Embedding) -> bool
Insert one node’s embedding. The node’s embedding MUST match
self.model and self.dim; mismatched entries are silently
skipped so callers can feed a heterogeneous stream.
Returns true if the vector was indexed, false if it was
skipped (wrong model, wrong dim, absent, or undecodable).
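The skip-and-report contract can be sketched with toy types. Everything here is hypothetical: `ToyEmbedding`, `ToyIndex`, and `index_stream` are illustrative stand-ins, not the crate's `Embedding` or index types, and rejecting zero-norm vectors is an assumption of this sketch rather than documented behaviour.

```rust
/// Hypothetical embedding record: the model it came from plus raw values.
struct ToyEmbedding {
    model: String,
    vector: Vec<f32>,
}

/// Hypothetical index bound to one (model, dim) pair.
struct ToyIndex {
    model: String,
    dim: usize,
    ids: Vec<u64>,
    data: Vec<f32>,
}

impl ToyIndex {
    /// Returns true if the vector was stored, false if it was skipped.
    fn try_insert(&mut self, id: u64, e: &ToyEmbedding) -> bool {
        if e.model != self.model || e.vector.len() != self.dim {
            return false; // wrong model or wrong dimensionality
        }
        let norm = e.vector.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm == 0.0 || !norm.is_finite() {
            return false; // assumption: unnormalisable vectors are skipped
        }
        self.ids.push(id);
        self.data.extend(e.vector.iter().map(|x| x / norm));
        true
    }
}

/// Feed a mixed stream and count how many entries were actually indexed.
fn index_stream(index: &mut ToyIndex, stream: &[(u64, ToyEmbedding)]) -> usize {
    stream
        .iter()
        .filter(|(id, e)| index.try_insert(*id, e))
        .count()
}
```

Returning a bool instead of an error lets the caller treat mismatches as expected noise while still being able to count or log them.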
pub fn build_from_repo(repo: &ReadonlyRepo, model: &str) -> Result<Self, Error>
Build an index over every node at the repo head whose
embedding under model is present in the per-commit sidecar
(Commit.embeddings Prolly tree, keyed by NodeCid). Nodes
without a sidecar entry under model are silently skipped.
The sidecar is the only source of truth: dense vectors live
in a separate tree so nondeterministic producers (e.g. ORT
thread-count drift) cannot leak into NodeCid and break
federated dedup. Operators with repos authored before the
sidecar shipped must run mnem reindex to lift inline
vectors into the sidecar; until then those vectors are
invisible to retrieval.
Errors
- RepoError::Uninitialized if the repo has no head commit.
- Store / codec errors walking the node tree, decoding nodes, or walking the embedding sidecar.
- crate::error::ObjectError::EmbeddingSizeMismatch if a node carries an embedding whose vector.len() contradicts dim * bytes_per_dtype(dtype).
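The size-mismatch condition amounts to a byte-length check. The sketch below shows one way such a check could look; the `Dtype` variants, the widths returned by this `bytes_per_dtype`, and the `check_size` helper are assumptions made for illustration and may not match the crate's definitions.

```rust
/// Hypothetical element types; the crate's real dtype set may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Dtype {
    F16,
    F32,
    F64,
}

/// Width in bytes of one element of the given dtype.
fn bytes_per_dtype(dtype: Dtype) -> usize {
    match dtype {
        Dtype::F16 => 2,
        Dtype::F32 => 4,
        Dtype::F64 => 8,
    }
}

/// Ok if the raw byte buffer holds exactly `dim` elements of `dtype`,
/// otherwise an error message describing the mismatch.
fn check_size(dim: u32, dtype: Dtype, vector: &[u8]) -> Result<(), String> {
    let expected = dim as usize * bytes_per_dtype(dtype);
    if vector.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "embedding size mismatch: got {} bytes, expected {} ({} x {:?})",
            vector.len(),
            expected,
            dim,
            dtype
        ))
    }
}
```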
Trait Implementations
impl Clone for BruteForceVectorIndex
fn clone(&self) -> BruteForceVectorIndex
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.