Struct VectorIndex

pub struct VectorIndex { /* private fields */ }

Thread-safe HNSW index over memory embeddings.

Implementations

impl VectorIndex

pub fn build(entries: Vec<(String, Vec<f32>)>) -> Self

Build a new index from a list of (memory_id, embedding) pairs.

pub fn empty() -> Self

Build an empty index.

pub fn set_eviction_sink(&self, sink: Sender<EvictionEvent>)

v0.7.0 (R3-S1) — wire the eviction sink.

The daemon calls this once at startup with the send-half of an mpsc channel; a hook-aware observer task drains the recv-half off the hot path and fires the on_index_eviction chain (fire_on_index_eviction in src/hooks/chain.rs). Replacing an existing sink is allowed — useful when the daemon reconfigures the hook chain at runtime — and drops the prior sender, which terminates the prior observer cleanly.

Build-time / CLI / test builds that never wire a sink retain the None default; the eviction path’s try_send then becomes a no-op short-circuit so there is no measurable cost to leaving the sink unset.
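The optional-sink behavior described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: `SinkHolder`, `notify_eviction`, `wire_sink`, and the `memory_id` field are hypothetical names, and the sketch uses `std::sync::mpsc::SyncSender` because it offers a non-blocking `try_send`, whereas the channel type behind the real `Sender<EvictionEvent>` is not shown in these docs.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Mutex;

/// Hypothetical event payload; the real `EvictionEvent` fields are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct EvictionEvent {
    pub memory_id: String,
}

/// Hypothetical stand-in for the part of the index that owns the sink.
#[derive(Default)]
pub struct SinkHolder {
    sink: Mutex<Option<SyncSender<EvictionEvent>>>,
}

impl SinkHolder {
    /// Install or replace the sink. Dropping the previous sender disconnects
    /// the previous receiver once no other clones of that sender remain.
    pub fn set_eviction_sink(&self, sink: SyncSender<EvictionEvent>) {
        *self.sink.lock().unwrap() = Some(sink);
    }

    /// Report an eviction without blocking. Returns true only if the event was queued.
    pub fn notify_eviction(&self, memory_id: &str) -> bool {
        let guard = self.sink.lock().unwrap();
        let Some(tx) = guard.as_ref() else {
            return false; // no sink wired: nothing to do
        };
        match tx.try_send(EvictionEvent { memory_id: memory_id.to_string() }) {
            Ok(()) => true,
            // Full queue or vanished observer: drop the event rather than block.
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
        }
    }
}

/// Create a bounded channel, wire its send-half, and return the recv-half
/// for an observer to drain.
pub fn wire_sink(holder: &SinkHolder, capacity: usize) -> Receiver<EvictionEvent> {
    let (tx, rx) = sync_channel(capacity);
    holder.set_eviction_sink(tx);
    rx
}
```

In this sketch, calling `wire_sink` a second time drops the first sender, so a loop draining the first receiver sees a disconnect and can exit. That is one way the "terminates the prior observer cleanly" behavior could arise.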

pub fn insert(&self, id: String, embedding: Vec<f32>)

Add a new entry to the index (goes to overflow until next rebuild).

pub fn remove(&self, id: &str)

Remove an entry by ID (marks for exclusion; cleaned up on rebuild).

pub fn search(&self, query: &[f32], k: usize) -> Vec<VectorHit>

Search for the k nearest neighbors to the query embedding.

Combines HNSW approximate search with linear scan of overflow entries. Returns results sorted by ascending distance (closest first).
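The merge step can be illustrated as follows. This is a minimal sketch, not the crate's code: `Hit` stands in for `VectorHit` (whose fields are not shown in these docs), `merge_with_overflow` is a hypothetical helper, and squared Euclidean distance is an assumed metric chosen only for the example.

```rust
/// Hypothetical hit type; the real `VectorHit` fields are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: String,
    pub distance: f32,
}

/// Squared Euclidean distance (an assumed metric, for illustration only).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Merge candidate hits from an approximate graph search with an exact linear
/// scan of the overflow buffer, then keep the `k` closest.
pub fn merge_with_overflow(
    graph_hits: Vec<Hit>,
    overflow: &[(String, Vec<f32>)],
    query: &[f32],
    k: usize,
) -> Vec<Hit> {
    let mut hits = graph_hits;
    hits.extend(overflow.iter().map(|(id, emb)| Hit {
        id: id.clone(),
        distance: squared_l2(query, emb),
    }));
    // Ascending distance: closest first. `total_cmp` gives a total order on f32.
    hits.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    hits.truncate(k);
    hits
}
```

Because the overflow scan is exact while the graph search is approximate, an entry inserted since the last rebuild can outrank graph hits, as long as both sides use the same distance.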

pub fn len(&self) -> usize

Return the total number of indexed entries (HNSW + overflow).

pub fn is_empty(&self) -> bool

true when the index holds no live entries at all.

#1579 QC: load-bearing for the proactive-conflict dispatch. An empty index is vacuously Self::is_fully_searchable (0 + 0 >= 0). During the async-boot LOAD phase, however, emptiness says nothing about what the DB holds: that phase runs after the daemon binds with VectorIndex::empty() and before the boot loader’s seed_entries lands (the get_all_embeddings read is the long pole at 100k rows). Callers that would otherwise trust a fully-searchable index (the #519 conflict check) must also require non-emptiness, so that this window routes to the bounded recency-scan fallback instead of consulting an index that cannot return anything.
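The caller-side rule reduces to a two-condition gate. The sketch below is illustrative only: `Route` and `choose_route` are hypothetical names, and the two booleans stand for the results of `is_fully_searchable()` and `is_empty()` read from the index.

```rust
/// Hypothetical routing decision for a caller such as the conflict check.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// Trust the vector index.
    Index,
    /// Fall back to the bounded recency scan.
    BoundedScan,
}

/// Use the index only when it covers every live entry AND holds at least one.
/// An empty index passes the coverage test vacuously, so it is rejected here.
pub fn choose_route(fully_searchable: bool, is_empty: bool) -> Route {
    if fully_searchable && !is_empty {
        Route::Index
    } else {
        Route::BoundedScan
    }
}
```

During the LOAD phase the inputs would be `(true, true)`, which this gate sends to the bounded scan.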

pub fn rebuild(&self)

#968 — Force a full rebuild of the HNSW index from all entries, SYNCHRONOUSLY. Preserved for tests + emergency paths; production code should call Self::rebuild_async so the multi-second graph build does not block the calling thread.

Implementation: delegates to rebuild_async and joins the resulting handle so callers retain the v0.6 semantics (“the graph is rebuilt by the time this returns”). Tests rely on this blocking behavior to assert post-rebuild invariants without adding a yield/poll loop.

pub fn rebuild_async(&self) -> JoinHandle<()>

#968 — Schedule a full HNSW rebuild on a background thread and return the JoinHandle for callers that want to observe completion. The build does NOT hold the inner mutex; readers and writers continue to operate against active + overflow while the new graph warms up. On success, the warmed graph lands in the warming slot and is swapped into active by the next reader/writer (or by the foreground rebuild shim’s post-join try_swap_warming call).

Concurrency contract:

- At most one rebuild runs at a time (gated by the rebuild_in_flight atomic). A second rebuild_async call while a build is in flight returns a no-op handle (the spawned closure short-circuits if the CAS fails; the in-flight build will pick up the latest entries via the next trigger).
- Writes during the build flow into overflow and all_entries normally. The swap path uses the snapshot length captured at spawn time to trim only the overflow entries that are now in the new graph; entries inserted AFTER the snapshot remain in overflow for the next cycle.
- Search is unaffected: it reads active + overflow under the inner mutex, both of which remain coherent throughout.

Failure: a panic inside the build thread is observable via JoinHandle::join(); active is unchanged. The rebuild_in_flight flag is cleared by the RebuildGuard drop-guard whether the build succeeded or panicked.
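The single-flight gate and the drop-guard can be sketched with the standard library. This is an illustration under assumptions rather than the crate's code: `spawn_gated` is a hypothetical helper, its `RebuildGuard` merely borrows the name used above, and it returns `JoinHandle<bool>` (whether this call ran the build) purely so the outcome is observable, whereas `rebuild_async` returns `JoinHandle<()>`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Clears the in-flight flag when dropped, including during panic unwinding.
struct RebuildGuard(Arc<AtomicBool>);

impl Drop for RebuildGuard {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Spawn `build` on a background thread unless another build already holds the flag.
/// A losing caller still gets a handle, but its closure returns immediately.
pub fn spawn_gated<F>(in_flight: &Arc<AtomicBool>, build: F) -> JoinHandle<bool>
where
    F: FnOnce() + Send + 'static,
{
    let flag = Arc::clone(in_flight);
    thread::spawn(move || {
        // Claim the flag: false -> true. Failure means a build is already running.
        if flag
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_err()
        {
            return false;
        }
        // From here on the flag is released on every exit path.
        let _guard = RebuildGuard(flag);
        build();
        true
    })
}
```

The guard is created only after the CAS succeeds, so a caller that loses the race never clears a flag it does not own. A panicking `build` surfaces as an `Err` from `join()`, while unwinding still runs the guard's `Drop`.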

pub fn try_swap_warming(&self) -> bool

#968 — Swap the warming slot into active if a warmed graph is ready. Called opportunistically from search, insert, and the post-join path of the sync rebuild shim. The swap holds the inner mutex for microseconds — just long enough to std::mem::replace the graph and trim the overflow.

Returns true if a swap occurred, false otherwise. Test code uses the return value to verify the swap landed before asserting post-rebuild state.
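A minimal sketch of the swap-and-trim step, under stated assumptions: `Graph`, `Warmed`, and `Inner` are hypothetical stand-ins for the private fields, the overflow is modeled as a list of ids in insertion order, and the snapshot is assumed to have absorbed a prefix of that list.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a built graph: just the ids it covers.
#[derive(Debug, Default)]
pub struct Graph {
    pub ids: Vec<String>,
}

/// A warmed graph plus the number of overflow entries its snapshot absorbed.
pub struct Warmed {
    pub graph: Graph,
    pub overflow_absorbed: usize,
}

/// Hypothetical mutex-protected state of the index.
#[derive(Default)]
pub struct Inner {
    pub active: Graph,
    pub overflow: Vec<String>,
    pub warming: Option<Warmed>,
}

/// Swap the warmed graph into `active` if one is ready; returns whether a swap happened.
pub fn try_swap_warming(inner: &Mutex<Inner>) -> bool {
    let mut guard = inner.lock().unwrap();
    let Some(warmed) = guard.warming.take() else {
        return false; // nothing warmed yet
    };
    // Replace the graph; the old one is dropped when `_old` goes out of scope.
    let _old = std::mem::replace(&mut guard.active, warmed.graph);
    // Trim only the overflow prefix that the new graph already contains.
    let n = warmed.overflow_absorbed.min(guard.overflow.len());
    guard.overflow.drain(..n);
    true
}
```

Entries appended to the overflow after the snapshot sit beyond the trimmed prefix, so they stay linearly searchable until the next rebuild.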

pub fn is_fully_searchable(&self) -> bool

#1579: true when a search against this index can observe every live entry, i.e. the active graph (its build-time snapshot length) plus the linearly scanned overflow cover all_entries. It is false exactly during the async-boot warm window, when Self::seed_entries has parked DB-loaded entries in all_entries but the background graph build has not swapped in yet. Sequenced writes flow through insert() (graph- or overflow-visible), so they never break coverage, and removals/evictions only shrink all_entries (stale graph ids are filtered at search time), so the inequality is conservative in the safe direction.

Consumers: the #519 proactive conflict check routes to its bounded-scan fallback while this is false; the boot loader uses it to decide whether a make-up rebuild is needed after a racing routine rebuild swallowed its CAS.
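The coverage inequality can be traced with plain counters. The model below is a sketch, not the crate's bookkeeping: `Coverage` and its methods are hypothetical, and `swap_in_rebuild` assumes no write raced the build, so the new graph covers everything.

```rust
/// Hypothetical counters mirroring the quantities named above.
#[derive(Debug, Default, Clone, Copy)]
pub struct Coverage {
    /// Entries captured in the active graph's build-time snapshot.
    pub graph_snapshot_len: usize,
    /// Entries waiting in the linearly scanned overflow buffer.
    pub overflow_len: usize,
    /// All live entries the index knows about.
    pub all_entries_len: usize,
}

impl Coverage {
    /// Graph snapshot plus overflow must cover every live entry.
    pub fn is_fully_searchable(&self) -> bool {
        self.graph_snapshot_len + self.overflow_len >= self.all_entries_len
    }

    /// Bulk seed: entries become live but are not yet searchable.
    pub fn seed(&mut self, n: usize) {
        self.all_entries_len += n;
    }

    /// Ordinary insert: live and immediately visible through the overflow.
    pub fn insert(&mut self) {
        self.overflow_len += 1;
        self.all_entries_len += 1;
    }

    /// Removal only shrinks the live set; the graph snapshot length is unchanged.
    pub fn remove(&mut self) {
        self.all_entries_len = self.all_entries_len.saturating_sub(1);
    }

    /// Simplified swap: assumes the rebuilt graph covers every live entry.
    pub fn swap_in_rebuild(&mut self) {
        self.graph_snapshot_len = self.all_entries_len;
        self.overflow_len = 0;
    }
}
```

In this model only `seed` can push the right-hand side ahead of the left, which matches the claim that the check is false only during the warm window.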

pub fn seed_entries(&self, entries: Vec<(String, Vec<f32>)>) -> usize

#1579 B3 — bulk-load DB-resident entries into the index WITHOUT building the graph (the async-boot path). Entries land in all_entries only; they become searchable when the follow-up rebuild (see Self::seed_and_rebuild_async) swaps its graph in. Ids already present (e.g. a row written through insert() between the caller’s DB snapshot and this call) are skipped so the index never double-counts. Returns the number of entries actually seeded.

Deliberately does NOT enforce max_entries eviction here — the legacy synchronous boot path (VectorIndex::build over get_all_embeddings) never evicted at boot either, and the first post-boot insert() applies the cap exactly as before.
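The skip-if-present seeding rule might look like the following sketch. It models `all_entries` as a `HashMap` keyed by memory id, which is an assumption about the private storage, and the free function `seed_entries` here is a hypothetical analogue of the method.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Insert each (id, embedding) pair unless the id is already present.
/// Returns how many entries were actually added.
pub fn seed_entries(
    all_entries: &mut HashMap<String, Vec<f32>>,
    entries: Vec<(String, Vec<f32>)>,
) -> usize {
    let mut seeded = 0;
    for (id, embedding) in entries {
        // A vacant slot means the id is new; an occupied slot is left untouched,
        // so a row written after the DB snapshot keeps its newer embedding.
        if let Entry::Vacant(slot) = all_entries.entry(id) {
            slot.insert(embedding);
            seeded += 1;
        }
    }
    seeded
}
```

Note that no capacity check appears in the loop, in line with the statement that seeding does not enforce max_entries.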

pub fn seed_and_rebuild_async(&self, entries: Vec<(String, Vec<f32>)>) -> JoinHandle<()>

#1579 B3 — async-boot warm-up: seed DB-loaded entries (see Self::seed_entries) and schedule the graph build on the existing #968 double-buffer rebuild machinery. Returns the rebuild thread’s JoinHandle; the caller (the boot loader) joins it off the request path and then calls Self::try_swap_warming + emits the operator-visible “index warm” line.

If a routine rebuild is already in flight (its snapshot predates the seed), rebuild_async returns a no-op handle and the seeded entries stay graph-invisible until a later rebuild; the boot loader detects that via Self::is_fully_searchable and issues a make-up rebuild.

pub fn warm_boot(&self, entries: Vec<(String, Vec<f32>)>) -> usize

#1579 B3 — blocking boot warm-up for callers that hold the index directly (the MCP stdio boot thread; tests). Seeds the DB-loaded entries and drives rebuild→swap to completion, returning the number of entries seeded. Each step takes the inner mutex only briefly (the graph build itself runs on the #968 background thread against a snapshot), so concurrent readers/writers on other threads keep making progress — the CALLING thread is the only one parked.

The retry loop covers the rebuild-CAS race: if a routine 200-overflow rebuild was already in flight when our seed landed, rebuild_async short-circuits to a no-op handle and the in-flight build’s pre-seed snapshot cannot cover the seeded rows — is_fully_searchable stays false and the loop schedules a make-up rebuild once the CAS frees up.

Auto Trait Implementations

impl UnwindSafe for VectorIndex

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.