Struct VectorIndex 

pub struct VectorIndex { /* private fields */ }
Thread-safe HNSW index over memory embeddings.

Implementations

impl VectorIndex

pub fn build(entries: Vec<(String, Vec<f32>)>) -> Self

Build a new index from a list of (memory_id, embedding) pairs.

pub fn empty() -> Self

Build an empty index.

pub fn set_eviction_sink(&self, sink: Sender<EvictionEvent>)

v0.7.0 (R3-S1) — wire the eviction sink.

The daemon calls this once at startup with the send half of an mpsc channel. A hook-aware observer task drains the receive half off the hot path and fires the on_index_eviction chain (fire_on_index_eviction in src/hooks/chain.rs). Replacing an existing sink is allowed, which is useful when the daemon reconfigures the hook chain at runtime. The replacement drops the prior sender, which terminates the prior observer cleanly.

Build-time / CLI / test builds that never wire a sink retain the None default; the eviction path’s try_send then becomes a no-op short-circuit so there is no measurable cost to leaving the sink unset.
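The sink behavior described above can be sketched with the standard library alone. This is an illustration rather than the crate's implementation: `SinkSlot`, `wire_sink`, `notify_eviction`, and the `memory_id` field are hypothetical names, and `std::sync::mpsc::SyncSender` stands in for whichever `Sender` type the crate actually uses.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::Mutex;

/// Hypothetical payload; the real `EvictionEvent` fields are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct EvictionEvent {
    pub memory_id: String,
}

/// Stand-in for the index's sink slot: `None` until a sink is wired.
pub struct SinkSlot {
    sink: Mutex<Option<SyncSender<EvictionEvent>>>,
}

impl SinkSlot {
    pub fn new() -> Self {
        SinkSlot { sink: Mutex::new(None) }
    }

    /// Install or replace the sink. Overwriting the slot drops the previous
    /// sender, so a receiver on the old channel sees a disconnect.
    pub fn set_eviction_sink(&self, sink: SyncSender<EvictionEvent>) {
        *self.sink.lock().unwrap() = Some(sink);
    }

    /// Non-blocking notification. Returns `true` only if an event was queued;
    /// with no sink wired this is an immediate no-op.
    pub fn notify_eviction(&self, memory_id: &str) -> bool {
        match self.sink.lock().unwrap().as_ref() {
            None => false,
            Some(tx) => tx
                .try_send(EvictionEvent { memory_id: memory_id.to_string() })
                .is_ok(),
        }
    }
}

/// Create a bounded channel, wire its send half into the slot, and hand the
/// receive half back to the caller (the observer side).
pub fn wire_sink(slot: &SinkSlot, capacity: usize) -> Receiver<EvictionEvent> {
    let (tx, rx) = sync_channel(capacity);
    slot.set_eviction_sink(tx);
    rx
}
```

In this sketch an observer thread would loop on `rx.recv()` and exit when it returns `Err`, which happens once the slot is rewired and the old sender is dropped.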

pub fn insert(&self, id: String, embedding: Vec<f32>)

Add a new entry to the index (goes to overflow until next rebuild).

pub fn remove(&self, id: &str)

Remove an entry by ID (marks for exclusion; cleaned up on rebuild).
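A rough sketch of the deferred-work pattern that insert and remove describe: new entries wait in an overflow buffer, removals are recorded as exclusion marks, and both are reconciled at rebuild time. `PendingChanges` and its methods are hypothetical names, and the real index's internal layout is not shown in these docs.

```rust
use std::collections::HashSet;

/// Hypothetical holder for changes that have not reached the graph yet.
pub struct PendingChanges {
    overflow: Vec<(String, Vec<f32>)>,
    removed: HashSet<String>,
}

impl PendingChanges {
    pub fn new() -> Self {
        PendingChanges { overflow: Vec::new(), removed: HashSet::new() }
    }

    /// New entries are appended to the overflow buffer.
    pub fn insert(&mut self, id: String, embedding: Vec<f32>) {
        self.overflow.push((id, embedding));
    }

    /// Removal only records a mark; the stored data is untouched for now.
    pub fn remove(&mut self, id: &str) {
        self.removed.insert(id.to_string());
    }

    /// Ids a search would be allowed to return from the overflow buffer.
    pub fn live_ids(&self) -> Vec<String> {
        self.overflow
            .iter()
            .filter(|(id, _)| !self.removed.contains(id))
            .map(|(id, _)| id.clone())
            .collect()
    }

    /// At rebuild time, hand over the surviving entries and clear both the
    /// overflow buffer and the exclusion marks.
    pub fn drain_for_rebuild(&mut self) -> Vec<(String, Vec<f32>)> {
        let removed = std::mem::take(&mut self.removed);
        std::mem::take(&mut self.overflow)
            .into_iter()
            .filter(|(id, _)| !removed.contains(id))
            .collect()
    }
}
```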

pub fn search(&self, query: &[f32], k: usize) -> Vec<VectorHit>

Search for the k nearest neighbors to the query embedding.

Combines HNSW approximate search with linear scan of overflow entries. Returns results sorted by ascending distance (closest first).
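The merge step can be illustrated as follows. This sketch assumes the graph search has already produced candidate hits and scores overflow entries with squared Euclidean distance; the metric the real index uses is not stated here, and `Hit`, `squared_l2`, and `merge_hits` are hypothetical names (the real result type is `VectorHit`).

```rust
/// Hypothetical hit type standing in for `VectorHit`.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: String,
    pub distance: f32,
}

/// Squared Euclidean distance (an assumed metric for this sketch).
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Merge approximate graph hits with a linear scan of the overflow buffer,
/// then keep the `k` closest, sorted by ascending distance.
pub fn merge_hits(
    graph_hits: Vec<Hit>,
    overflow: &[(String, Vec<f32>)],
    query: &[f32],
    k: usize,
) -> Vec<Hit> {
    let mut hits = graph_hits;
    hits.extend(overflow.iter().map(|(id, embedding)| Hit {
        id: id.clone(),
        distance: squared_l2(query, embedding),
    }));
    // `total_cmp` gives a total order on f32, so NaN cannot panic the sort.
    hits.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    hits.truncate(k);
    hits
}
```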

pub fn len(&self) -> usize

Return the total number of indexed entries (HNSW + overflow).

pub fn is_empty(&self) -> bool

true when the index holds no live entries at all.

#1579 QC: load-bearing for the proactive-conflict dispatch. An empty index is vacuously Self::is_fully_searchable (0 + 0 >= 0). During the async-boot load phase, however, emptiness says nothing about what the DB holds: the daemon has already bound with VectorIndex::empty(), but the boot loader’s seed_entries has not landed yet (the get_all_embeddings read is the long pole at 100k rows). Callers that would otherwise trust a fully searchable index (the #519 conflict check) must also require non-emptiness, so that this window routes to the bounded recency-scan fallback instead of consulting an index that cannot return anything.
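The caller-side rule reduces to a two-condition guard. A minimal sketch, in which `ConflictPath` and `choose_conflict_path` are hypothetical names for the dispatch decision and the two booleans would come from `is_fully_searchable()` and `is_empty()`:

```rust
/// Hypothetical routing decision for the conflict check.
#[derive(Debug, PartialEq)]
pub enum ConflictPath {
    /// Trust the vector index.
    Index,
    /// Fall back to the bounded recency scan.
    RecencyScan,
}

/// Use the index only when it is both fully searchable and non-empty.
pub fn choose_conflict_path(is_fully_searchable: bool, is_empty: bool) -> ConflictPath {
    if is_fully_searchable && !is_empty {
        ConflictPath::Index
    } else {
        ConflictPath::RecencyScan
    }
}
```

An empty index during the boot load phase reports `is_fully_searchable == true` and `is_empty == true`, so this guard sends it to the fallback.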

pub fn rebuild(&self)

#968 — Force a full rebuild of the HNSW index from all entries, SYNCHRONOUSLY. Preserved for tests + emergency paths; production code should call Self::rebuild_async so the multi-second graph build does not block the calling thread.

Implementation: delegates to rebuild_async and joins the resulting handle so callers retain the v0.6 semantics (“the graph is rebuilt by the time this returns”). Tests rely on this blocking behavior to assert post-rebuild invariants without adding a yield/poll loop.
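The delegate-and-join shape can be sketched with a toy index. This simplification writes the built graph straight into place and omits the warming slot; `ToyIndex` and its fields are hypothetical, and propagating a build panic through `expect` is an assumption about error handling, not documented behavior.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Toy index: `graph` is just a copy of `entries` made by the build thread.
pub struct ToyIndex {
    entries: Arc<Mutex<Vec<String>>>,
    graph: Arc<Mutex<Vec<String>>>,
}

impl ToyIndex {
    pub fn new(entries: Vec<String>) -> Self {
        ToyIndex {
            entries: Arc::new(Mutex::new(entries)),
            graph: Arc::new(Mutex::new(Vec::new())),
        }
    }

    /// Start the build on a background thread and return its handle.
    pub fn rebuild_async(&self) -> JoinHandle<()> {
        let entries = Arc::clone(&self.entries);
        let graph = Arc::clone(&self.graph);
        thread::spawn(move || {
            // Snapshot under the lock; the guard is released at the end of
            // this statement, so the "build" runs without holding it.
            let snapshot = entries.lock().unwrap().clone();
            *graph.lock().unwrap() = snapshot;
        })
    }

    /// Blocking variant: delegate to the async path, then join.
    pub fn rebuild(&self) {
        self.rebuild_async().join().expect("rebuild thread panicked");
    }

    pub fn graph_len(&self) -> usize {
        self.graph.lock().unwrap().len()
    }
}
```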

pub fn rebuild_async(&self) -> JoinHandle<()>

#968 — Schedule a full HNSW rebuild on a background thread and return the JoinHandle for callers that want to observe completion. The build does NOT hold the inner mutex; readers and writers continue to operate against active + overflow while the new graph warms up. On success, the warmed graph lands in the warming slot and is swapped into active by the next reader/writer (or by the foreground rebuild shim’s post-join try_swap_warming call).

Concurrency contract:

  • At most one rebuild runs at a time (gated by the rebuild_in_flight atomic). A second rebuild_async call while a build is in flight returns a no-op handle (the spawned closure short-circuits if the CAS fails; the in-flight build will pick up the latest entries via the next trigger).
  • Writes during the build flow into overflow and all_entries normally. The swap path uses the snapshot length captured at spawn time to trim only the overflow entries that are now in the new graph; entries inserted AFTER the snapshot remain in overflow for the next cycle.
  • Search is unaffected: it reads active + overflow under the inner mutex, both of which remain coherent throughout.

Failure: a panic inside the build thread is observable via JoinHandle::join(); active is unchanged. The rebuild_in_flight flag is cleared by the RebuildGuard drop-guard whether the build succeeded or panicked.
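The single-flight gate and the drop guard can be sketched as below. This is a standalone illustration under assumptions: `spawn_gated` is a hypothetical helper, and it returns `JoinHandle<bool>` (whether the build actually ran) purely so the behavior is observable, whereas the real method returns `JoinHandle<()>`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Clears the in-flight flag when dropped, on normal exit or during unwind.
struct RebuildGuard(Arc<AtomicBool>);

impl Drop for RebuildGuard {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Spawn `build` unless another build holds the flag. The thread result is
/// `true` if the build ran and `false` if the CAS lost.
pub fn spawn_gated<F>(in_flight: &Arc<AtomicBool>, build: F) -> JoinHandle<bool>
where
    F: FnOnce() + Send + 'static,
{
    let flag = Arc::clone(in_flight);
    thread::spawn(move || {
        // Claim the flag; a failed CAS means another build is in flight.
        if flag
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_err()
        {
            // No guard is created here, so the winner's flag stays set.
            return false;
        }
        let _guard = RebuildGuard(flag);
        build();
        true
    })
}
```

A panicking `build` surfaces as `Err` from `join()`, and because the guard is dropped during unwinding, the flag is already clear by the time `join()` returns.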

pub fn try_swap_warming(&self) -> bool

#968 — Swap the warming slot into active if a warmed graph is ready. Called opportunistically from search, insert, and the post-join path of the sync rebuild shim. The swap holds the inner mutex for microseconds — just long enough to std::mem::replace the graph and trim the overflow.

Returns true if a swap occurred, false otherwise. Test code uses the return value to verify the swap landed before asserting post-rebuild state.
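A sketch of the swap itself, assuming the warming slot is an `Option` that carries the new graph together with the number of overflow entries it already covers. `SwapDemo`, `Warmed`, `park_warmed`, and the field names are hypothetical, and the graphs are modeled as plain id lists.

```rust
use std::sync::Mutex;

/// A finished build waiting to be swapped in.
struct Warmed {
    graph: Vec<String>,
    /// How many overflow entries existed when the build snapshot was taken.
    snapshot_overflow_len: usize,
}

struct Inner {
    active: Vec<String>,
    overflow: Vec<String>,
    warming: Option<Warmed>,
}

pub struct SwapDemo {
    inner: Mutex<Inner>,
}

impl SwapDemo {
    pub fn new(active: Vec<String>, overflow: Vec<String>) -> Self {
        SwapDemo { inner: Mutex::new(Inner { active, overflow, warming: None }) }
    }

    /// Called by the build thread when its graph is ready.
    pub fn park_warmed(&self, graph: Vec<String>, snapshot_overflow_len: usize) {
        self.inner.lock().unwrap().warming = Some(Warmed { graph, snapshot_overflow_len });
    }

    /// Writes that arrive during a build still go to overflow.
    pub fn insert(&self, id: &str) {
        self.inner.lock().unwrap().overflow.push(id.to_string());
    }

    /// Swap the warmed graph in, if one is parked. Returns whether it did.
    pub fn try_swap_warming(&self) -> bool {
        let mut inner = self.inner.lock().unwrap();
        let warmed = match inner.warming.take() {
            Some(w) => w,
            None => return false,
        };
        let _old_graph = std::mem::replace(&mut inner.active, warmed.graph);
        // Trim only the overflow prefix that the new graph already contains.
        let trim = warmed.snapshot_overflow_len.min(inner.overflow.len());
        drop(inner.overflow.drain(..trim));
        true
    }

    pub fn state(&self) -> (Vec<String>, Vec<String>) {
        let inner = self.inner.lock().unwrap();
        (inner.active.clone(), inner.overflow.clone())
    }
}
```

Entries inserted after the snapshot sit past the trimmed prefix, so they stay in overflow for the next cycle.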

pub fn is_fully_searchable(&self) -> bool

#1579: true when a search against this index can observe every live entry, that is, when the active graph (its build-time snapshot length) plus the linearly scanned overflow cover all_entries. It is false exactly during the async-boot warm window, when Self::seed_entries has parked DB-loaded entries in all_entries but the background graph build has not swapped in yet. Sequenced writes flow through insert() (graph-visible or overflow-visible), so they never break coverage. Removals and evictions only shrink all_entries (stale graph ids are filtered at search time), so the inequality is conservative in the safe direction.

Consumers: the #519 proactive conflict check routes to its bounded-scan fallback while this is false; the boot loader uses it to decide whether a make-up rebuild is needed after a racing routine rebuild swallowed its CAS.
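The coverage inequality can be written out as a free function over the three lengths. This is a sketch only: the parameter names are descriptive guesses at the quantities the text mentions, not the crate's field names.

```rust
/// True when graph coverage plus overflow coverage reaches every live entry.
pub fn is_fully_searchable(
    graph_snapshot_len: usize,
    overflow_len: usize,
    all_entries_len: usize,
) -> bool {
    graph_snapshot_len + overflow_len >= all_entries_len
}
```

For example, right after seeding 100,000 rows into an empty index the check is `0 + 0 >= 100_000`, which is false. Once the rebuilt graph swaps in, it becomes `100_000 + 0 >= 100_000`, which is true. A later removal only lowers the right-hand side, so the result stays true.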

pub fn seed_entries(&self, entries: Vec<(String, Vec<f32>)>) -> usize

#1579 B3 — bulk-load DB-resident entries into the index WITHOUT building the graph (the async-boot path). Entries land in all_entries only; they become searchable when the follow-up rebuild (see Self::seed_and_rebuild_async) swaps its graph in. Ids already present (e.g. a row written through insert() between the caller’s DB snapshot and this call) are skipped so the index never double-counts. Returns the number of entries actually seeded.

Deliberately does NOT enforce max_entries eviction here — the legacy synchronous boot path (VectorIndex::build over get_all_embeddings) never evicted at boot either, and the first post-boot insert() applies the cap exactly as before.
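The skip-if-present rule can be sketched with a map keyed by id. The use of `HashMap` for all_entries is an assumption, since the real storage type is not shown; the free function below only mirrors the documented return value (the number of entries actually seeded).

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Insert each (id, embedding) pair unless the id is already present.
/// Returns how many entries were actually added.
pub fn seed_entries(
    all_entries: &mut HashMap<String, Vec<f32>>,
    entries: Vec<(String, Vec<f32>)>,
) -> usize {
    let mut seeded = 0;
    for (id, embedding) in entries {
        // An occupied slot means the id is already known, for example via an
        // earlier insert() or a duplicate in the batch; keep the existing
        // value and do not count it.
        if let Entry::Vacant(slot) = all_entries.entry(id) {
            slot.insert(embedding);
            seeded += 1;
        }
    }
    seeded
}
```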

pub fn seed_and_rebuild_async(&self, entries: Vec<(String, Vec<f32>)>) -> JoinHandle<()>

#1579 B3 — async-boot warm-up: seed DB-loaded entries (see Self::seed_entries) and schedule the graph build on the existing #968 double-buffer rebuild machinery. Returns the rebuild thread’s JoinHandle; the caller (the boot loader) joins it off the request path and then calls Self::try_swap_warming + emits the operator-visible “index warm” line.

If a routine rebuild is already in flight (its snapshot predates the seed), rebuild_async returns a no-op handle and the seeded entries stay graph-invisible until a later rebuild; the boot loader detects that via Self::is_fully_searchable and issues a make-up rebuild.

pub fn warm_boot(&self, entries: Vec<(String, Vec<f32>)>) -> usize

#1579 B3 — blocking boot warm-up for callers that hold the index directly (the MCP stdio boot thread; tests). Seeds the DB-loaded entries and drives rebuild→swap to completion, returning the number of entries seeded. Each step takes the inner mutex only briefly (the graph build itself runs on the #968 background thread against a snapshot), so concurrent readers/writers on other threads keep making progress — the CALLING thread is the only one parked.

The retry loop covers the rebuild-CAS race: if a routine 200-overflow rebuild was already in flight when our seed landed, rebuild_async short-circuits to a no-op handle and the in-flight build’s pre-seed snapshot cannot cover the seeded rows — is_fully_searchable stays false and the loop schedules a make-up rebuild once the CAS frees up.
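The shape of the retry loop can be sketched generically. The two closures stand in for "schedule a rebuild, join it, and try the swap" and for `is_fully_searchable`; `warm_boot_loop` and the `max_attempts` bound are hypothetical additions for this illustration and are not described in the source.

```rust
/// Drive rebuild-then-swap until the index reports full coverage.
/// Returns the number of rebuild attempts that were made.
pub fn warm_boot_loop<R, S>(
    mut rebuild_and_swap: R,
    mut is_fully_searchable: S,
    max_attempts: usize,
) -> usize
where
    R: FnMut(),
    S: FnMut() -> bool,
{
    let mut attempts = 0;
    while attempts < max_attempts {
        // One attempt: the rebuild may be a no-op if it lost the CAS race.
        rebuild_and_swap();
        attempts += 1;
        // A build whose snapshot predates the seed leaves coverage
        // incomplete, so the loop goes around again.
        if is_fully_searchable() {
            break;
        }
    }
    attempts
}
```

With a first attempt that loses the race and a second that covers the seeded rows, the loop returns after two attempts.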

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.