Struct Database 

pub struct Database { /* private fields */ }

An in-process Quiver database over one data directory.

Implementations§

impl Database

pub fn open(dir: &Path) -> Result<Self>

Open (creating if absent) the database at dir with encryption-at-rest disabled, rebuilding each collection’s index from the store.

pub fn open_with_codec(dir: &Path, codec: Box<dyn PageCodec>) -> Result<Self>

Open the database with a specific page codec — used to enable encryption-at-rest by passing quiver-crypto’s AEAD codec. Mirrors quiver_core::Store::open_with_codec; the codec seals both paged files and the WAL, so no plaintext user data reaches the disk.

pub fn open_with_keyring(dir: &Path, keyring: Box<dyn KeyRing>) -> Result<Self>

Open the database with a KeyRing, the seam that lets quiver-crypto’s envelope key-ring seal each collection under its own data-encryption key (enabling crypto-shredding). Mirrors quiver_core::Store::open_with_keyring.

pub fn create_collection(&mut self, name: &str, descriptor: Descriptor) -> Result<()>

Create a collection. Errors if the name already exists, or if the index specification is unsupported for the metric.

pub fn drop_collection(&mut self, name: &str) -> Result<bool>

Drop a collection and its data. Returns whether it existed.

pub fn shred_collection(&mut self, name: &str) -> Result<bool>

Crypto-shred a collection: drop it and destroy its data-encryption key, so its sealed data is unrecoverable even with the master key, then reclaim its files. Mirrors quiver_core::Store::shred_collection; with an envelope key-ring this is irreversible erasure, with a single-codec key-ring it is drop plus a checkpoint. Returns whether it existed.
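
The idea behind crypto-shredding can be sketched with a toy key-ring. This is an illustration only, not quiver-crypto's implementation: ToyKeyRing and its methods are hypothetical names, the XOR step is a placeholder for a real AEAD and offers no security, and the wrapping of each data-encryption key under a master key is omitted. What it shows is that once a collection's key is destroyed, its sealed bytes can no longer be opened.

```rust
use std::collections::HashMap;

/// Hypothetical toy key-ring: one data-encryption key (DEK) per collection.
struct ToyKeyRing {
    deks: HashMap<String, Vec<u8>>,
}

impl ToyKeyRing {
    /// XOR with the repeating DEK. A placeholder for a real AEAD, NOT secure.
    fn xor(key: &[u8], data: &[u8]) -> Vec<u8> {
        data.iter().zip(key.iter().cycle()).map(|(d, k)| d ^ k).collect()
    }

    /// Seal plaintext under the collection's DEK, if the DEK still exists.
    fn seal(&self, collection: &str, plaintext: &[u8]) -> Option<Vec<u8>> {
        self.deks.get(collection).map(|key| Self::xor(key, plaintext))
    }

    /// Open sealed bytes; returns None once the DEK has been destroyed.
    fn open(&self, collection: &str, sealed: &[u8]) -> Option<Vec<u8>> {
        self.deks.get(collection).map(|key| Self::xor(key, sealed))
    }

    /// Destroy the DEK. Returns whether the collection had one.
    fn shred(&mut self, collection: &str) -> bool {
        self.deks.remove(collection).is_some()
    }
}
```

As with shred_collection, the toy shred reports whether the collection existed, so a second call on the same name returns false.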

pub fn set_commit_observer(&mut self, observer: CommitObserver)

Install a replication commit observer, invoked with each committed WalEntry in commit order (ADR-0030). The server uses this to drive a leader’s replication stream.

pub fn replication_snapshot(&self) -> Result<Vec<WalOp>>

The operations that recreate the current logical state, for a replication follower to bootstrap from (ADR-0030).

§Errors

Propagates a store read error.

pub fn apply_replicated(&mut self, op: WalOp) -> Result<()>

Apply a replicated operation from a leader (ADR-0030): persist and apply it to the store (preserving the leader’s collection id), then reconcile the in-memory index handles — register a new collection, drop a removed one, or mark a touched collection’s index stale so the next read rebuilds from the replicated state.

§Errors

Propagates a store apply error.

pub fn collection_names(&self) -> Vec<String>

Names of all collections, sorted.

pub fn descriptor(&self, name: &str) -> Option<&Descriptor>

The descriptor of a collection, if it exists.

pub fn len(&self, name: &str) -> Result<usize>

Number of live points in a collection.

pub fn is_empty(&self, name: &str) -> Result<bool>

Whether a collection has no points.

pub fn upsert(&mut self, collection: &str, id: &str, vector: &[f32], payload: &Value) -> Result<()>

Insert or replace a point with a JSON payload.

pub fn upsert_batch(&mut self, collection: &str, points: &[(&str, &[f32], &Value)]) -> Result<u64>

Upsert a batch of points with a single WAL fdatasync (ADR-0038).

points is a slice of (id, vector, payload) tuples. The batch is committed atomically: all points or none, from the client’s perspective. This is the preferred path for the REST POST /v1/collections/{c}/points handler, which already delivers a batch per HTTP request.

pub fn upsert_bulk(&mut self, collection: &str, points: &[(&str, &[f32], &Value)]) -> Result<u64>

Upsert a large batch for a bulk load, deferring all index work to a single rebuild pass (ADR-0045).

Like upsert_batch the points are committed with one WAL fdatasync, but instead of folding each point into the in-memory index one at a time, the collection’s index is marked stale so the next search rebuilds it in a single pass over the whole collection — far cheaper for a fresh load (one k-means for IVF, one graph build for Vamana, one inverted-index scan) than N incremental inserts. Prefer upsert_batch for steady-state writes where query-after-write latency matters.

pub fn delete(&mut self, collection: &str, id: &str) -> Result<bool>

Delete a point by id. Returns whether it existed.

pub fn get(&self, collection: &str, id: &str) -> Result<Option<Match>>

Fetch a single point by id, with its payload and vector.

pub fn fetch(&self, collection: &str, filter: Option<&Filter>, limit: usize, with_payload: bool, with_vector: bool) -> Result<Vec<Match>>

Fetch points without ranking — an optional cleartext payload filter narrows the set and limit bounds it. This is the retrieval path for a client-side-encrypted collection (ADR-0032): the server returns the entitled set (each point’s payload carries the sealed vector blob under the reserved __quiver_vec__ key) and the client decrypts and ranks locally. It also serves as a general “list points” primitive for any single-vector collection.

Results come in the store’s scan order, not by relevance. The filter is re-checked exactly against each candidate. Today this is a scan; a selective filter could use the secondary index in future.

§Errors

Errors if the collection does not exist or is multi-vector.
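
The unranked "filter, then bound by limit, in scan order" behaviour of fetch can be sketched over an in-memory slice. This is a simplified model, not the crate's code: Point, its tenant field, and the closure-based filter are hypothetical stand-ins for the stored rows, the JSON payload, and the Filter type.

```rust
/// Hypothetical stored point: an id plus one payload field.
struct Point {
    id: String,
    tenant: String,
}

/// Return up to `limit` points in scan order, keeping only those that pass
/// the optional filter. No ranking is applied.
fn fetch<'a>(
    points: &'a [Point],
    filter: Option<&dyn Fn(&Point) -> bool>,
    limit: usize,
) -> Vec<&'a Point> {
    points
        .iter()
        .filter(|p| match filter {
            Some(f) => f(*p), // re-check the predicate on every candidate
            None => true,     // no filter: every point is entitled
        })
        .take(limit)
        .collect()
}
```

Because the scan stops as soon as limit matches are collected, the result is the first limit matching points in storage order, not the best ones.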

pub fn ensure_indexed(&mut self, collection: &str) -> Result<()>

Rebuild a collection’s index if a prior write deferred it (the stale flag), making the collection’s read snapshot current. Idempotent and cheap when already fresh. Separating this &mut self maintenance from the &self *_snapshot reads is what lets a server serve concurrent reads behind a shared lock and take the exclusive lock only for the rare rebuild (ADR-0057).
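
The locking pattern described above can be sketched with std::sync::RwLock. This is a minimal model under stated assumptions, not the server's code: Collection, contains_snapshot, and read are hypothetical names, and a sorted Vec stands in for the real index. Reads take the shared lock; the exclusive lock is taken only when the stale flag is set.

```rust
use std::sync::RwLock;

/// Hypothetical collection: raw rows, a derived "index", and a stale flag.
struct Collection {
    rows: Vec<i32>,
    sorted: Vec<i32>,
    stale: bool,
}

impl Collection {
    /// Rebuild the index if a write deferred it. Idempotent when fresh.
    fn ensure_indexed(&mut self) {
        if self.stale {
            self.sorted = self.rows.clone();
            self.sorted.sort_unstable();
            self.stale = false;
        }
    }

    /// Read-only lookup against the current index (needs only &self).
    fn contains_snapshot(&self, x: i32) -> bool {
        self.sorted.binary_search(&x).is_ok()
    }
}

/// Serve a read: shared lock on the fast path, exclusive lock only to rebuild.
fn read(db: &RwLock<Collection>, x: i32) -> bool {
    {
        let guard = db.read().unwrap();
        if !guard.stale {
            return guard.contains_snapshot(x);
        }
    } // shared lock released here, before the exclusive lock is requested
    let mut guard = db.write().unwrap();
    guard.ensure_indexed();
    guard.contains_snapshot(x)
}
```

Calling ensure_indexed again under the write lock is safe even if another thread rebuilt in between, because the call does nothing when the index is already fresh.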

pub fn set_mvcc_reads(&mut self, on: bool)

Enable or disable lock-free MVCC reads (ADR-0064) at runtime — the server sets this from QUIVER_MVCC_READS. Default off: the proven RwLock read path stays the default until MVCC is loom- and benchmark-validated (increments 2–3). Applies to every loaded collection; a collection’s first rebuild after enabling publishes its base snapshot.

pub fn mvcc_reads(&self) -> bool

Whether lock-free MVCC reads are enabled (ADR-0064).

pub fn collection_snapshot(&self, collection: &str) -> Result<SnapshotCell>

The lock-free serving snapshot cell for a collection (ADR-0064), for a reader to load() without taking any lock — the basis of reads that proceed during writes. Only republished for an MVCC-served collection (single-vector, server-searchable, in-memory index, with MVCC enabled); otherwise it stays the initial empty snapshot and callers use the locked Database::search path. The cell outlives &self, so a reader can hold and re-load it.

§Errors

Returns Error::CollectionNotFound if the collection is not loaded.

pub fn mvcc_cell(&self, collection: &str) -> Result<Option<SnapshotCell>>

The lock-free serving snapshot cell only for a collection that is currently MVCC-served (the flag is on and the collection is single-vector, server-searchable, and in-memory); otherwise None. A server caches the returned cell once and load()s it to serve pure-vector reads with no lock (ADR-0064 increment 3) — the cell self-updates as the writer republishes, so it never needs re-fetching under the lock.

§Errors

Returns Error::CollectionNotFound if the collection is not loaded.

pub fn needs_rebuild(&self, collection: &str) -> Result<bool>

Whether a collection’s index is stale — a prior write deferred its rebuild. The server reads this to schedule an off-lock rebuild (ADR-0062) without holding the exclusive lock; embedded callers never need it (the &mut self searches rebuild synchronously via Database::ensure_indexed).

pub fn snapshot_rebuild_inputs(&self, collection: &str) -> Result<Option<RebuildInputs>>

Capture everything an off-lock rebuild needs (ADR-0062): under the shared read lock the caller already holds, scan the collection’s live rows and record the write generation. Returns None when the index is already fresh (nothing to rebuild). The expensive build then runs with no lock via RebuildInputs::build, and Database::commit_rebuild installs it.

pub fn commit_rebuild(&mut self, rebuilt: RebuiltIndex) -> Result<bool>

Install an index built off-lock (ADR-0062) under the brief exclusive lock the caller holds. Returns whether the collection is still stale: if a write landed during the build (the write generation advanced), the fresh index is already behind, so it is installed (still newer than the prior snapshot) but the handle stays stale for the next rebuild. A collection dropped or replaced during the build is ignored (Ok(false)) — the build is discarded.
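
The capture, build, commit cycle and its write-generation check can be modelled in a few lines. This sketch is an assumption-laden illustration, not the crate's implementation: Handle, Inputs, Rebuilt, and build are hypothetical stand-ins for the collection handle, RebuildInputs, RebuiltIndex, and RebuildInputs::build, a sorted Vec plays the role of the index, and the dropped-collection case is left out.

```rust
/// Hypothetical collection handle with a write generation counter.
struct Handle {
    rows: Vec<u32>,
    index: Vec<u32>,
    generation: u64,
    stale: bool,
}

/// What the off-lock build needs: the live rows and the generation they reflect.
struct Inputs {
    rows: Vec<u32>,
    generation: u64,
}

/// The built index, tagged with the generation it was built from.
struct Rebuilt {
    index: Vec<u32>,
    generation: u64,
}

impl Handle {
    /// A write advances the generation and defers the index update.
    fn write(&mut self, row: u32) {
        self.rows.push(row);
        self.generation += 1;
        self.stale = true;
    }

    /// Capture rebuild inputs; None when the index is already fresh.
    fn snapshot_inputs(&self) -> Option<Inputs> {
        if !self.stale {
            return None;
        }
        Some(Inputs { rows: self.rows.clone(), generation: self.generation })
    }

    /// Install the built index. Returns whether the handle is still stale,
    /// which happens when a write landed while the build was running.
    fn commit(&mut self, rebuilt: Rebuilt) -> bool {
        self.index = rebuilt.index;
        self.stale = rebuilt.generation != self.generation;
        self.stale
    }
}

/// The expensive step, run without any lock in the real design.
fn build(inputs: Inputs) -> Rebuilt {
    let mut index = inputs.rows;
    index.sort_unstable();
    Rebuilt { index, generation: inputs.generation }
}
```

If a write arrives between snapshot_inputs and commit, commit returns true and the caller simply schedules another cycle; the index installed in the meantime is still newer than the one it replaced.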

pub fn search(&mut self, collection: &str, query: &[f32], params: &SearchParams) -> Result<Vec<Match>>

Search a collection for the nearest points to query, optionally post-filtered by a payload predicate.

pub fn search_snapshot(&self, collection: &str, query: &[f32], params: &SearchParams) -> Result<Vec<Match>>

Search a collection’s current immutable snapshot for the nearest points to query, optionally post-filtered by payload predicate. Takes &self, so many readers run concurrently. When a prior write deferred this collection’s rebuild, it serves the prior snapshot (a snapshot-isolated, slightly stale read — ADR-0062/0053); the caller schedules a rebuild via Database::needs_rebuild.

Hybrid search (ADR-0043/0046): fuse up to three rankings with Reciprocal Rank Fusion — a dense ANN ranking, a sparse inverted-index dot-product ranking (sparse_query), and a BM25 full-text ranking (text_query, scored over the same inverted index). Any may be None; at least one is required, giving pure dense / sparse / lexical or any blend through the same path. The same payload filter is re-checked on every side, so results stay exact. rrf_k0 is the RRF rank-bias constant (DEFAULT_RRF_K0).
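
Reciprocal Rank Fusion itself is small enough to sketch. In its usual form each id scores the sum of 1 / (k0 + rank) over the rankings that contain it, with rank starting at 1. The code below is an illustration under assumptions, not the crate's fusion code: rrf_fuse is a hypothetical name, and ASSUMED_RRF_K0 = 60 is the value conventionally used in the literature, which may differ from DEFAULT_RRF_K0.

```rust
use std::collections::HashMap;

/// Conventional RRF constant; an assumption, not necessarily DEFAULT_RRF_K0.
const ASSUMED_RRF_K0: f32 = 60.0;

/// Fuse ranked id lists (best first). An id missing from a ranking simply
/// contributes nothing for that ranking.
fn rrf_fuse(rankings: &[Vec<&str>], k0: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            let rank = (i + 1) as f32; // ranks are 1-based
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k0 + rank);
        }
    }
    let mut fused: Vec<(String, f32)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id for a deterministic order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Passing one, two, or three rankings goes through the same function, which matches the "any blend through the same path" behaviour described above.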

pub fn hybrid_search_snapshot(&self, collection: &str, dense_query: Option<&[f32]>, sparse_query: Option<&SparseVector>, text_query: Option<&str>, params: &SearchParams, rrf_k0: f32) -> Result<Vec<Match>>

Hybrid search over the collection’s current immutable snapshot (&self, so readers run concurrently). When a prior write deferred the rebuild, it serves the prior snapshot (snapshot-isolated, slightly stale — ADR-0062/0053); the caller schedules a rebuild via Database::needs_rebuild.

pub fn upsert_document(&mut self, collection: &str, doc_id: &str, vectors: &[Vec<f32>], payload: &Value) -> Result<()>

Insert or replace a multi-vector (late-interaction / ColBERT) document: its vectors are stored as a group of token rows and its payload once on the anchor token (ADR-0028). Re-upserting a document first removes the tokens a shorter version would leave behind, so the document is replaced cleanly.

§Errors

Errors if the collection is single-vector, the document has no vectors, a vector’s dimensionality is wrong, or the id contains the reserved separator.
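
The clean-replacement behaviour of upsert_document can be illustrated with a toy token store. This is a sketch under assumptions, not the crate's storage layout: TokenRows keys rows by a (document id, token position) pair rather than by an id with a reserved separator, the payload on the anchor token is omitted, and the function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical token store: rows keyed by (document id, token position).
type TokenRows = BTreeMap<(String, usize), Vec<f32>>;

/// Insert or replace a document's token rows.
fn upsert_document(rows: &mut TokenRows, doc_id: &str, vectors: &[Vec<f32>]) {
    // First drop the trailing tokens a shorter version would leave behind.
    rows.retain(|(id, pos), _| id.as_str() != doc_id || *pos < vectors.len());
    // Then write (or overwrite) one row per token of the new version.
    for (pos, vector) in vectors.iter().enumerate() {
        rows.insert((doc_id.to_string(), pos), vector.clone());
    }
}

/// Number of token rows currently stored for a document.
fn token_count(rows: &TokenRows, doc_id: &str) -> usize {
    rows.keys().filter(|(id, _)| id.as_str() == doc_id).count()
}
```

Without the retain step, re-upserting a three-token document with a single token would leave two orphaned rows that still match searches.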

pub fn search_multi_vector(&mut self, collection: &str, query_tokens: &[Vec<f32>], params: &SearchParams) -> Result<Vec<DocumentMatch>>

Search a multi-vector collection by a set of query token vectors, ranking documents by MaxSim late interaction (ADR-0028). At or below the exact-scan threshold every document is scored exactly; above it, candidates are generated by nearest-neighbour search over the token pool (recall tuned by ef_search) and re-ranked exactly. An optional filter is applied to each document’s payload, exactly. A document has no single vector, so with_payload returns the anchor payload and with_vector returns the token vectors.
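
MaxSim scoring on the exact-scan path can be sketched as follows: for each query token take the best similarity against any document token, then sum those maxima. This is a minimal illustration, not the crate's scorer. It assumes dot-product similarity (the collection's configured metric may differ), and max_sim and rank_documents are hypothetical names.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim late interaction: sum over query tokens of the best match among
/// the document's tokens. Returns None for a document with no tokens.
fn max_sim(query_tokens: &[Vec<f32>], doc_tokens: &[Vec<f32>]) -> Option<f32> {
    if doc_tokens.is_empty() {
        return None;
    }
    let score = query_tokens
        .iter()
        .map(|q| {
            doc_tokens
                .iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum();
    Some(score)
}

/// Exact scan: score every document and return (id, score), best first.
fn rank_documents(
    query_tokens: &[Vec<f32>],
    docs: &[(&str, Vec<Vec<f32>>)],
) -> Vec<(String, f32)> {
    let mut ranked: Vec<(String, f32)> = docs
        .iter()
        .filter_map(|(id, tokens)| max_sim(query_tokens, tokens).map(|s| (id.to_string(), s)))
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

Above the exact-scan threshold the same scoring would be applied only to the candidate documents produced by the nearest-neighbour pass, as the re-ranking step.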

pub fn search_multi_vector_snapshot(&self, collection: &str, query_tokens: &[Vec<f32>], params: &SearchParams) -> Result<Vec<DocumentMatch>>

Multi-vector (late-interaction) search over the collection’s current immutable snapshot (&self, so readers run concurrently). A small corpus is scored exactly; a large corpus draws candidates from the ANN index, serving the prior snapshot when a write deferred its rebuild (snapshot-isolated, slightly stale — ADR-0062/0053); the caller schedules the rebuild via Database::needs_rebuild.

pub fn get_document(&self, collection: &str, doc_id: &str, with_vectors: bool) -> Result<Option<DocumentMatch>>

Fetch a multi-vector document by id: its anchor payload and, if with_vectors, its token vectors. None if the document does not exist.

pub fn delete_document(&mut self, collection: &str, doc_id: &str) -> Result<bool>

Delete a multi-vector document and all of its token rows. Returns whether it existed.

pub fn document_count(&self, collection: &str) -> Result<usize>

The number of documents in a multi-vector collection. Errors if the collection is single-vector.

pub fn checkpoint(&mut self) -> Result<()>

Flush a durable checkpoint of all collections, capturing a durable snapshot of each built, up-to-date IVF index (ADR-0025) so it reloads on open instead of rebuilding. Other index kinds, and a stale or unbuilt IVF, are rebuilt on open.

pub fn compact(&mut self) -> Result<()>

Compact every collection with reclaimable space, merging its sealed segments and dropping deleted/shadowed rows. Crash-safe; a no-op for collections with nothing to reclaim.
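
What "merging sealed segments and dropping deleted/shadowed rows" means can be shown on a toy model. This is a sketch under assumptions, not the on-disk format: segments are plain vectors ordered oldest to newest, a None value marks a deletion, and the function name compact_segments is hypothetical.

```rust
use std::collections::BTreeMap;

/// Merge segments (oldest first) into one list of live rows.
/// A later entry shadows an earlier one with the same id, and a `None`
/// value is a tombstone that removes the id from the output.
fn compact_segments(segments: &[Vec<(&str, Option<u32>)>]) -> Vec<(String, u32)> {
    let mut latest: BTreeMap<&str, Option<u32>> = BTreeMap::new();
    for segment in segments {
        for (id, value) in segment {
            latest.insert(*id, *value); // newer segments overwrite older ones
        }
    }
    latest
        .into_iter()
        .filter_map(|(id, value)| value.map(|v| (id.to_string(), v)))
        .collect()
}
```

A collection whose segments contain no shadowed or deleted rows comes out of this merge with the same rows it went in with, which is why compaction can be skipped for it.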

pub fn manifest_version(&self) -> u64

The manifest version — the catalog generation a snapshot captures (ADR-0050). Surfaced as snapshot-relevant status in database_stats.

pub fn disk_usage_bytes(&self) -> u64

Best-effort total on-disk size of the data directory, in bytes — what a full snapshot would copy (ADR-0050). Unreadable entries are skipped.

pub fn snapshot(&mut self, dest: &Path) -> Result<SnapshotInfo>

Take a consistent online snapshot of the whole database into dest (which must not already exist), returning what was captured (ADR-0050).

The writer lock is held for the duration: checkpoint seals the active buffer into segments and advances the WAL floor to the head, then the data directory is byte-copied. Opening dest afterwards replays an empty WAL tail and reconstructs the database exactly as of this call.

§Errors

Returns Error::Core if dest already exists, or on any I/O error during the checkpoint or the copy.
