pub struct Database { /* private fields */ }
An in-process Quiver database over one data directory.
Implementations
impl Database
pub fn open(dir: &Path) -> Result<Self>
Open (creating if absent) the database at dir with encryption-at-rest
disabled, rebuilding each collection’s index from the store.
pub fn open_with_codec(dir: &Path, codec: Box<dyn PageCodec>) -> Result<Self>
Open the database with a specific page codec — used to enable
encryption-at-rest by passing quiver-crypto’s AEAD codec. Mirrors
quiver_core::Store::open_with_codec; the codec seals both paged files
and the WAL, so no plaintext user data reaches the disk.
pub fn open_with_keyring(dir: &Path, keyring: Box<dyn KeyRing>) -> Result<Self>
Open the database with a KeyRing, the seam that lets quiver-crypto’s
envelope key-ring seal each collection under its own data-encryption key
(enabling crypto-shredding). Mirrors
quiver_core::Store::open_with_keyring.
pub fn create_collection(
    &mut self,
    name: &str,
    descriptor: Descriptor,
) -> Result<()>
Create a collection. Errors if the name already exists, or if the index specification is unsupported for the metric.
pub fn drop_collection(&mut self, name: &str) -> Result<bool>
Drop a collection and its data. Returns whether it existed.
pub fn shred_collection(&mut self, name: &str) -> Result<bool>
Crypto-shred a collection: drop it and destroy its data-encryption key, so
its sealed data is unrecoverable even with the master key, then reclaim
its files. Mirrors quiver_core::Store::shred_collection; with an
envelope key-ring this is irreversible erasure, with a single-codec
key-ring it is drop plus a checkpoint. Returns whether it existed.
pub fn set_commit_observer(&mut self, observer: CommitObserver)
Install a replication commit observer, invoked with each committed
WalEntry in commit order (ADR-0030). The server uses this to drive a
leader’s replication stream.
pub fn replication_snapshot(&self) -> Result<Vec<WalOp>>
The operations that recreate the current logical state, for a replication follower to bootstrap from (ADR-0030).
Errors
Propagates a store read error.
pub fn apply_replicated(&mut self, op: WalOp) -> Result<()>
Apply a replicated operation from a leader (ADR-0030). The operation is persisted and applied to the store, preserving the leader’s collection id. The in-memory index handles are then reconciled: a new collection is registered, a removed one is dropped, and a touched collection’s index is marked stale so the next read rebuilds from the replicated state.
Errors
Propagates a store apply error.
pub fn collection_names(&self) -> Vec<String>
Names of all collections, sorted.
pub fn descriptor(&self, name: &str) -> Option<&Descriptor>
The descriptor of a collection, if it exists.
pub fn upsert(
    &mut self,
    collection: &str,
    id: &str,
    vector: &[f32],
    payload: &Value,
) -> Result<()>
Insert or replace a point with a JSON payload.
pub fn upsert_batch(
    &mut self,
    collection: &str,
    points: &[(&str, &[f32], &Value)],
) -> Result<u64>
Upsert a batch of points with a single WAL fdatasync (ADR-0038).
points is a slice of (id, vector, payload) tuples. The batch is committed
atomically: from the client’s perspective, either all points are written
or none are. This is the preferred path for the REST
POST /v1/collections/{c}/points handler, which already delivers a batch
per HTTP request.
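The single-sync idea can be sketched with the standard library alone. The block below is only an illustration, not Quiver’s WAL: the frame layout (a record count followed by length-prefixed records) and the names encode_batch and append_batch are assumptions made for this example. It shows the batch being written in one call and made durable with one File::sync_data, which maps to fdatasync on Linux.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Encode a batch as one frame: a little-endian u32 record count, then each
/// record as a little-endian u32 length followed by its bytes.
/// (Hypothetical layout, chosen only for this sketch.)
pub fn encode_batch(records: &[&[u8]]) -> Vec<u8> {
    let mut frame = Vec::new();
    frame.extend_from_slice(&(records.len() as u32).to_le_bytes());
    for rec in records {
        frame.extend_from_slice(&(rec.len() as u32).to_le_bytes());
        frame.extend_from_slice(rec);
    }
    frame
}

/// Append the whole batch with one write and one data sync, instead of
/// syncing after every record.
pub fn append_batch(log: &Path, records: &[&[u8]]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(log)?;
    file.write_all(&encode_batch(records))?;
    // One durability barrier for the entire batch.
    file.sync_data()
}
```

A real log would also need a checksum or commit marker so that a torn frame is detected and discarded on replay; that is what would make the batch all-or-nothing after a crash, and it is left out here for brevity.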
pub fn upsert_bulk(
    &mut self,
    collection: &str,
    points: &[(&str, &[f32], &Value)],
) -> Result<u64>
Upsert a large batch for a bulk load, deferring all index work to a single rebuild pass (ADR-0045).
Like upsert_batch, the points are committed with one
WAL fdatasync, but instead of folding each point into the in-memory index
one at a time, the collection’s index is marked stale so the next search
rebuilds it in a single pass over the whole collection — far cheaper for a
fresh load (one k-means for IVF, one graph build for Vamana, one inverted-index
scan) than N incremental inserts. Prefer upsert_batch for steady-state
writes where query-after-write latency matters.
pub fn delete(&mut self, collection: &str, id: &str) -> Result<bool>
Delete a point by id. Returns whether it existed.
pub fn get(&self, collection: &str, id: &str) -> Result<Option<Match>>
Fetch a single point by id, with its payload and vector.
pub fn fetch(
    &self,
    collection: &str,
    filter: Option<&Filter>,
    limit: usize,
    with_payload: bool,
    with_vector: bool,
) -> Result<Vec<Match>>
Fetch points without ranking — an optional cleartext payload filter
narrows the set and limit bounds it. This is the retrieval path for a
client-side-encrypted collection (ADR-0032): the server returns the entitled
set (each point’s payload carries the sealed vector blob under the reserved
__quiver_vec__ key) and the client decrypts and ranks locally. It also
serves as a general “list points” primitive for any single-vector collection.
Results come in the store’s scan order, not by relevance. The filter is re-checked exactly against each candidate; today this is a full scan, though a selective filter could use the secondary index in future.
Errors
Errors if the collection does not exist or is multi-vector.
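The unranked "filter, then bound by limit" behaviour described for fetch can be modelled in a few lines. This is a rough sketch and not Quiver’s code: Point, its tag field, and fetch_scan are hypothetical stand-ins, and a plain closure takes the place of the Filter type.

```rust
/// A stored point, reduced to an id and one payload field for brevity.
/// (Hypothetical type for this sketch.)
#[derive(Clone, Debug, PartialEq)]
pub struct Point {
    pub id: String,
    pub tag: String,
}

/// Walk the points in scan order, keep those that pass the optional
/// predicate, and stop once `limit` points have been collected.
pub fn fetch_scan(
    points: &[Point],
    filter: Option<&dyn Fn(&Point) -> bool>,
    limit: usize,
) -> Vec<Point> {
    points
        .iter()
        // No filter means every point is a candidate.
        .filter(|p| filter.map_or(true, |f| f(*p)))
        .take(limit)
        .cloned()
        .collect()
}
```

Because the iterator is lazy, take(limit) stops the scan as soon as enough matches are found, so a small limit keeps the cost low even without a secondary index.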
pub fn ensure_indexed(&mut self, collection: &str) -> Result<()>
Rebuild a collection’s index if a prior write deferred it (the stale
flag), making the collection’s read snapshot current. Idempotent and cheap
when already fresh. Separating this &mut self maintenance from the &self
*_snapshot reads is what lets a server serve concurrent reads behind a
shared lock and take the exclusive lock only for the rare rebuild (ADR-0057).
pub fn set_mvcc_reads(&mut self, on: bool)
Enable or disable lock-free MVCC reads (ADR-0064) at runtime — the server
sets this from QUIVER_MVCC_READS. Default off: the proven RwLock read path
stays the default until MVCC is loom- and benchmark-validated (increments
2–3). Applies to every loaded collection; a collection’s first rebuild after
enabling publishes its base snapshot.
pub fn mvcc_reads(&self) -> bool
Whether lock-free MVCC reads are enabled (ADR-0064).
pub fn collection_snapshot(&self, collection: &str) -> Result<SnapshotCell>
The lock-free serving snapshot cell for a collection (ADR-0064), for a
reader to load() without taking any lock — the basis of reads that proceed
during writes. Only republished for an MVCC-served collection (single-vector,
server-searchable, in-memory index, with MVCC enabled); otherwise it stays
the initial empty snapshot and callers use the locked Database::search
path. The cell outlives &self, so a reader can hold and re-load it.
Errors
Returns Error::CollectionNotFound if the collection is not loaded.
pub fn mvcc_cell(&self, collection: &str) -> Result<Option<SnapshotCell>>
The lock-free serving snapshot cell only for a collection that is
currently MVCC-served (the flag is on and the collection is single-vector,
server-searchable, and in-memory); otherwise None. A server caches the
returned cell once and load()s it to serve pure-vector reads with no lock
(ADR-0064 increment 3) — the cell self-updates as the writer republishes, so
it never needs re-fetching under the lock.
Errors
Returns Error::CollectionNotFound if the collection is not loaded.
pub fn needs_rebuild(&self, collection: &str) -> Result<bool>
Whether a collection’s index is stale — a prior write deferred its rebuild.
The server reads this to schedule an off-lock rebuild (ADR-0062) without
holding the exclusive lock; embedded callers never need it (the &mut self
searches rebuild synchronously via Database::ensure_indexed).
pub fn snapshot_rebuild_inputs(
    &self,
    collection: &str,
) -> Result<Option<RebuildInputs>>
Capture everything an off-lock rebuild needs (ADR-0062): under the shared
read lock the caller already holds, scan the collection’s live rows and
record the write generation. Returns None when the index is already fresh
(nothing to rebuild). The expensive build then runs with no lock via
RebuildInputs::build, and Database::commit_rebuild installs it.
pub fn commit_rebuild(&mut self, rebuilt: RebuiltIndex) -> Result<bool>
Install an index built off-lock (ADR-0062) under the brief exclusive lock the
caller holds. Returns whether the collection is still stale: if a write
landed during the build (the write generation advanced), the fresh index is
already behind, so it is installed (still newer than the prior snapshot) but
the handle stays stale for the next rebuild. A collection dropped or replaced
during the build is ignored (Ok(false)) — the build is discarded.
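The capture, build, and commit sequence with its write-generation check can be modelled with a toy collection handle. Everything in this block is a hypothetical stand-in (Handle, Inputs, Rebuilt, rebuild_off_lock), and the "index" is just a sorted copy of the rows; it only illustrates how comparing generations decides whether the handle stays stale, and it leaves out the dropped-or-replaced-collection case.

```rust
use std::sync::RwLock;

/// Toy collection handle: live rows, a write generation, and an "index"
/// (here a sorted copy of the rows). Hypothetical, for illustration only.
#[derive(Default)]
pub struct Handle {
    pub rows: Vec<u32>,
    pub generation: u64,
    pub index: Vec<u32>,
    pub stale: bool,
}

/// What is captured under the shared lock.
pub struct Inputs {
    pub rows: Vec<u32>,
    pub generation: u64,
}

/// What the off-lock build produces.
pub struct Rebuilt {
    pub index: Vec<u32>,
    pub generation: u64,
}

impl Handle {
    /// A write bumps the generation and defers the index rebuild.
    pub fn push_row(&mut self, row: u32) {
        self.rows.push(row);
        self.generation += 1;
        self.stale = true;
    }

    /// Capture rows and generation; `None` when the index is already fresh.
    pub fn snapshot_inputs(&self) -> Option<Inputs> {
        if !self.stale {
            return None;
        }
        Some(Inputs { rows: self.rows.clone(), generation: self.generation })
    }

    /// Install the built index and report whether the handle is still stale,
    /// which is the case when a write landed after the inputs were captured.
    pub fn commit(&mut self, rebuilt: Rebuilt) -> bool {
        self.index = rebuilt.index;
        self.stale = rebuilt.generation != self.generation;
        self.stale
    }
}

impl Inputs {
    /// The expensive part; needs no lock because it owns its inputs.
    pub fn build(self) -> Rebuilt {
        let mut index = self.rows;
        index.sort_unstable();
        Rebuilt { index, generation: self.generation }
    }
}

/// Shared lock to capture, no lock to build, brief exclusive lock to commit.
pub fn rebuild_off_lock(handle: &RwLock<Handle>) -> bool {
    let inputs = {
        let guard = handle.read().unwrap();
        guard.snapshot_inputs()
    };
    let Some(inputs) = inputs else { return false };
    let rebuilt = inputs.build();
    let still_stale = handle.write().unwrap().commit(rebuilt);
    still_stale
}
```

A caller that gets true back would simply schedule another pass; the index installed in the meantime is still newer than the one it replaced.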
pub fn search(
    &mut self,
    collection: &str,
    query: &[f32],
    params: &SearchParams,
) -> Result<Vec<Match>>
Search a collection for the nearest points to query, optionally
post-filtered by payload predicate.
pub fn search_snapshot(
    &self,
    collection: &str,
    query: &[f32],
    params: &SearchParams,
) -> Result<Vec<Match>>
Search a collection’s current immutable snapshot for the nearest points
to query, optionally post-filtered by payload predicate. Takes &self, so
many readers run concurrently. When a prior write deferred this collection’s
rebuild, it serves the prior snapshot (a snapshot-isolated, slightly stale
read — ADR-0062/0053); the caller schedules a rebuild via
Database::needs_rebuild.
pub fn hybrid_search(
    &mut self,
    collection: &str,
    dense_query: Option<&[f32]>,
    sparse_query: Option<&SparseVector>,
    text_query: Option<&str>,
    params: &SearchParams,
    rrf_k0: f32,
) -> Result<Vec<Match>>
Hybrid search (ADR-0043/0046): fuse up to three rankings with Reciprocal
Rank Fusion — a dense ANN ranking, a sparse inverted-index dot-product
ranking (sparse_query), and a BM25 full-text ranking (text_query, scored
over the same inverted index). Any may be None; at least one is required,
giving pure dense / sparse / lexical or any blend through the same path. The
same payload filter is re-checked on every side, so results stay exact.
rrf_k0 is the RRF rank-bias constant (DEFAULT_RRF_K0).
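Reciprocal Rank Fusion itself is short enough to sketch. The function below is a generic version of the technique, not Quiver’s implementation: rrf_fuse is a hypothetical name, ranks are assumed to start at 1, and ties are broken by id only to keep the output deterministic. Each id at rank r in a ranking contributes 1 / (k0 + r) to its fused score.

```rust
use std::collections::HashMap;

/// Fuse several best-first rankings with Reciprocal Rank Fusion.
/// An id at 1-based rank `r` in a ranking adds `1 / (k0 + r)` to its score;
/// ids missing from a ranking simply get no contribution from it.
pub fn rrf_fuse(rankings: &[Vec<&str>], k0: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<&str, f32> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            let rank = (i + 1) as f32;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k0 + rank);
        }
    }
    let mut fused: Vec<(String, f32)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest fused score first; equal scores are ordered by id.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Passing one, two, or three rankings to the same function mirrors how any of the dense, sparse, and text sides may be absent. A larger k0 flattens the difference between adjacent ranks, so agreement across rankings matters more than a single top position.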
pub fn hybrid_search_snapshot(
    &self,
    collection: &str,
    dense_query: Option<&[f32]>,
    sparse_query: Option<&SparseVector>,
    text_query: Option<&str>,
    params: &SearchParams,
    rrf_k0: f32,
) -> Result<Vec<Match>>
Hybrid search over the collection’s current immutable snapshot (&self, so
readers run concurrently). When a prior write deferred the rebuild, it serves
the prior snapshot (snapshot-isolated, slightly stale — ADR-0062/0053);
the caller schedules a rebuild via Database::needs_rebuild.
pub fn upsert_document(
    &mut self,
    collection: &str,
    doc_id: &str,
    vectors: &[Vec<f32>],
    payload: &Value,
) -> Result<()>
Insert or replace a multi-vector (late-interaction / ColBERT) document: its
vectors are stored as a group of token rows and its payload once on the
anchor token (ADR-0028). Re-upserting a document first removes the tokens a
shorter version would leave behind, so the document is replaced cleanly.
Errors
Errors if the collection is single-vector, the document has no vectors, a vector’s dimensionality is wrong, or the id contains the reserved separator.
pub fn search_multi_vector(
    &mut self,
    collection: &str,
    query_tokens: &[Vec<f32>],
    params: &SearchParams,
) -> Result<Vec<DocumentMatch>>
Search a multi-vector collection by a set of query token vectors, ranking
documents by MaxSim late interaction (ADR-0028). At or below the exact-scan
threshold every document is scored exactly; above it, candidates are
generated by nearest-neighbour search over the token pool (recall tuned by
ef_search) and re-ranked exactly. An optional filter is applied to each
document’s payload, exactly. A document has no single vector, so with_payload
returns the anchor payload and with_vector returns the token vectors.
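The exact-scan branch amounts to computing a MaxSim score per document and sorting. The sketch below assumes dot-product similarity and non-empty documents; max_sim and rank_exact are hypothetical names, and the candidate-generation path used above the threshold is not modelled.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim late-interaction score: for each query token, take its best
/// similarity against any document token, then sum over the query tokens.
/// Assumes the document has at least one token.
pub fn max_sim(query_tokens: &[Vec<f32>], doc_tokens: &[Vec<f32>]) -> f32 {
    query_tokens
        .iter()
        .map(|q| {
            doc_tokens
                .iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

/// Score every document exactly and return the top `limit`, best first.
pub fn rank_exact<'a>(
    query_tokens: &[Vec<f32>],
    docs: &[(&'a str, Vec<Vec<f32>>)],
    limit: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = docs
        .iter()
        .map(|(id, tokens)| (*id, max_sim(query_tokens, tokens)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

The same max_sim function would serve as the exact re-ranking step once an approximate search has narrowed the candidate documents.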
pub fn search_multi_vector_snapshot(
    &self,
    collection: &str,
    query_tokens: &[Vec<f32>],
    params: &SearchParams,
) -> Result<Vec<DocumentMatch>>
Multi-vector (late-interaction) search over the collection’s current
immutable snapshot (&self, so readers run concurrently). A small corpus is
scored exactly; a large corpus draws candidates from the ANN index, serving
the prior snapshot when a write deferred its rebuild (snapshot-isolated,
slightly stale — ADR-0062/0053); the caller schedules the rebuild via
Database::needs_rebuild.
pub fn get_document(
    &self,
    collection: &str,
    doc_id: &str,
    with_vectors: bool,
) -> Result<Option<DocumentMatch>>
Fetch a multi-vector document by id: its anchor payload and, if
with_vectors, its token vectors. None if the document does not exist.
pub fn delete_document(
    &mut self,
    collection: &str,
    doc_id: &str,
) -> Result<bool>
Delete a multi-vector document and all of its token rows. Returns whether it existed.
pub fn document_count(&self, collection: &str) -> Result<usize>
The number of documents in a multi-vector collection. Errors if the collection is single-vector.
pub fn checkpoint(&mut self) -> Result<()>
Flush a checkpoint of all collections, capturing a durable snapshot of each built, up-to-date IVF index (ADR-0025) so it reloads on open instead of rebuilding. Other index kinds, and a stale or unbuilt IVF index, are rebuilt on open.
pub fn compact(&mut self) -> Result<()>
Compact every collection with reclaimable space, merging its sealed segments and dropping deleted/shadowed rows. Crash-safe; a no-op for collections with nothing to reclaim.
pub fn manifest_version(&self) -> u64
The manifest version — the catalog generation a snapshot captures
(ADR-0050). Surfaced as snapshot-relevant status in database_stats.
pub fn disk_usage_bytes(&self) -> u64
Best-effort total on-disk size of the data directory, in bytes — what a full snapshot would copy (ADR-0050). Unreadable entries are skipped.
pub fn snapshot(&mut self, dest: &Path) -> Result<SnapshotInfo>
Take a consistent online snapshot of the whole database into dest
(which must not already exist), returning what was captured (ADR-0050).
The writer lock is held for the duration: checkpoint seals the active
buffer into segments and advances the WAL floor to the head, then the
data directory is byte-copied. Opening dest afterwards replays an empty
WAL tail and reconstructs the database exactly as of this call.
Errors
Error::Core if dest already exists, or on any I/O error during the
checkpoint or the copy.