The embeddable, in-process Quiver database handle.
Database composes the storage engine (quiver_core::Store) with a
per-collection vector index and payload filtering (quiver_query::Filter)
into one handle. It exposes the same logical operations the server speaks
(docs/api/wire-protocol.md), so library mode and server mode exercise
identical engine semantics — the server is a thin transport/policy shell.
§Index lifecycle
The store is the source of truth. Each collection chooses its index via the
descriptor’s IndexSpec (default in-memory HNSW); the index is built from
the store on open. HNSW applies new-id inserts incrementally; once an IVF
index is built it applies inserts, in-place updates, and deletes
incrementally with LIRE rebalancing (ADR-0023). The Vamana / disk graph
family is maintained the FreshDiskANN way (ADR-0033): the batch-built graph
is a read-only base, recent inserts land in an in-memory delta graph, and
deletes are tombstoned, so write cost is independent of the base size; when the pending work
grows past a fixed fraction of the base, the next access consolidates by
rebuilding from the store. All indexes stay derived (rebuilt from the store
on open), so the crash gate never sees an index write.
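The base-plus-delta maintenance described above can be illustrated with a minimal sketch. It is a toy model, not Quiver's implementation: `DeltaIndex`, its fields, and the `max_pending_fraction` threshold are hypothetical names, and the "index" only tracks ids rather than a graph. It assumes the caller writes to the store first and that consolidation simply rebuilds from the store.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model of a read-only base plus in-memory delta and tombstones.
struct DeltaIndex {
    /// Ids covered by the batch-built, read-only base.
    base: BTreeSet<u64>,
    /// Ids inserted (or updated) since the base was built.
    delta: BTreeSet<u64>,
    /// Base ids whose base copy is dead (deleted or shadowed by the delta).
    tombstones: BTreeSet<u64>,
    /// Assumed threshold: consolidate once pending work exceeds this fraction of the base.
    max_pending_fraction: f64,
}

impl DeltaIndex {
    /// Build the base from the store, which is the source of truth.
    fn build(store: &BTreeMap<u64, Vec<f32>>, max_pending_fraction: f64) -> Self {
        DeltaIndex {
            base: store.keys().copied().collect(),
            delta: BTreeSet::new(),
            tombstones: BTreeSet::new(),
            max_pending_fraction,
        }
    }

    /// Insert or update: the new copy lands in the delta; a stale base copy is tombstoned.
    fn insert(&mut self, id: u64) {
        if self.base.contains(&id) {
            self.tombstones.insert(id);
        }
        self.delta.insert(id);
    }

    /// Delete: drop any delta copy and tombstone the base copy. The base is never rewritten.
    fn delete(&mut self, id: u64) {
        self.delta.remove(&id);
        if self.base.contains(&id) {
            self.tombstones.insert(id);
        }
    }

    fn pending(&self) -> usize {
        self.delta.len() + self.tombstones.len()
    }

    fn needs_consolidation(&self) -> bool {
        self.pending() as f64 > self.max_pending_fraction * self.base.len() as f64
    }

    /// Ids a search would consider: live base entries plus the delta.
    fn live_ids(&self) -> Vec<u64> {
        let mut ids: BTreeSet<u64> = self.base.difference(&self.tombstones).copied().collect();
        ids.extend(self.delta.iter().copied());
        ids.into_iter().collect()
    }

    /// Consolidate by rebuilding from the store.
    fn consolidate(&mut self, store: &BTreeMap<u64, Vec<f32>>) {
        *self = DeltaIndex::build(store, self.max_pending_fraction);
    }
}
```

In this sketch each write touches only small in-memory sets, and the set of live ids is the same before and after consolidation because the rebuild reads the store, never the old index.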
§Filtered (hybrid) search
A search may carry a quiver_query::Filter over the payload. The planner
decomposes it into the predicates the collection’s secondary indexes can
answer; when those narrow the query to a small candidate set it scans that
set exactly (perfect recall, no filtered-ANN cliff), and otherwise it
over-fetches from the ANN index and post-filters. Both arms re-check the full
filter, so results are exact regardless of which path runs.
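The two planner arms can be sketched as follows. This is an illustration under stated assumptions, not the engine's planner: `Point`, `Path`, `filtered_search`, `small_set_limit`, and `over_fetch` are hypothetical, the filter is a single equality predicate on one payload field, and a brute-force nearest-neighbour scan stands in for the ANN index.

```rust
use std::collections::HashMap;

/// Hypothetical point with one filterable payload field.
struct Point {
    id: u64,
    vector: Vec<f32>,
    category: String,
}

/// Which arm of the plan ran (returned only so the sketch is observable).
#[derive(Debug, PartialEq)]
enum Path {
    ExactScan,
    OverFetch,
}

fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Rank `candidates` by distance to `query` and keep the closest `n`.
/// Stand-in for the ANN index: a real index would be approximate.
fn nearest(points: &[Point], candidates: impl Iterator<Item = usize>, query: &[f32], n: usize) -> Vec<usize> {
    let mut scored: Vec<(f32, usize)> = candidates
        .map(|i| (squared_l2(&points[i].vector, query), i))
        .collect();
    scored.sort_by(|a, b| a.0.total_cmp(&b.0));
    scored.into_iter().take(n).map(|(_, i)| i).collect()
}

/// Stand-in for a secondary index: payload value -> positions of matching points.
fn build_secondary(points: &[Point]) -> HashMap<String, Vec<usize>> {
    let mut index: HashMap<String, Vec<usize>> = HashMap::new();
    for (i, p) in points.iter().enumerate() {
        index.entry(p.category.clone()).or_default().push(i);
    }
    index
}

fn filtered_search(
    points: &[Point],
    secondary: &HashMap<String, Vec<usize>>,
    query: &[f32],
    category: &str,
    k: usize,
    small_set_limit: usize,
    over_fetch: usize,
) -> (Path, Vec<u64>) {
    let candidates: &[usize] = secondary.get(category).map(|v| v.as_slice()).unwrap_or(&[]);
    let (path, ranked) = if candidates.len() <= small_set_limit {
        // Small candidate set: score every candidate exactly.
        (Path::ExactScan, nearest(points, candidates.iter().copied(), query, candidates.len()))
    } else {
        // Otherwise over-fetch from the (stand-in) ANN index and post-filter.
        (Path::OverFetch, nearest(points, 0..points.len(), query, k * over_fetch))
    };
    // Both arms re-check the full filter before returning.
    let ids: Vec<u64> = ranked
        .into_iter()
        .filter(|&i| points[i].category == category)
        .take(k)
        .map(|i| points[i].id)
        .collect();
    (path, ids)
}
```

Because the filter is re-checked after either arm, every returned id satisfies it regardless of which path the plan took.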
§Concurrency (ADR-0057 / ADR-0062)
Single-writer. Writes take &mut self. Reads come in two flavors: the
&mut self convenience methods (search, hybrid_search,
search_multi_vector) rebuild a stale index in place and so give embedded,
single-threaded callers read-your-writes; the &self *_snapshot methods
read the current immutable snapshot and run concurrently, serving the
prior snapshot when a write deferred a rebuild (snapshot-isolated, slightly
stale). A server therefore serves concurrent reads behind a reader–writer lock,
and rebuilds off the exclusive lock (ADR-0062): it captures the rebuild
inputs under the shared lock (Database::snapshot_rebuild_inputs), builds the
new index with no lock held (RebuildInputs::build), and installs it under a
brief write lock (Database::commit_rebuild) — so a rebuild never stalls
concurrent readers.
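The capture / build / install sequence can be sketched with only the standard library. The types and functions below (`Db`, `Inputs`, `Rebuilt`, `capture`, `build_index`, `install`) are hypothetical stand-ins, not the signatures of Database::snapshot_rebuild_inputs, RebuildInputs::build, or Database::commit_rebuild. In particular, discarding a rebuild whose generation is stale is an assumption of this sketch; the text above does not say how a generation mismatch is handled.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical database state guarded by a reader-writer lock.
struct Db {
    rows: Vec<u64>,
    /// Bumped by every write.
    generation: u64,
    /// The installed index; readers clone the Arc and search without holding the lock.
    index: Arc<Vec<u64>>,
}

/// Owned copy of everything the rebuild needs.
struct Inputs {
    rows: Vec<u64>,
    generation: u64,
}

/// A finished index waiting to be installed.
struct Rebuilt {
    index: Vec<u64>,
    generation: u64,
}

/// A write: takes the exclusive lock and bumps the generation.
fn append(db: &RwLock<Db>, row: u64) {
    let mut guard = db.write().unwrap();
    guard.rows.push(row);
    guard.generation += 1;
}

/// Step 1: capture inputs under the shared lock.
fn capture(db: &RwLock<Db>) -> Inputs {
    let guard = db.read().unwrap();
    Inputs { rows: guard.rows.clone(), generation: guard.generation }
}

/// Step 2: build with no lock held (sorting stands in for index construction).
fn build_index(inputs: Inputs) -> Rebuilt {
    let mut index = inputs.rows;
    index.sort_unstable();
    Rebuilt { index, generation: inputs.generation }
}

/// Step 3: install under a brief write lock.
/// Assumption: a rebuild captured before a later write is discarded.
fn install(db: &RwLock<Db>, rebuilt: Rebuilt) -> bool {
    let mut guard = db.write().unwrap();
    if guard.generation != rebuilt.generation {
        return false;
    }
    guard.index = Arc::new(rebuilt.index);
    true
}

/// A reader: holds the shared lock only long enough to clone the Arc.
fn read_index(db: &RwLock<Db>) -> Arc<Vec<u64>> {
    let guard = db.read().unwrap();
    Arc::clone(&guard.index)
}
```

The expensive step runs between the two lock acquisitions, so readers are blocked only for the pointer swap inside `install`.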
Structs§
- CollectionId: A collection identifier, assigned monotonically by the catalog and stable for the life of the collection.
- CollectionSnapshot: An immutable, lock-free-readable view of a single-vector collection (ADR-0064): the base index as of the last rebuild, the base id map, and the overlay of writes since. Obtained via Database::collection_snapshot and read with CollectionSnapshot::search; a read is snapshot-isolated, meaning it sees one consistent (base, overlay) pair, and a write that lands mid-read is simply the next snapshot.
- Database: An in-process Quiver database over one data directory.
- Descriptor: The immutable schema of a collection, fixed at creation.
- DocumentMatch: A multi-vector (late-interaction / ColBERT) document result: a document id, its MaxSim relevance, the payload, and, if requested, the document’s token vectors (ADR-0028).
- FilterableField: A payload field declared filterable at collection creation: its dot-path and type. Declared fields are extracted into the per-segment secondary index at flush time (ADR-0022), enabling pre-filtered (hybrid) search.
- IndexSpec: Which index a collection uses and how its vectors are compressed (ADR-0007, ADR-0008). Defaults to in-memory HNSW with no quantization (exact search).
- Match: A single search or fetch result.
- RebuildInputs: A captured, owned snapshot of everything an off-lock rebuild needs (ADR-0062): the scanned rows, the collection’s descriptor, and the write generation at capture time. Produced under the shared read lock by Database::snapshot_rebuild_inputs; RebuildInputs::build then constructs the new index with no lock held.
- RebuiltIndex: A new index built off-lock from a RebuildInputs, ready for Database::commit_rebuild to install under the brief write lock (ADR-0062).
- SearchParams: Parameters for a Database::search.
- SingleCodecKeyRing: A KeyRing that seals everything (catalog and every collection) with one shared codec.
- SnapshotInfo: What a Database::snapshot captured (ADR-0050): the catalog generation and the number of files / bytes copied.
- SparseInvertedIndex: An in-memory inverted index over sparse vectors (ADR-0045).
- SparseVector: A sparse vector: parallel indices and values. Indices are dimension ids into a (possibly very large) sparse vocabulary; values are their weights.
- WalEntry: A WAL record: a monotonic LSN paired with the operation it commits.
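To illustrate how a sparse vector with parallel indices and values can be served by an inverted index, here is a minimal sketch. `Sparse` and `Inverted` are hypothetical types, not the crate's SparseVector or SparseInvertedIndex, and scoring by dot product is an assumption of the sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical sparse vector: `indices[i]` is a dimension id, `values[i]` its weight.
struct Sparse {
    indices: Vec<u32>,
    values: Vec<f32>,
}

/// Hypothetical inverted index: dimension id -> postings of (point id, weight).
struct Inverted {
    postings: BTreeMap<u32, Vec<(u64, f32)>>,
}

impl Inverted {
    fn new() -> Self {
        Inverted { postings: BTreeMap::new() }
    }

    /// Add one posting per non-zero dimension of the vector.
    fn insert(&mut self, id: u64, v: &Sparse) {
        for (&dim, &weight) in v.indices.iter().zip(&v.values) {
            self.postings.entry(dim).or_default().push((id, weight));
        }
    }

    /// Score by dot product (an assumption), visiting only the query's dimensions.
    fn search(&self, query: &Sparse, k: usize) -> Vec<(u64, f32)> {
        let mut scores: BTreeMap<u64, f32> = BTreeMap::new();
        for (dim, q_weight) in query.indices.iter().zip(&query.values) {
            if let Some(list) = self.postings.get(dim) {
                for (id, weight) in list {
                    *scores.entry(*id).or_insert(0.0) += q_weight * weight;
                }
            }
        }
        let mut ranked: Vec<(u64, f32)> = scores.into_iter().collect();
        // Highest score first; ties broken by id so the order is deterministic.
        ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        ranked.truncate(k);
        ranked
    }
}
```

Only the postings for dimensions present in the query are read, which is what keeps the search cheap when the vocabulary is very large.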
Enums§
- DistanceMetric: The distance / similarity function a collection is searched with.
- Dtype: The element type of stored vectors. Phase 1 ships f32; lower-precision and quantized dtypes arrive with the memory-frugality work in Phase 2.
- Error: Errors returned by the embeddable database.
- FieldType: The type of a filterable payload field, which fixes how its values are keyed in the secondary index (.sec), and therefore which predicates it answers.
- Filter: A predicate over a point’s JSON payload.
- IndexKind: The index structure a collection is served by (ADR-0007). The default is the in-memory HNSW graph; the others are the Phase 2 memory-frugal options.
- VectorEncryption: How a collection’s vectors are encrypted (ADR-0031, ADR-0032). Encryption is always client-side; the server never holds the key. Defaults to VectorEncryption::None. The variants sit on Quiver’s encrypted-search spectrum, from fastest to most confidential.
- WalOp: A single logical mutation recorded in the WAL.
Constants§
- BM25_B: The conventional BM25 length-normalization parameter.
- BM25_K1: The conventional BM25 term-frequency saturation parameter (Robertson et al.).
- DEFAULT_RRF_K0: The conventional RRF rank-bias constant (Cormack et al., 2009).
- SPARSE_KEY: The reserved payload key carrying a point’s sparse vector (ADR-0043).
- TEXT_KEY: The reserved payload key carrying a point’s full-text field (ADR-0046). When a point has no explicit __quiver_sparse__ vector but carries a string under this key, the engine tokenizes it into a term-frequency sparse vector at ingest, so the point is searchable by BM25 over text alone.
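For readers unfamiliar with the two BM25 parameters, the sketch below shows where they enter the per-term score. Everything in it is an assumption for illustration: the values 1.2 and 0.75 are the commonly cited defaults and are not confirmed here as the values of BM25_K1 and BM25_B, the IDF variant is one of several in use, and the whitespace-and-punctuation tokenizer is a stand-in for whatever the engine applies at ingest.

```rust
use std::collections::BTreeMap;

/// Assumed values: commonly cited BM25 defaults, not confirmed as this crate's constants.
const K1: f32 = 1.2;
const B: f32 = 0.75;

/// Hypothetical tokenizer: lowercase, split on non-alphanumeric characters, count terms.
fn term_frequencies(text: &str) -> BTreeMap<String, u32> {
    let mut tf = BTreeMap::new();
    for token in text.split(|c: char| !c.is_alphanumeric()).filter(|t| !t.is_empty()) {
        *tf.entry(token.to_lowercase()).or_insert(0) += 1;
    }
    tf
}

/// One common IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
fn idf(n_docs: usize, doc_freq: usize) -> f32 {
    let (n, df) = (n_docs as f32, doc_freq as f32);
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 contribution of one term.
/// K1 controls how quickly repeated occurrences saturate;
/// B controls how strongly long documents are penalized.
fn bm25_term(tf: f32, doc_len: f32, avg_len: f32, idf: f32) -> f32 {
    idf * tf * (K1 + 1.0) / (tf + K1 * (1.0 - B + B * doc_len / avg_len))
}
```

With these definitions the score of a term is bounded above by `idf * (K1 + 1.0)` however often it repeats, and a document longer than average scores lower than an average-length one with the same term count.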
Traits§
- KeyRing: Supplies the page codecs the storage engine seals data with, and manages the per-collection key lifecycle that crypto-shredding relies on.
- PageCodec: Transforms Quiver’s durable bytes (fixed-size pages and variable-length records) to and from their on-disk representation.
Functions§
- query_term_ids: Tokenize text into the de-duplicated query term ids BM25 scores against (a repeated query term counts once). The query side of the BM25 path (ADR-0046).
- restore_snapshot: Restore a snapshot directory src (produced by Database::snapshot) into a fresh dest directory, leaving it ready for the caller to open with the same keyring/codec the snapshot was written under (ADR-0050).
- rrf_fuse: Fuse several ranked id lists by Reciprocal Rank Fusion and return the top top_k ids with their fused scores, highest first.
- text_to_sparse: Tokenize text into a term-frequency SparseVector: dimension ids are token ids (term_id) and values are within-text term counts. The ingest side of the BM25 path (ADR-0046).
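Reciprocal Rank Fusion, as used by rrf_fuse, can be sketched in a few lines. The function below is a hypothetical stand-in, not the crate's signature: `rrf_sketch`, its parameter types, and the tie-break by id are assumptions, and the rank-bias constant is passed in explicitly (60 is the value usually cited from Cormack et al.; the value of DEFAULT_RRF_K0 is not stated here).

```rust
use std::collections::HashMap;

/// Hypothetical RRF: each list contributes 1 / (k0 + rank) per id, with ranks starting at 1.
fn rrf_sketch(lists: &[Vec<u64>], k0: f32, top_k: usize) -> Vec<(u64, f32)> {
    let mut scores: HashMap<u64, f32> = HashMap::new();
    for list in lists {
        for (position, id) in list.iter().enumerate() {
            let rank = (position + 1) as f32;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k0 + rank);
        }
    }
    let mut fused: Vec<(u64, f32)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused.truncate(top_k);
    fused
}
```

Only ranks enter the formula, so lists whose raw scores live on different scales (for example a dense similarity and a BM25 score) can be fused without normalizing them first.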
Type Aliases§
- CommitObserver: A synchronous hook invoked with each committed WalEntry, in commit order. Leader-follower replication (ADR-0030) installs one to publish each op to its replication stream. A plain Fn keeps the engine runtime-agnostic: no async dependency leaks into quiver-core.
- Result: Result alias for database operations.
- SnapshotCell: Per-collection lock-free serving snapshot pointer: the single writer stores a new CollectionSnapshot; readers load one without a lock. (ArcSwap<T> stores an Arc<T> internally, so this is one Arc per load.)