Struct Collection

pub struct Collection<'tx, T: Document> { /* private fields */ }

Typed handle to a collection.

Construct via crate::WriteTxn::collection (lazy-create), crate::ReadTxn::collection (read-only; errors if the collection is absent), or crate::Db::collection for a one-shot read-only handle bound to a runtime collection name (M11 #94, Phase 1B).

All methods take &self because the underlying state lives behind mutexes on the parent transaction; the handle itself is stateless beyond the descriptor it caches.

Implementations

impl<'tx, T: Document> Collection<'tx, T>

pub fn descriptor(&self) -> &CollectionDescriptor

Cached descriptor (collection_id, primary_root, type_version, next_id at handle-open time).

pub fn insert(&self, doc: T) -> Result<Id>

Insert doc. Returns the freshly-allocated Id.

§Errors

Error::ReadOnly if the handle is read-only.
Pager / catalog / codec errors propagated.

pub fn get(&self, id: Id) -> Result<Option<T>>

Fetch the document at id.

On the write side this consults the pager (sees pending writes in the current txn). On the read side it consults the snapshot’s frozen view.

§Lazy migration (M10 #84)

If the on-disk record was written by an older Document::VERSION than the current T::VERSION, the codec walks the stored bytes through the schema registered for that version (see T::historical_schemas()) and dispatches the resulting structured Dynamic through T::migrate. The migrated bytes are NOT written back to disk. The next Collection::get re-reads the same v(n) bytes and re-runs migration. Only a subsequent Collection::update / Collection::upsert writes the document back, at which point the on-disk header records T::VERSION.

This contract is what allows mixed-version reads to scale: a 10⁹-doc collection does not need to be batch-rewritten on schema upgrade. Power-of-ten Rule 7: every “migration ran” path returns the migrated T; no implicit write-back.
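
The read-without-write-back contract can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyStore`, `Stored`, `CURRENT_VERSION`, and the upper-casing `migrate` function are hypothetical stand-ins, not the crate's types, and the real codec works on schema-described bytes rather than strings.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Stand-in for `T::VERSION`; the value 2 is arbitrary.
const CURRENT_VERSION: u32 = 2;

/// A stored record: a version header plus the payload as written at that version.
#[derive(Clone, Debug, PartialEq)]
struct Stored {
    version: u32,
    payload: String,
}

/// Hypothetical toy store; `migrations_run` counts executions of the migrate path.
struct ToyStore {
    records: HashMap<u64, Stored>,
    migrations_run: Cell<u32>,
}

impl ToyStore {
    fn new() -> Self {
        ToyStore { records: HashMap::new(), migrations_run: Cell::new(0) }
    }

    /// Hypothetical v1 -> v2 migration: v2 payloads are upper-cased.
    fn migrate(old: &Stored) -> String {
        old.payload.to_uppercase()
    }

    /// Read path: migrate in memory when the record is old; never write back.
    fn get(&self, id: u64) -> Option<String> {
        let stored = self.records.get(&id)?;
        if stored.version < CURRENT_VERSION {
            self.migrations_run.set(self.migrations_run.get() + 1);
            Some(Self::migrate(stored))
        } else {
            Some(stored.payload.clone())
        }
    }

    /// Write path: apply `f` to the (possibly migrated) value and store the
    /// result under CURRENT_VERSION. Returns false if `id` is absent.
    fn update(&mut self, id: u64, f: impl FnOnce(&mut String)) -> bool {
        let Some(mut value) = self.get(id) else {
            return false;
        };
        f(&mut value);
        self.records.insert(id, Stored { version: CURRENT_VERSION, payload: value });
        true
    }
}
```

In this model, two consecutive `get` calls on a version-1 record each run the migration and leave the stored version at 1; only `update` rewrites the record at the current version, after which `get` no longer migrates.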

§Errors

Pager / B-tree / codec errors propagated. In particular:

Error::SchemaNotRegistered if the stored record carries a type_version for which T::historical_schemas() has no entry.
Error::SchemaMigrationNotImplemented if the registered T::migrate returns the default error.

pub fn update<F>(&self, id: Id, f: F) -> Result<()>
where F: FnOnce(&mut T),

Apply f to the document at id, writing the mutated value back.

§Errors

Error::ReadOnly on a read-side handle.
Error::DocumentNotFound if id is absent.
Pager / catalog / codec errors propagated.

pub fn delete(&self, id: Id) -> Result<bool>

Delete the document at id. Returns true if it existed.

§Errors

Error::ReadOnly on a read-side handle.
Pager / catalog errors propagated.

pub fn upsert(&self, id: Id, doc: T) -> Result<()>

Insert-or-replace doc at id.

§Errors

Error::ReadOnly on a read-side handle.
Pager / catalog / codec errors propagated.
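
The return contracts of update, delete, and upsert differ in how they treat a missing id. The sketch below models them on a plain BTreeMap. `ToyCollection` and `ToyError` are hypothetical names, and checking the read-only flag before looking up the id is an assumption made for illustration, not something the documentation above specifies.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type mirroring the two documented variants.
#[derive(Debug, PartialEq)]
enum ToyError {
    ReadOnly,
    DocumentNotFound,
}

/// Hypothetical in-memory stand-in for a collection handle.
struct ToyCollection {
    docs: BTreeMap<u64, String>,
    read_only: bool,
}

impl ToyCollection {
    fn check_writable(&self) -> Result<(), ToyError> {
        if self.read_only { Err(ToyError::ReadOnly) } else { Ok(()) }
    }

    /// update: the document must already exist.
    fn update(&mut self, id: u64, f: impl FnOnce(&mut String)) -> Result<(), ToyError> {
        self.check_writable()?;
        let doc = self.docs.get_mut(&id).ok_or(ToyError::DocumentNotFound)?;
        f(doc);
        Ok(())
    }

    /// delete: a missing id is not an error; the bool reports whether it existed.
    fn delete(&mut self, id: u64) -> Result<bool, ToyError> {
        self.check_writable()?;
        Ok(self.docs.remove(&id).is_some())
    }

    /// upsert: insert when absent, replace when present.
    fn upsert(&mut self, id: u64, doc: String) -> Result<(), ToyError> {
        self.check_writable()?;
        self.docs.insert(id, doc);
        Ok(())
    }
}
```

In this model, updating an absent id fails with DocumentNotFound, deleting it returns Ok(false), and upserting it succeeds.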

pub fn find_unique(&self, index_name: &str, key: impl Into<Dynamic>) -> Result<Option<T>>

Look up the single document whose index_name key matches key under a Unique index.

Errors with Error::IndexNotUnique if index_name resolves to a non-unique index — find_unique is only defined on Unique indexes. For Standard / Each / Composite use Self::lookup (which returns an iterator).

Snapshot-aware: on a write-side handle the lookup sees the current txn’s pending writes; on a read-side handle it sees the snapshot’s frozen view.

§Errors

Error::IndexNotFound if index_name is unknown / dropped.
Error::IndexNotUnique if the index is not Unique.
Pager / B-tree / codec errors propagated.

pub fn lookup(&self, index_name: &str, key: impl Into<Dynamic>) -> Result<Box<dyn Iterator<Item = Result<T>> + Send + 'static>>
where T: Send + 'static,

Yield every document whose index_name key matches key. Works on Standard / Unique / Each indexes. For a Composite index it returns an Error::IndexKindMismatch-style error with guidance to use Self::index_range, which accepts tuple-shaped keys.

The same document is yielded at most once even if it owns multiple matching entries — Each indexes can encode the same id under multiple element keys; we de-dup on emit.
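
De-duplication on emit can be sketched as a filter over the stream of matching ids. `dedup_ids` is a hypothetical helper, and plain u64 values stand in for Id; the crate's actual iterator is not shown in this page.

```rust
use std::collections::HashSet;

/// Yield each id at most once, preserving first-seen order.
/// `HashSet::insert` returns false for an id that was already present,
/// so the filter drops repeats.
fn dedup_ids(entries: impl IntoIterator<Item = u64>) -> impl Iterator<Item = u64> {
    let mut seen = HashSet::new();
    entries.into_iter().filter(move |id| seen.insert(*id))
}
```

For example, feeding the ids 3, 1, 3, 2, 1 through `dedup_ids` yields 3, 1, 2.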

§Errors

Error::IndexNotFound if index_name is unknown / dropped.
Pager / B-tree / codec errors propagated.

pub fn index_range<R>(&self, index_name: &str, range: R) -> Result<Box<dyn Iterator<Item = Result<(Vec<u8>, T)>> + Send + 'static>>
where R: RangeBounds<Dynamic>, T: Send + 'static,

Yield (user_key, doc) pairs whose index key falls within range. The bounds are Dynamic values — the same ergonomic type crate::Query::index_range takes — encoded internally through the order-preserving field encoder (obj_core::index::encode_field); callers no longer hand-encode index-key bytes.

For non-Unique kinds (Standard / Each / Composite) the bounds are widened internally so a user-facing Included(x)..=Included(x) range matches every entry whose user-key equals x even though the underlying B-tree key carries an id_be8 suffix (see docs/format.md § Index key encoding § Range-bound widening (non-Unique kinds)).
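
One plausible way to picture the widening is to pad the lower bound with the smallest possible id suffix and the upper bound with the largest. The sketch below assumes fixed-width 4-byte big-endian user keys and models the index as a BTreeMap; `entry_key`, `widen`, and `ids_in_range` are hypothetical helpers, and the authoritative rules live in docs/format.md rather than here.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Build a non-Unique index key: the encoded user key followed by the
/// 8-byte big-endian id suffix.
fn entry_key(user_key: u32, id: u64) -> Vec<u8> {
    let mut key = user_key.to_be_bytes().to_vec();
    key.extend_from_slice(&id.to_be_bytes());
    key
}

/// Widen an inclusive user-key range [lo, hi] so it covers every id suffix:
/// the lower bound gets the all-zero suffix, the upper bound the all-0xFF one.
fn widen(lo: u32, hi: u32) -> (Bound<Vec<u8>>, Bound<Vec<u8>>) {
    (
        Bound::Included(entry_key(lo, u64::MIN)),
        Bound::Included(entry_key(hi, u64::MAX)),
    )
}

/// Collect the ids of all entries whose user key lies in [lo, hi] (lo <= hi).
fn ids_in_range(index: &BTreeMap<Vec<u8>, ()>, lo: u32, hi: u32) -> Vec<u64> {
    index
        .range::<Vec<u8>, _>(widen(lo, hi))
        .map(|(key, _)| {
            let mut suffix = [0u8; 8];
            suffix.copy_from_slice(&key[key.len() - 8..]);
            u64::from_be_bytes(suffix)
        })
        .collect()
}
```

Without widening, a point range on the bare user key would match nothing, because no stored key equals the 4-byte prefix alone.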

§Errors

Error::IndexNotFound if index_name is unknown / dropped.
obj_core::Error::Codec if a Dynamic::String bound carries an embedded NUL byte (the order-preserving encoder rejects those).
Pager / B-tree / codec errors propagated.

pub fn iter_range<'a, R>(&'a self, index_name: &str, range: R) -> Result<IterIndexRange<'a, T>>
where R: RangeBounds<Dynamic>, T: Send + 'static,

Streaming variant of Self::index_range (Phase 7A perf pass, M14 #14). Yields (user_key, T) pairs lazily — the returned IterIndexRange decodes one T per next() call rather than building a Vec<Result<(_, T)>> of every match up front. The iterator borrows &'a self, so it must be consumed inside the lifetime of the enclosing crate::WriteTxn / crate::ReadTxn (or the crate::Db::collection handle, in Lazy mode).

§When to prefer `iter_range` over `index_range`

Memory. index_range allocates O(matches × sizeof(T)) up front; iter_range keeps a fixed-size VecDeque of (key, id) pairs (ITER_INDEX_RANGE_BATCH = 256 entries) and decodes one T at a time. For a 100k-row range with ~500-byte documents that’s ~50 MB peak vs. a few KiB.
Latency-to-first-row. index_range decodes every matching document before returning the iterator; iter_range returns immediately after the first chunk refill, so the first next() returns after one index walk + one primary-tree get (rather than N).
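
The difference between the two shapes can be modelled in a few dozen lines. This is a sketch, not the crate's implementation: `ToyCollection`, `ToyIter`, and `index_range_eager` are hypothetical, the index is a sorted Vec instead of a B-tree, and only the batch size of 256 is taken from the text above. The `gets` counter records how many primary lookups each shape performs.

```rust
use std::cell::Cell;
use std::collections::{HashMap, VecDeque};

/// Mirrors ITER_INDEX_RANGE_BATCH from the text.
const BATCH: usize = 256;

struct ToyCollection {
    index: Vec<(u32, u64)>,        // sorted (user_key, id) entries
    primary: HashMap<u64, String>, // id -> document
    gets: Cell<usize>,             // number of primary lookups performed
}

impl ToyCollection {
    fn get(&self, id: u64) -> Option<String> {
        self.gets.set(self.gets.get() + 1);
        self.primary.get(&id).cloned()
    }

    /// Eager shape: every matching document is fetched before returning.
    fn index_range_eager(&self, lo: u32, hi: u32) -> Vec<(u32, String)> {
        self.index
            .iter()
            .filter(|(k, _)| (lo..=hi).contains(k))
            .filter_map(|&(k, id)| self.get(id).map(|doc| (k, doc)))
            .collect()
    }

    /// Streaming shape: nothing is fetched until `next()` is called.
    fn iter_range(&self, lo: u32, hi: u32) -> ToyIter<'_> {
        let start = self.index.partition_point(|(k, _)| *k < lo);
        ToyIter { coll: self, cursor: start, hi, buf: VecDeque::with_capacity(BATCH) }
    }
}

struct ToyIter<'a> {
    coll: &'a ToyCollection,
    cursor: usize,
    hi: u32,
    buf: VecDeque<(u32, u64)>,
}

impl<'a> Iterator for ToyIter<'a> {
    type Item = (u32, String);

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            if self.buf.is_empty() {
                // Refill at most BATCH (key, id) pairs from the index walk.
                while self.buf.len() < BATCH {
                    match self.coll.index.get(self.cursor) {
                        Some(&(k, id)) if k <= self.hi => {
                            self.buf.push_back((k, id));
                            self.cursor += 1;
                        }
                        _ => break,
                    }
                }
            }
            // An empty buffer after a refill means the range is exhausted.
            let (k, id) = self.buf.pop_front()?;
            if let Some(doc) = self.coll.get(id) {
                return Some((k, doc));
            }
        }
    }
}
```

In this model, the first `next()` on an 800-row range performs a single primary lookup, while the eager variant performs all 800 before the caller sees a row.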

§When `index_range` is still the right answer

index_range returns an IndexIter<'static, _> — it can escape the read_transaction / transaction closure that produced it. iter_range is bound to &self, so the iterator dies when the Collection handle dies. If you need to return the iterator to outer scope, stick with index_range.

§Per-row `get`-back design choice

Each next() yields (user_key, T) by calling Self::get under the hood — i.e. a SECOND B+tree descent per row (the first is the index range walk; the second is the primary-tree get(id)). This is intentional and inherited from index_range: the index leaf stores only the document id (8 bytes), not the document bytes. A future format-minor bump may add value-in-index storage to short-circuit the second descent; that work is pinned to post-1.0 (tracked as pit issue #16, “value-in-index storage to eliminate index_range double-decode”).

§Errors

Error::IndexNotFound if index_name is unknown / dropped.
Pager / B-tree / codec errors propagated at construction and from each next() call.

pub fn count_all(&self) -> Result<u64>

Count every entry in the primary tree WITHOUT decoding the documents. Used by the M8 crate::Query::count no-decode fast path; the iterator visits leaf pages and counts entries rather than running each through postcard.

Power-of-ten Rule 2: bounded by the B+tree’s MAX_RANGE_NODES budget (inherited from BTree::range).

§Errors

Pager / B-tree errors propagated.

pub fn count_index_range<R>(&self, index_name: &str, range: R) -> Result<u64>
where R: RangeBounds<Dynamic>,

Count every entry whose encoded key falls inside range on the named index’s B-tree, WITHOUT decoding any document. M8 fast path for crate::Query::count when the source is an index_range.

Returns the number of index B-tree entries — for an Each index that may exceed the document count (one doc emits multiple entries); for other kinds it equals the matching doc count.

§Errors

Error::IndexNotFound if index_name is unknown / dropped.
Pager / B-tree errors propagated.

pub fn count_distinct_ids_in_range<R>(&self, index_name: &str, range: R) -> Result<u64>
where R: RangeBounds<Dynamic>,

Count distinct document Ids whose entries fall inside range on the named index’s B-tree, WITHOUT decoding any document. For Each indexes this is the correct shape of the “how many docs match” question — count_index_range returns the entry count, which overshoots when a single doc contributes multiple entries.

Implementation walks the index B-tree, parses the trailing 8-byte big-endian Id suffix from each non-unique key, and tracks the unique set in a bounded std::collections::HashSet capped at MAX_DISTINCT_IDS. Exceeding the cap surfaces Error::DistinctCountExceeded — the caller should narrow the range.

§Per-kind semantics

Standard, Composite: equivalent to count_index_range (one entry per doc by construction; the trailing-id-suffix walk still produces the same total).
Unique: keys carry NO id suffix — the entry value is the raw 8-byte Id; the walk reads the value instead.
Each: the dedup is meaningful — one doc may contribute N entries under N distinct element keys.
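
The gap between entry count and distinct-id count on an Each index can be sketched as follows. All names here are hypothetical (`each_key`, `count_entries`, `count_distinct_ids`, `CountError`), the cap of 4 is a deliberately tiny illustrative value because the real MAX_DISTINCT_IDS is not stated on this page, and the Unique case (id stored in the value rather than the key) is left out.

```rust
use std::collections::HashSet;

/// Tiny cap for illustration only; the real MAX_DISTINCT_IDS is not given here.
const MAX_DISTINCT_IDS: usize = 4;

/// Hypothetical error type mirroring two of the documented variants.
#[derive(Debug, PartialEq)]
enum CountError {
    DistinctCountExceeded,
    Corruption,
}

/// Build an Each-style key: element bytes followed by the 8-byte big-endian id.
fn each_key(element: &str, id: u64) -> Vec<u8> {
    let mut key = element.as_bytes().to_vec();
    key.extend_from_slice(&id.to_be_bytes());
    key
}

/// Entry count: one per index key, regardless of which document owns it.
fn count_entries(keys: &[Vec<u8>]) -> u64 {
    keys.len() as u64
}

/// Distinct-id count: parse the trailing 8-byte suffix and track it in a capped set.
fn count_distinct_ids(keys: &[Vec<u8>]) -> Result<u64, CountError> {
    let mut seen = HashSet::new();
    for key in keys {
        // A key shorter than the suffix cannot carry an id.
        let split = key.len().checked_sub(8).ok_or(CountError::Corruption)?;
        let mut suffix = [0u8; 8];
        suffix.copy_from_slice(&key[split..]);
        seen.insert(u64::from_be_bytes(suffix));
        if seen.len() > MAX_DISTINCT_IDS {
            return Err(CountError::DistinctCountExceeded);
        }
    }
    Ok(seen.len() as u64)
}
```

With document 1 indexed under "red" and "blue" and document 2 under "red", the entry count is 3 while the distinct-id count is 2.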

§Errors

Error::IndexNotFound if index_name is unknown / dropped.
Error::DistinctCountExceeded if the distinct set exceeds MAX_DISTINCT_IDS.
Error::Corruption if an entry’s id suffix / value is not parseable as an obj_core::Id.
Pager / B-tree errors propagated.

pub fn all(&self) -> Result<Vec<(Id, T)>>

Materialise every (Id, T) pair in the collection.

Implementation note: M6 returns an owned Vec rather than a streaming iterator because the B+tree range API borrows the pager, and threading that borrow through the mutex guards in the iterator chain is awkward. M7+ may convert to a streaming shape once the index API is in place.