pub struct Corpus { /* private fields */ }
Aggregate root that manages a collection of compressed embedding vectors.
§Domain contracts
- CompressionPolicy is immutable after the first successful insert.
- Domain events are buffered and drained atomically via drain_events.
- insert_batch is all-or-nothing: on error the corpus is not modified.
§Usage
let mut corpus = Corpus::new(
    Arc::from("my-corpus"),
    config,
    codebook,
    CompressionPolicy::Compress,
    BTreeMap::new(),
);
// insert takes a caller-supplied timestamp as its fourth argument.
corpus.insert(Arc::from("v1"), &vector, None, timestamp)?;
let events = corpus.drain_events();
§Implementations
impl Corpus
pub fn new(
    corpus_id: CorpusId,
    config: CodecConfig,
    codebook: Codebook,
    compression_policy: CompressionPolicy,
    metadata: BTreeMap<String, EntryMetaValue>,
) -> Self
Construct a new Corpus and emit a CorpusEvent::Created event.
This crate is no_std (no std::time), so timestamps are caller-supplied
values in nanoseconds since the Unix epoch. This signature has no
timestamp parameter; use new_at to pass the creation timestamp explicitly.
pub fn new_at(
    corpus_id: CorpusId,
    config: CodecConfig,
    codebook: Codebook,
    compression_policy: CompressionPolicy,
    metadata: BTreeMap<String, EntryMetaValue>,
    timestamp: Timestamp,
) -> Self
Construct a new Corpus with an explicit creation timestamp.
Use this variant in tests or callers that supply their own time source.
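As a rough illustration of a caller-side time source, the sketch below reads the wall clock on a std target and converts it to nanoseconds since the Unix epoch. The Timestamp alias and the now_nanos helper are hypothetical; this page does not show how the crate defines Timestamp, so an unsigned 64-bit integer is assumed.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in for the crate's `Timestamp`; assumed here to be
/// an integer count of nanoseconds since the Unix epoch.
type Timestamp = u64;

/// Read the wall clock and convert it to nanoseconds since the Unix epoch.
/// Returns `None` if the clock is set before 1970 or the value overflows `u64`.
fn now_nanos() -> Option<Timestamp> {
    let elapsed = SystemTime::now().duration_since(UNIX_EPOCH).ok()?;
    elapsed
        .as_secs()
        .checked_mul(1_000_000_000)?
        .checked_add(u64::from(elapsed.subsec_nanos()))
}
```

A value obtained this way could be passed as the timestamp argument; tests would more likely pass a fixed constant so that emitted events are reproducible.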
pub const fn config(&self) -> &CodecConfig
The codec configuration used at construction time.
pub const fn compression_policy(&self) -> CompressionPolicy
The active compression policy.
pub fn vector_count(&self) -> usize
Number of vectors currently stored in the corpus.
pub fn contains(&self, id: &VectorId) -> bool
Returns true if a vector with the given id exists.
pub fn iter(&self) -> impl Iterator<Item = (&VectorId, &VectorEntry)>
Iterate over (VectorId, VectorEntry) pairs in insertion order.
pub const fn metadata(&self) -> &BTreeMap<String, EntryMetaValue>
Read-only access to the corpus-level metadata.
pub fn drain_events(&mut self) -> Vec<CorpusEvent>
Drain and return all pending domain events.
Uses core::mem::take — O(1), no allocation. After this call
pending_events is empty.
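The drain pattern described above can be sketched with a stand-in type. Event and Aggregate are hypothetical names, not the crate's own; std::mem::take is the same function as core::mem::take, re-exported by std.

```rust
use std::mem;

/// Hypothetical stand-in for the crate's `CorpusEvent`.
#[derive(Debug, PartialEq)]
enum Event {
    Created,
    VectorsInserted(usize),
}

/// Minimal aggregate that holds only the event buffer.
struct Aggregate {
    pending_events: Vec<Event>,
}

impl Aggregate {
    /// Swap the buffer with an empty `Vec` and hand the old one to the caller.
    /// `Vec::new()` does not allocate, so this is a constant-time move.
    fn drain_events(&mut self) -> Vec<Event> {
        mem::take(&mut self.pending_events)
    }
}
```

Because the caller receives the original buffer by value, a second call returns an empty Vec rather than the same events again.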
pub fn insert(
    &mut self,
    id: VectorId,
    vector: &[f32],
    entry_metadata: Option<EntryMetaValue>,
    timestamp: Timestamp,
) -> Result<(), CorpusError>
Insert a single vector into the corpus.
§Errors
- CorpusError::DimensionMismatch if vector.len() != config.dimension().
- CorpusError::DuplicateVectorId if id is already present.
- CorpusError::Codec if the codec pipeline fails (only for Compress policy).
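The first two error conditions amount to precondition checks that can run before any state changes. The sketch below is one way to write them, not the crate's implementation: InsertError, its field layout, the validate helper, and the order of the checks are all assumptions, and ids are modelled as plain strings.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of two documented variants; field layouts are assumed.
#[derive(Debug, PartialEq)]
enum InsertError {
    DimensionMismatch { expected: usize, got: usize },
    DuplicateVectorId(String),
}

/// Check both preconditions without touching any stored state.
/// The dimension check runs first here; the real ordering is not documented.
fn validate(
    expected_dim: usize,
    known_ids: &BTreeSet<String>,
    id: &str,
    vector: &[f32],
) -> Result<(), InsertError> {
    if vector.len() != expected_dim {
        return Err(InsertError::DimensionMismatch {
            expected: expected_dim,
            got: vector.len(),
        });
    }
    if known_ids.contains(id) {
        return Err(InsertError::DuplicateVectorId(id.to_string()));
    }
    Ok(())
}
```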
pub fn insert_batch(
    &mut self,
    vectors: &[(VectorId, &[f32], Option<EntryMetaValue>)],
    timestamp: Timestamp,
) -> Result<BatchReport, CorpusError>
Insert a batch of vectors atomically (all-or-nothing).
On failure the corpus is unchanged and no events are emitted.
On success, exactly one CorpusEvent::VectorsInserted is emitted
covering all inserted ids.
§Errors
- CorpusError::BatchAtomicityFailure if any vector fails validation or compression. The index field identifies the first failing vector.
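One common way to get all-or-nothing behaviour is to stage the whole batch first and commit only if every entry passes. The sketch below illustrates that two-phase idea under simplifying assumptions: Store, its fields, and the string ids are hypothetical, compression is omitted, and only the index field of the error comes from the text above.

```rust
use std::collections::BTreeMap;

/// Hypothetical error mirroring the documented `BatchAtomicityFailure`.
#[derive(Debug, PartialEq)]
struct BatchAtomicityFailure {
    index: usize,
}

/// Hypothetical stand-in for the corpus; vectors are stored uncompressed.
struct Store {
    dimension: usize,
    vectors: BTreeMap<String, Vec<f32>>,
    /// One entry per successful batch, listing the ids it inserted.
    events: Vec<Vec<String>>,
}

impl Store {
    fn insert_batch(&mut self, batch: &[(&str, &[f32])]) -> Result<usize, BatchAtomicityFailure> {
        // Phase 1: validate and stage every entry without mutating `self`.
        let mut staged: BTreeMap<String, Vec<f32>> = BTreeMap::new();
        for (index, (id, vector)) in batch.iter().enumerate() {
            let bad_dimension = vector.len() != self.dimension;
            let duplicate = self.vectors.contains_key(*id) || staged.contains_key(*id);
            if bad_dimension || duplicate {
                // Early return: nothing has been written, no event is recorded.
                return Err(BatchAtomicityFailure { index });
            }
            staged.insert(id.to_string(), vector.to_vec());
        }
        // Phase 2: commit everything and record a single event for the batch.
        let ids: Vec<String> = batch.iter().map(|(id, _)| id.to_string()).collect();
        let inserted = staged.len();
        self.vectors.extend(staged);
        self.events.push(ids);
        Ok(inserted)
    }
}
```

The staging map also catches ids duplicated within the batch itself, which a check against the stored vectors alone would miss.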
pub fn decompress(&self, id: &VectorId) -> Result<Vec<f32>, CorpusError>
Decompress a single vector by id.
Does not emit a domain event (Python parity).
§Errors
- CorpusError::UnknownVectorId if id is not present.
- CorpusError::Codec if the codec pipeline fails.
pub fn decompress_all_at(
    &mut self,
    timestamp: Timestamp,
) -> Result<BTreeMap<VectorId, Vec<f32>>, CorpusError>
Decompress all vectors and return them, using an explicit caller-supplied timestamp.
Emits exactly one CorpusEvent::Decompressed on success, unless the corpus
is empty, in which case no event is emitted (matches Python parity).
§Errors
Propagates errors from the codec pipeline for any individual vector.
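The event rule (one event for a non-empty corpus, none for an empty one) can be sketched as follows. Store, Event, and the event's fields are hypothetical, and decompression is stubbed out as a plain clone, so the sketch cannot fail and returns the map directly instead of a Result.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `CorpusEvent::Decompressed`; fields are assumed.
#[derive(Debug, PartialEq)]
enum Event {
    Decompressed { vector_count: usize, timestamp: u64 },
}

/// Hypothetical stand-in for the corpus; vectors are stored uncompressed.
struct Store {
    vectors: BTreeMap<String, Vec<f32>>,
    events: Vec<Event>,
}

impl Store {
    /// Return every vector; record one event only if there was something to return.
    fn decompress_all_at(&mut self, timestamp: u64) -> BTreeMap<String, Vec<f32>> {
        // A real codec would decode each entry here; cloning stands in for that.
        let out = self.vectors.clone();
        if !out.is_empty() {
            self.events.push(Event::Decompressed {
                vector_count: out.len(),
                timestamp,
            });
        }
        out
    }
}
```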
pub fn remove(&mut self, id: &VectorId) -> Option<VectorEntry>
Remove a vector by id, returning the entry if present.
Silent: no domain event is emitted (Python parity — del corpus[id]).