pub struct VectorCollection {
    pub surrogate_map: HashMap<u32, Surrogate>,
    pub surrogate_to_local: HashMap<Surrogate, u32>,
    pub multi_doc_map: HashMap<Surrogate, Vec<u32>>,
    pub codec_dispatch: Option<CollectionCodec>,
    pub payload: PayloadIndexSet,
    pub arena_index: Option<u32>,
    /* private fields */
}
Manages all vector segments for a single collection (one index key).
This type is !Send: it is owned by a single Data Plane core.
Fields
surrogate_map: HashMap<u32, Surrogate>
Mapping from internal global vector ID → surrogate.
surrogate_to_local: HashMap<Surrogate, u32>
Reverse map: surrogate → global vector ID. Used by point delete.
multi_doc_map: HashMap<Surrogate, Vec<u32>>
Reverse mapping for multi-vector documents: document_surrogate → list of global vector IDs.
codec_dispatch: Option<CollectionCodec>
Optional collection-level codec-dispatch index (RaBitQ or BBQ). Present only when the collection was built with a non-Sq8 quantization. Coexists with sealed segments: for codec-dispatched collections the per-segment Sq8 builder is skipped and this index is used instead.
payload: PayloadIndexSet
In-memory payload bitmap indexes for vector-primary collections.
Empty (no indexes) by default; populated at construction time from
VectorPrimaryConfig::payload_indexes.
arena_index: Option<u32>
Optional dedicated memory arena index for this collection.
Set by the Data Plane after requesting a per-collection arena from
nodedb_mem::CollectionArenaRegistry. Used only for stats reporting;
the actual arena pinning is handled externally.
Implementations
impl VectorCollection
pub fn set_data_dir(&mut self, dir: PathBuf)
Set the data directory for mmap segment files.
pub fn set_ram_budget(&mut self, bytes: usize)
Set the RAM budget for vector data (FP32 in sealed segments).
pub fn ram_usage_bytes(&self) -> usize
Estimate current RAM usage for vector data.
pub fn is_budget_exceeded(&self) -> bool
Whether the RAM budget is exceeded.
pub fn mmap_fallback_count(&self) -> u32
Number of segments that fell back to mmap.
pub fn mmap_segment_count(&self) -> u32
Number of currently active mmap segments.
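The budget accessors above suggest a simple accounting scheme. The following is a minimal sketch of one way such accounting could work, not the crate's actual logic: `RamBudget`, `Placement`, and `place_segment` are hypothetical names, and the rule "fall back to mmap when the FP32 data would not fit" is an assumption.

```rust
/// Where a sealed segment's FP32 vectors end up in this toy model.
#[derive(Debug, PartialEq, Eq)]
pub enum Placement {
    Ram,
    Mmap,
}

/// Hypothetical tracker; field and method names are illustrative only.
pub struct RamBudget {
    budget_bytes: usize,
    used_bytes: usize,
    mmap_fallbacks: u32,
}

impl RamBudget {
    pub fn new(budget_bytes: usize) -> Self {
        Self { budget_bytes, used_bytes: 0, mmap_fallbacks: 0 }
    }

    /// FP32 footprint of `num_vectors` vectors of dimension `dim`.
    pub fn fp32_bytes(num_vectors: usize, dim: usize) -> usize {
        num_vectors * dim * std::mem::size_of::<f32>()
    }

    pub fn set_ram_budget(&mut self, bytes: usize) {
        self.budget_bytes = bytes;
    }

    /// Assumed policy: keep the segment in RAM if it fits,
    /// otherwise count an mmap fallback.
    pub fn place_segment(&mut self, num_vectors: usize, dim: usize) -> Placement {
        let bytes = Self::fp32_bytes(num_vectors, dim);
        if self.used_bytes + bytes <= self.budget_bytes {
            self.used_bytes += bytes;
            Placement::Ram
        } else {
            self.mmap_fallbacks += 1;
            Placement::Mmap
        }
    }

    pub fn ram_usage_bytes(&self) -> usize {
        self.used_bytes
    }

    pub fn is_budget_exceeded(&self) -> bool {
        self.used_bytes > self.budget_bytes
    }

    pub fn mmap_fallback_count(&self) -> u32 {
        self.mmap_fallbacks
    }
}
```

In this model the budget can only be exceeded after it is lowered with `set_ram_budget`; the real collection may define "exceeded" differently.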
impl VectorCollection
pub fn checkpoint_to_bytes(&self, kek: Option<&WalEncryptionKey>) -> Vec<u8>
Serialize all segments for checkpointing.
When kek is Some, the MessagePack payload is wrapped in an
AES-256-GCM encrypted envelope with a SEGV preamble. When None,
raw MessagePack bytes are returned (existing plaintext format).
Returns an empty Vec on serialization failure (callers treat this as a
skip signal, consistent with the pre-existing error handling).
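Because an empty Vec means "serialization failed, skip this checkpoint", callers need to check for it before persisting. A minimal sketch of that caller-side check, assuming a hypothetical in-memory `store` standing in for the real checkpoint sink:

```rust
/// Returns true if the checkpoint was written, false if it was skipped.
/// `store` is a hypothetical stand-in for the real checkpoint destination.
pub fn write_checkpoint_if_any(bytes: Vec<u8>, store: &mut Vec<Vec<u8>>) -> bool {
    if bytes.is_empty() {
        // Empty output is the skip signal: keep the previous checkpoint as-is.
        return false;
    }
    store.push(bytes);
    true
}
```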
pub fn from_checkpoint(
    bytes: &[u8],
    kek: Option<&WalEncryptionKey>,
) -> Result<Option<Self>, VectorError>
Restore a collection from checkpoint bytes.
kek controls the expected framing:
None → the file must be plaintext MessagePack (starting with bytes that are NOT SEGV). If the file starts with SEGV and no key is provided, returns Err(CheckpointEncryptedNoKey).
Some(key) → encryption is required. If the file starts with SEGV, it is decrypted with key. If the file is plaintext, returns Err(CheckpointPlaintextKeyRequired): the loader refuses to silently load unencrypted data when the operator has enabled at-rest encryption.
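The framing rules form a two-by-two decision table over "file starts with SEGV" and "key provided". A minimal sketch of that table follows. It assumes the preamble is the four ASCII bytes `SEGV`; `Framing`, `FramingError`, and `expected_framing` are hypothetical names (the two error variants mirror the names in the text, which presumably live on `VectorError`).

```rust
/// Assumed 4-byte ASCII preamble; the documentation only names it "SEGV".
const SEGV_PREAMBLE: &[u8] = b"SEGV";

#[derive(Debug, PartialEq, Eq)]
pub enum Framing {
    Plaintext,
    Encrypted,
}

/// Hypothetical error type mirroring the two variants named in the docs.
#[derive(Debug, PartialEq, Eq)]
pub enum FramingError {
    CheckpointEncryptedNoKey,
    CheckpointPlaintextKeyRequired,
}

/// Decide how a checkpoint must be read, or refuse to read it.
pub fn expected_framing(bytes: &[u8], has_key: bool) -> Result<Framing, FramingError> {
    let encrypted = bytes.starts_with(SEGV_PREAMBLE);
    match (encrypted, has_key) {
        (true, true) => Ok(Framing::Encrypted),
        (false, false) => Ok(Framing::Plaintext),
        // Encrypted file but no key: cannot decrypt.
        (true, false) => Err(FramingError::CheckpointEncryptedNoKey),
        // Key configured but file is plaintext: refuse to load it silently.
        (false, true) => Err(FramingError::CheckpointPlaintextKeyRequired),
    }
}
```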
impl VectorCollection
pub fn build_codec_dispatch(&mut self, quantization: &str) -> Option<&CollectionCodec>
Build a codec-dispatched index over all current vectors using the requested quantization. Replaces any existing dispatch index for this collection. Idempotent.
Returns a reference to the new index, or None if the quantization
tag is not supported (falls back to per-segment Sq8/PQ paths) or there
are no vectors to train on.
impl VectorCollection
pub fn new(dim: usize, params: HnswParams) -> Self
Create an empty collection with the default seal threshold.
pub fn with_seal_threshold(dim: usize, params: HnswParams, seal_threshold: usize) -> Self
Create an empty collection with an explicit seal threshold.
pub fn with_index_config(dim: usize, config: IndexConfig) -> Self
Create an empty collection with a full index configuration.
pub fn with_seal_threshold_and_config(
    dim: usize,
    config: IndexConfig,
    seal_threshold: usize,
) -> Self
Create an empty collection with a full index config and custom seal threshold.
pub fn with_seed(dim: usize, params: HnswParams, _seed: u64) -> Self
Create with a specific seed (for deterministic testing).
pub fn needs_seal(&self) -> bool
Check if the growing segment should be sealed.
pub fn seal(&mut self, key: &str) -> Option<BuildRequest>
Seal the growing segment and return a build request.
pub fn complete_build(&mut self, segment_id: u32, index: HnswIndex)
Accept a completed HNSW build from the background thread.
After promoting the segment to sealed, rebuilds the collection-level
codec-dispatch index when self.quantization is RaBitQ or Bbq.
The rebuild trains over all vectors so the codec index always covers
every sealed segment.
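The `needs_seal` / `seal` / `complete_build` trio describes a growing → building → sealed lifecycle. A toy model of that lifecycle is sketched below. Everything in it is an assumption made for illustration: `ToyCollection`, the fields of `BuildRequest`, the `>=` threshold comparison, and `seal` returning `None` for an empty growing segment. The real HNSW build and codec-dispatch rebuild are omitted.

```rust
/// Hypothetical, simplified build request handed to a background builder.
#[derive(Debug)]
pub struct BuildRequest {
    pub key: String,
    pub segment_id: u32,
    pub vectors: Vec<Vec<f32>>,
}

/// Toy model: one growing buffer, plus IDs of building and sealed segments.
pub struct ToyCollection {
    seal_threshold: usize,
    growing: Vec<Vec<f32>>,
    building: Vec<u32>,
    sealed: Vec<u32>,
    next_segment_id: u32,
}

impl ToyCollection {
    pub fn new(seal_threshold: usize) -> Self {
        Self {
            seal_threshold,
            growing: Vec::new(),
            building: Vec::new(),
            sealed: Vec::new(),
            next_segment_id: 0,
        }
    }

    pub fn insert(&mut self, vector: Vec<f32>) {
        self.growing.push(vector);
    }

    /// Assumed rule: seal once the growing segment reaches the threshold.
    pub fn needs_seal(&self) -> bool {
        self.growing.len() >= self.seal_threshold
    }

    /// Move the growing vectors into a build request; start a fresh growing segment.
    pub fn seal(&mut self, key: &str) -> Option<BuildRequest> {
        if self.growing.is_empty() {
            return None;
        }
        let segment_id = self.next_segment_id;
        self.next_segment_id += 1;
        self.building.push(segment_id);
        Some(BuildRequest {
            key: key.to_string(),
            segment_id,
            vectors: std::mem::take(&mut self.growing),
        })
    }

    /// Promote a building segment to sealed once its index is ready.
    pub fn complete_build(&mut self, segment_id: u32) {
        if let Some(pos) = self.building.iter().position(|&id| id == segment_id) {
            self.building.remove(pos);
            self.sealed.push(segment_id);
        }
    }

    pub fn sealed_count(&self) -> usize {
        self.sealed.len()
    }
}
```

A caller would typically check `needs_seal()` after each insert, hand the returned request to a background builder, and call `complete_build` when the builder reports back.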
pub fn sealed_segments(&self) -> &[SealedSegment]
Access sealed segments (read-only).
pub fn sealed_segments_mut(&mut self) -> &mut Vec<SealedSegment>
Access sealed segments mutably.
pub fn growing_is_empty(&self) -> bool
Whether the growing segment has no vectors.
pub fn len(&self) -> usize
pub fn live_count(&self) -> usize
pub fn is_empty(&self) -> bool
pub fn dim(&self) -> usize
pub fn params(&self) -> &HnswParams
pub fn set_params(&mut self, params: HnswParams)
Update HNSW parameters for future builds.
pub fn set_quantization(&mut self, q: VectorQuantization)
Set the collection-level quantization.
pub fn quantization(&self) -> VectorQuantization
Return the configured quantization mode.
pub fn configure_payload_indexes(&mut self, fields: &[String])
Configure payload bitmap indexes from a list of field names.
impl VectorCollection
pub fn insert(&mut self, vector: Vec<f32>) -> u32
Insert a vector. Returns the global vector ID.
pub fn insert_with_surrogate(&mut self, vector: Vec<f32>, surrogate: Surrogate) -> u32
Insert a vector with an associated surrogate. The surrogate is allocated by the Control Plane before the call; the engine only stores the binding.
pub fn insert_multi_vector(
    &mut self,
    vectors: &[&[f32]],
    document_surrogate: Surrogate,
) -> Vec<u32>
Insert multiple vectors for a single document (ColBERT-style).
All N vectors are bound to the same document_surrogate.
pub fn delete_multi_vector(&mut self, document_surrogate: Surrogate) -> usize
Delete all vectors belonging to a multi-vector document.
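The multi-vector pair can be pictured as bookkeeping over `multi_doc_map`. The sketch below is an assumption-laden toy: `MultiDocIndex` is hypothetical, `Surrogate` is a stand-in newtype, deletion is modeled as tombstoning, and the returned `usize` is assumed to be the number of vectors removed. Vector storage itself is left out.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the crate's `Surrogate` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Surrogate(pub u32);

/// Toy bookkeeping for multi-vector documents; vector data is not stored here.
#[derive(Default)]
pub struct MultiDocIndex {
    multi_doc_map: HashMap<Surrogate, Vec<u32>>,
    tombstones: HashSet<u32>,
    next_id: u32,
}

impl MultiDocIndex {
    /// Allocate one global vector ID per vector and bind them all to `doc`.
    pub fn insert_multi_vector(&mut self, vectors: &[&[f32]], doc: Surrogate) -> Vec<u32> {
        let mut ids = Vec::with_capacity(vectors.len());
        for _ in vectors {
            ids.push(self.next_id);
            self.next_id += 1;
        }
        self.multi_doc_map
            .entry(doc)
            .or_default()
            .extend(ids.iter().copied());
        ids
    }

    /// Tombstone every vector of `doc`; returns how many were newly deleted.
    pub fn delete_multi_vector(&mut self, doc: Surrogate) -> usize {
        let ids = self.multi_doc_map.remove(&doc).unwrap_or_default();
        let mut deleted = 0;
        for id in ids {
            if self.tombstones.insert(id) {
                deleted += 1;
            }
        }
        deleted
    }
}
```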
pub fn get_surrogate(&self, vector_id: u32) -> Option<Surrogate>
Look up the surrogate for a global vector ID.
pub fn local_for_surrogate(&self, surrogate: Surrogate) -> Option<u32>
Resolve a surrogate back to its global vector ID, if bound.
pub fn delete_by_surrogate(&mut self, surrogate: Surrogate) -> bool
Soft-delete a vector by surrogate.
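The lookup and delete methods above rely on the forward and reverse maps staying in step. A minimal sketch of that invariant follows, using a hypothetical `SurrogateBindings` type and a stand-in `Surrogate` newtype. It assumes soft deletion only marks the ID in a tombstone set and leaves both bindings in place, which may differ from the real behavior.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the crate's `Surrogate` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Surrogate(pub u32);

/// Toy model of the forward and reverse surrogate maps.
#[derive(Default)]
pub struct SurrogateBindings {
    surrogate_map: HashMap<u32, Surrogate>,
    surrogate_to_local: HashMap<Surrogate, u32>,
    tombstones: HashSet<u32>, // assumed: soft deletes only mark IDs
    next_id: u32,
}

impl SurrogateBindings {
    /// Bind a fresh global vector ID to `surrogate` in both directions.
    pub fn insert_with_surrogate(&mut self, surrogate: Surrogate) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.surrogate_map.insert(id, surrogate);
        self.surrogate_to_local.insert(surrogate, id);
        id
    }

    pub fn get_surrogate(&self, vector_id: u32) -> Option<Surrogate> {
        self.surrogate_map.get(&vector_id).copied()
    }

    pub fn local_for_surrogate(&self, surrogate: Surrogate) -> Option<u32> {
        self.surrogate_to_local.get(&surrogate).copied()
    }

    /// Returns true only if the surrogate was bound and not already deleted.
    pub fn delete_by_surrogate(&mut self, surrogate: Surrogate) -> bool {
        match self.surrogate_to_local.get(&surrogate) {
            Some(&id) => self.tombstones.insert(id),
            None => false,
        }
    }
}
```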
impl VectorCollection
pub fn hnsw_params(&self) -> HnswParams
Return the HNSW construction parameters for this collection.
pub fn growing_flat(&self) -> &FlatIndex
Immutable access to the growing flat index.
The growing index holds vectors that have not yet been sealed into an HNSW segment. Its contents should be included in any full-collection rebuild that wants to produce a complete result.
pub fn replace_sealed(&mut self, new_segments: Vec<SealedSegment>)
Replace all sealed segments with new_segments.
Used by the concurrent REINDEX cutover: after the background thread finishes rebuilding the HNSW graph, the Data Plane swaps in the single rebuilt segment. The growing segment is preserved unchanged.
Caller is responsible for ensuring that new_segments covers all
vectors that were in the old sealed set; any vector not present in the
new segments will return no result on subsequent searches until the
growing segment is sealed.
pub fn compact_tombstones(&mut self) -> usize
Compact tombstoned nodes from all sealed segments.
This is the same operation as compact(); the alias exists so
call sites that want to express “remove tombstones” read clearly.
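Conceptually, compaction drops every tombstoned ID from each sealed segment and reports how many were removed. The sketch below shows only that counting logic on a hypothetical `ToySegment`; it assumes the returned `usize` is the number of removed nodes and ignores the HNSW graph repair a real compaction would need.

```rust
use std::collections::HashSet;

/// Hypothetical sealed segment that only tracks the IDs it contains.
pub struct ToySegment {
    pub ids: Vec<u32>,
}

/// Remove tombstoned IDs from every segment; returns the number removed.
pub fn compact_tombstones(segments: &mut [ToySegment], tombstones: &HashSet<u32>) -> usize {
    let mut removed = 0;
    for segment in segments.iter_mut() {
        let before = segment.ids.len();
        segment.ids.retain(|id| !tombstones.contains(id));
        removed += before - segment.ids.len();
    }
    removed
}
```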
impl VectorCollection
pub fn with_pq_config(dim: usize, hnsw: HnswParams, pq_m: usize) -> Self
Convenience constructor for PQ-configured collections.
Equivalent to building a full IndexConfig with
index_type = HnswPq and the given pq_m.
pub fn with_seal_threshold_and_pq_config(
    dim: usize,
    hnsw: HnswParams,
    pq_m: usize,
    seal_threshold: usize,
) -> Self
Convenience constructor for PQ-configured collections with a custom seal threshold.
impl VectorCollection
pub fn search(&self, query: &[f32], top_k: usize, ef: usize) -> Vec<SearchResult>
Search across all segments, merging results by distance.
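"Merging results by distance" can be read as: collect each segment's candidates, sort ascending by distance, keep the best `top_k`. A minimal sketch under that reading follows. The `SearchResult` fields and `merge_top_k` are hypothetical, and smaller distance is assumed to mean a better match.

```rust
/// Hypothetical result shape; the real `SearchResult` fields may differ.
#[derive(Clone, Debug, PartialEq)]
pub struct SearchResult {
    pub id: u32,
    pub distance: f32,
}

/// Merge per-segment hits into one list of the `top_k` closest results.
pub fn merge_top_k(per_segment: Vec<Vec<SearchResult>>, top_k: usize) -> Vec<SearchResult> {
    let mut all: Vec<SearchResult> = per_segment.into_iter().flatten().collect();
    // total_cmp gives a total order over f32, so NaN cannot panic the sort.
    all.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    all.truncate(top_k);
    all
}
```

A bounded heap would avoid sorting every candidate, but with `top_k` results per segment the full sort is usually small enough.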
pub fn search_with_metric(
    &self,
    query: &[f32],
    top_k: usize,
    ef: usize,
    metric: DistanceMetric,
) -> Vec<SearchResult>
Search across all segments using an explicit metric override.
For sealed segments with quantized codecs, the metric override is applied during candidate reranking. Growing and building segments apply it exactly via brute-force. The HNSW graph structure was built with the collection metric; using a different metric affects the scoring but not graph traversal.
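The effect of a metric override on scoring can be illustrated by rescoring one fixed candidate set under two metrics. The sketch below is illustrative only: this `DistanceMetric` enum has just two assumed variants, and `distance` and `rerank` are hypothetical helpers rather than the crate's reranking code. The candidate set stands in for whatever graph traversal returned.

```rust
/// Assumed subset of metrics; the real `DistanceMetric` may have other variants.
#[derive(Clone, Copy)]
pub enum DistanceMetric {
    L2,
    Cosine,
}

/// Squared L2 distance or cosine distance (1 - cosine similarity).
pub fn distance(a: &[f32], b: &[f32], metric: DistanceMetric) -> f32 {
    match metric {
        DistanceMetric::L2 => a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>(),
        DistanceMetric::Cosine => {
            let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>();
            let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
            let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
            if na == 0.0 || nb == 0.0 {
                1.0
            } else {
                1.0 - dot / (na * nb)
            }
        }
    }
}

/// Rescore already-gathered candidates with the override metric.
pub fn rerank(
    query: &[f32],
    candidates: &[(u32, Vec<f32>)],
    metric: DistanceMetric,
    top_k: usize,
) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = candidates
        .iter()
        .map(|(id, v)| (*id, distance(query, v, metric)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(top_k);
    scored
}
```

The same two candidates can rank differently under each metric, which is why an override changes the ordering even though the candidate set itself came from a graph built with the collection metric.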
pub fn search_with_bitmap_bytes_and_metric(
    &self,
    query: &[f32],
    top_k: usize,
    ef: usize,
    bitmap: &[u8],
    metric: DistanceMetric,
) -> Vec<SearchResult>
Search with a pre-filter bitmap (byte-array format) and explicit metric override.
pub fn search_with_bitmap_bytes(
    &self,
    query: &[f32],
    top_k: usize,
    ef: usize,
    bitmap: &[u8],
) -> Vec<SearchResult>
Search with a pre-filter bitmap (byte-array format).
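A pre-filter bitmap restricts which vector IDs are eligible before distances are compared. The sketch below assumes the simplest possible encoding, a dense bitset where bit `id % 8` of byte `id / 8` (least significant bit first) marks an allowed ID. That layout is an assumption: the actual byte-array format is not specified here and could equally be a serialized Roaring bitmap.

```rust
/// True if `vector_id` is set in a dense, LSB-first bitset (assumed layout).
pub fn bitmap_allows(bitmap: &[u8], vector_id: u32) -> bool {
    let byte = (vector_id / 8) as usize;
    let bit = vector_id % 8;
    // IDs beyond the end of the bitmap are treated as not allowed.
    bitmap.get(byte).map_or(false, |&b| ((b >> bit) & 1) == 1)
}

/// Keep only the candidate IDs the bitmap allows.
pub fn prefilter(ids: &[u32], bitmap: &[u8]) -> Vec<u32> {
    ids.iter()
        .copied()
        .filter(|&id| bitmap_allows(bitmap, id))
        .collect()
}
```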
pub fn search_with_payload_filter(
    &self,
    query: &[f32],
    top_k: usize,
    ef: usize,
    predicate: &FilterPredicate,
) -> (Vec<SearchResult>, bool)
Search with a structured payload predicate.
If predicate is fully covered by indexed fields (all leaf fields have
a bitmap index), the bitmap is built and HNSW traversal uses it as a
pre-filter.
If any field in predicate is un-indexed, the method returns
(results, false) where false signals that the predicate was NOT
applied and the caller must apply it as a post-filter. This guarantees
the un-indexed predicate is never silently dropped.
Returns (results, filter_was_applied).
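The boolean in the return value puts an obligation on the caller: when it is `false`, the predicate still has to be applied. A minimal caller-side sketch follows, with a hypothetical `SearchResult` shape and a closure `matches` standing in for evaluating the predicate against a stored payload.

```rust
/// Hypothetical result shape; the real `SearchResult` fields may differ.
pub struct SearchResult {
    pub id: u32,
    pub distance: f32,
}

/// Apply the predicate as a post-filter only when the index did not apply it.
pub fn finish_filtered_search<F>(
    results: Vec<SearchResult>,
    filter_was_applied: bool,
    matches: F,
    top_k: usize,
) -> Vec<SearchResult>
where
    F: Fn(u32) -> bool,
{
    let mut out: Vec<SearchResult> = if filter_was_applied {
        results
    } else {
        results.into_iter().filter(|r| matches(r.id)).collect()
    };
    out.truncate(top_k);
    out
}
```

Post-filtering can leave fewer than `top_k` results, so a caller on this path may want to request more candidates than it finally needs.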
impl VectorCollection
pub fn stats(&self) -> VectorIndexStats
Collect live statistics from all segments.
Auto Trait Implementations
impl Freeze for VectorCollection
impl !RefUnwindSafe for VectorCollection
impl !Send for VectorCollection
impl !Sync for VectorCollection
impl Unpin for VectorCollection
impl UnsafeUnpin for VectorCollection
impl UnwindSafe for VectorCollection
Blanket Implementations
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
fn pointer_metadata(
    _: &<T as ArchivePointee>::ArchivedMetadata,
) -> <T as Pointee>::Metadata
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its
superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.