pub struct LazyFlatVectorData {
    pub dim: usize,
    pub num_vectors: usize,
    pub quantization: DenseVectorQuantization,
    pub doc_ids: Vec<(u32, u16)>,
    /* private fields */
}
Lazy flat vector data: doc_ids are kept in memory, and vectors are accessed via range reads.
Only the doc_id index (~6 bytes/vector) is loaded into memory. Vector data stays on disk and is accessed via mmap-backed range reads. Element size depends on quantization: 4 bytes per dimension for f32, 2 for f16, and 1 for uint8.
Used for:
- Brute-force search (batched scoring with native-precision SIMD)
- Reranking (read individual vectors by doc_id via binary search)
- doc() hydration (dequantize to f32 for stored documents)
- Merge streaming (chunked raw vector bytes + doc_id iteration)
Fields§
dim: usize
Vector dimension.
num_vectors: usize
Total number of vectors.
quantization: DenseVectorQuantization
Storage quantization type.
doc_ids: Vec<(u32, u16)>
In-memory doc_id index: one (doc_id, ordinal) entry per vector.
Implementations§
impl LazyFlatVectorData
pub async fn open(handle: LazyFileSlice) -> Result<Self>
Open from a lazy file slice pointing to the flat binary data region.
Reads header (16 bytes) + doc_ids (~6 bytes/vector) into memory. Vector data stays lazy on disk.
pub async fn read_vector_into(&self, idx: usize, out: &mut [f32]) -> Result<()>
Read a single vector by index, dequantized to f32.
out must have length >= self.dim. Returns Ok(()) on success.
Used for ANN training and doc() hydration where f32 is needed.
pub async fn get_vector(&self, idx: usize) -> Result<Vec<f32>>
Read a single vector by index, dequantized to f32 (allocates a new Vec<f32>).
pub async fn read_vector_raw_into( &self, idx: usize, out: &mut [u8], ) -> Result<()>
Read a single vector’s raw bytes (no dequantization) into a caller-provided buffer.
out must have length >= self.vector_byte_size().
Used for native-precision reranking where raw quantized bytes are scored directly.
pub async fn read_vectors_batch( &self, start_idx: usize, count: usize, ) -> Result<OwnedBytes>
Read a contiguous batch of raw quantized bytes by index range.
Returns raw bytes for vectors [start_idx..start_idx+count).
Bytes are in the native quantized format; pass them to batch_cosine_scores_f16/u8 (or batch_cosine_scores for f32) for scoring.
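To illustrate how a contiguous raw batch can be consumed, here is a minimal sketch of cosine scoring over an f32 buffer. The function cosine_scores_f32_le is hypothetical and is not the crate's batch_cosine_scores; it also assumes the stored f32 values are little-endian, which the documentation above does not state.

```rust
/// Hypothetical scorer (not the crate's `batch_cosine_scores`).
/// `raw` holds whole vectors of `dim` f32 values each.
/// ASSUMPTION: values are stored little-endian.
pub fn cosine_scores_f32_le(query: &[f32], raw: &[u8], dim: usize) -> Vec<f32> {
    assert!(dim > 0, "dimension must be non-zero");
    assert_eq!(query.len(), dim, "query length must equal dim");
    assert_eq!(raw.len() % (dim * 4), 0, "buffer must hold whole vectors");

    // The query norm is shared by every vector in the batch.
    let q_norm = query.iter().map(|x| x * x).sum::<f32>().sqrt();

    raw.chunks_exact(dim * 4)
        .map(|vec_bytes| {
            let mut dot = 0.0f32;
            let mut v_norm_sq = 0.0f32;
            // Decode each 4-byte element and accumulate in one pass.
            for (q, b) in query.iter().zip(vec_bytes.chunks_exact(4)) {
                let v = f32::from_le_bytes([b[0], b[1], b[2], b[3]]);
                dot += q * v;
                v_norm_sq += v * v;
            }
            let denom = q_norm * v_norm_sq.sqrt();
            // Define the score of a zero vector as 0 to avoid dividing by zero.
            if denom == 0.0 { 0.0 } else { dot / denom }
        })
        .collect()
}
```

The i-th score in the result corresponds to flat index start_idx + i of the batch that produced the buffer.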
pub fn flat_indexes_for_doc(&self, doc_id: u32) -> (usize, &[(u32, u16)])
Find flat indexes for a given doc_id via binary search on sorted doc_ids.
doc_ids are sorted by (doc_id, ordinal) because the segment builder adds docs sequentially.
Returns (start_index, slice), where slice holds the matching (doc_id, ordinal) entries and start_index is the position of the first entry in self.doc_ids. The flat vector index of entry i in the slice is therefore start_index + i.
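The lookup described here can be sketched with two binary searches over the sorted index. The helper find_doc_range is a hypothetical stand-in written for illustration, not the crate's implementation; it only assumes the (doc_id, ordinal) sort order stated above.

```rust
/// Hypothetical helper (not the crate's implementation): locate all entries
/// for `doc_id` in a slice sorted by (doc_id, ordinal).
/// Returns (start_index, matching_slice); an empty slice means no match.
pub fn find_doc_range(doc_ids: &[(u32, u16)], doc_id: u32) -> (usize, &[(u32, u16)]) {
    // First position whose doc_id is >= the target.
    let start = doc_ids.partition_point(|&(id, _)| id < doc_id);
    // Count the consecutive entries from `start` that carry the target doc_id.
    let len = doc_ids[start..].partition_point(|&(id, _)| id == doc_id);
    (start, &doc_ids[start..start + len])
}
```

Both searches are O(log n), so a multi-vector document costs the same to locate as a single-vector one.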
pub fn get_doc_id(&self, idx: usize) -> (u32, u16)
Get doc_id and ordinal at index (from in-memory index).
pub fn vector_byte_size(&self) -> usize
Bytes per vector in storage.
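As a rough illustration of how this value relates to the element sizes listed in the type description (f32=4, f16=2, uint8=1 bytes per dimension), the sketch below uses a hypothetical Quant enum as a stand-in; the variant names of the real DenseVectorQuantization are not shown on this page and may differ.

```rust
/// Hypothetical stand-in for the crate's quantization type.
/// The real `DenseVectorQuantization` may use different variant names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Quant {
    F32,
    F16,
    U8,
}

impl Quant {
    /// Bytes used by one stored element (one dimension).
    pub fn element_size(self) -> usize {
        match self {
            Quant::F32 => 4,
            Quant::F16 => 2,
            Quant::U8 => 1,
        }
    }
}

/// Bytes per stored vector: dimension times element size.
pub fn vector_byte_size(dim: usize, q: Quant) -> usize {
    dim * q.element_size()
}
```

For a 768-dimensional vector this gives 3072 bytes at f32, 1536 at f16, and 768 at uint8.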
pub fn vector_bytes_len(&self) -> u64
Total byte length of raw vector data (for chunked merger streaming).
pub fn vectors_byte_offset(&self) -> u64
Byte offset where vector data starts (for direct handle access in merger).
pub fn handle(&self) -> &LazyFileSlice
Access the underlying lazy file handle (for chunked byte-range reads in merger).
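One way a merger could plan its chunked reads is to split the vector region into byte ranges aligned to whole vectors. The sketch below is an assumption about how such planning might look, not the crate's merger: chunk_ranges and its parameters are hypothetical, and the inputs would come from vectors_byte_offset(), vector_bytes_len(), and vector_byte_size(). The actual read call on LazyFileSlice is omitted because its signature is not shown here.

```rust
use std::ops::Range;

/// Hypothetical helper: split the region [offset, offset + total_len) into
/// byte ranges of at most `max_chunk_bytes`, aligned to whole vectors.
pub fn chunk_ranges(
    offset: u64,
    total_len: u64,
    vector_byte_size: u64,
    max_chunk_bytes: u64,
) -> Vec<Range<u64>> {
    assert!(vector_byte_size > 0, "vector size must be non-zero");
    assert!(max_chunk_bytes >= vector_byte_size, "chunk must fit one vector");

    // Round the chunk size down so no vector straddles two chunks.
    let chunk = (max_chunk_bytes / vector_byte_size) * vector_byte_size;
    let end = offset + total_len;

    let mut ranges = Vec::new();
    let mut pos = offset;
    while pos < end {
        let next = (pos + chunk).min(end);
        ranges.push(pos..next);
        pos = next;
    }
    ranges
}
```

Aligning chunks to vector boundaries keeps each read independently decodable, at the cost of slightly smaller chunks than the requested maximum.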
pub fn estimated_memory_bytes(&self) -> usize
Estimated memory usage (only doc_ids are in memory).
Trait Implementations§
impl Clone for LazyFlatVectorData
fn clone(&self) -> LazyFlatVectorData
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for LazyFlatVectorData
impl !RefUnwindSafe for LazyFlatVectorData
impl Send for LazyFlatVectorData
impl Sync for LazyFlatVectorData
impl Unpin for LazyFlatVectorData
impl !UnwindSafe for LazyFlatVectorData
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>
fn to_subset(&self) -> Option<SS>
Attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
Converts self to the equivalent element of its superset.