Struct lance_index::vector::pq::ProductQuantizerImpl
pub struct ProductQuantizerImpl<T: ArrowFloatType + Cosine + Dot + L2> {
pub num_bits: u32,
pub num_sub_vectors: usize,
pub dimension: usize,
pub metric_type: MetricType,
pub codebook: Arc<T::ArrayType>,
}
Product Quantization, optimized for [Apache Arrow] buffer memory layout.
Fields§
num_bits: u32
Number of bits for the centroids.
Only 8 is supported for now, so that each PQ code fits in a single u8 byte.
num_sub_vectors: usize
Number of sub-vectors.
dimension: usize
Vector dimension.
metric_type: MetricType
Distance type.
codebook: Arc<T::ArrayType>
PQ codebook.
It holds (2 ^ num_bits) * num_sub_vectors * sub_vector_length values of f32.
The layout is cache- and SIMD-friendly for computing centroids. How to make distance lookups by PQ code equally cache-friendly is still an open question.
Layout:
- row: all centroids for the same sub-vector.
- column: the centroid value of the n-th sub-vector.
// Centroids for a sub-vector.
Codebook[sub_vector_id][pq_code]
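The indexing scheme above can be sketched in plain Rust. This is a minimal illustration, not the crate's code: it assumes the codebook is flattened row-major as [sub_vector_id][pq_code][value] into a plain `&[f32]`, and that `sub_vector_length` equals `dimension / num_sub_vectors`. The function name `centroid` and its parameters are hypothetical.

```rust
/// Return the centroid selected by `pq_code` for sub-vector `sub_vector_id`.
///
/// Assumption: `codebook` is a flat, row-major buffer ordered as
/// [sub_vector_id][pq_code][value]. This helper is hypothetical and is
/// not part of the lance_index API.
fn centroid(
    codebook: &[f32],
    num_bits: u32,
    dimension: usize,
    num_sub_vectors: usize,
    sub_vector_id: usize,
    pq_code: u8,
) -> &[f32] {
    // Number of centroids per sub-vector: 2 ^ num_bits.
    let num_centroids = 1usize << num_bits;
    // Assumed length of each sub-vector.
    let sub_vector_length = dimension / num_sub_vectors;
    // Skip whole rows of earlier sub-vectors, then earlier codes in this row.
    let start = (sub_vector_id * num_centroids + pq_code as usize) * sub_vector_length;
    &codebook[start..start + sub_vector_length]
}
```

Under this assumed layout, all centroids of one sub-vector are contiguous, so scanning them during encoding touches one continuous memory range.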
Implementations§
impl<T: ArrowFloatType + Cosine + Dot + L2> ProductQuantizerImpl<T>
pub fn new( m: usize, nbits: u32, dimension: usize, codebook: Arc<T::ArrayType>, metric_type: MetricType ) -> Self
Create a ProductQuantizer with pre-trained codebook.
pub fn num_centroids(num_bits: u32) -> usize
pub fn codebook_length(num_bits: u32, num_sub_vectors: usize) -> usize
Calculate codebook length.
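As a rough sizing aid, the arithmetic from the codebook field description can be written out as below. This is a sketch under stated assumptions, not the crate's implementation: `num_centroids` here simply computes 2 ^ num_bits, and `codebook_values` is a hypothetical helper that counts the f32 values in the codebook, assuming `sub_vector_length = dimension / num_sub_vectors`. The exact return value of the real `codebook_length` is not documented on this page.

```rust
/// Number of centroids per sub-vector, assumed to be 2 ^ num_bits.
fn num_centroids(num_bits: u32) -> usize {
    1usize << num_bits
}

/// Hypothetical helper: total number of f32 values in the codebook,
/// following (2 ^ num_bits) * num_sub_vectors * sub_vector_length.
fn codebook_values(num_bits: u32, dimension: usize, num_sub_vectors: usize) -> usize {
    // Assumption: the dimension divides evenly into sub-vectors.
    let sub_vector_length = dimension / num_sub_vectors;
    num_centroids(num_bits) * num_sub_vectors * sub_vector_length
}
```

For example, with 8 bits, 128 dimensions, and 16 sub-vectors, each sub-vector has 8 values and the codebook holds 256 * 16 * 8 = 32768 floats.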
Trait Implementations§
impl<T: Debug + ArrowFloatType + Cosine + Dot + L2> Debug for ProductQuantizerImpl<T>
impl<T: ArrowFloatType + Cosine + Dot + L2 + 'static> ProductQuantizer for ProductQuantizerImpl<T>
fn as_any(&self) -> &dyn Any
fn transform<'life0, 'life1, 'async_trait>(
    &'life0 self,
    data: &'life1 dyn Array
) -> Pin<Box<dyn Future<Output = Result<ArrayRef>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,
Transform a vector column to PQ code column.
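Conceptually, turning a vector into PQ codes means picking, for each sub-vector, the index of its nearest centroid. The sketch below shows that general technique on plain slices. It is not the crate's implementation: it assumes squared L2 distance, a flat codebook ordered as [sub_vector_id][pq_code][value], and hypothetical helper names `l2_sq` and `encode`.

```rust
/// Squared L2 distance between two equal-length slices.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Encode one vector into `num_sub_vectors` PQ codes.
///
/// Hypothetical helper; assumes `num_bits <= 8` so each code fits in a u8,
/// and that `vector.len()` is a multiple of `num_sub_vectors`.
fn encode(vector: &[f32], codebook: &[f32], num_bits: u32, num_sub_vectors: usize) -> Vec<u8> {
    assert!(num_bits <= 8, "codes are stored as u8");
    let num_centroids = 1usize << num_bits;
    let sub_len = vector.len() / num_sub_vectors;
    (0..num_sub_vectors)
        .map(|m| {
            let sub = &vector[m * sub_len..(m + 1) * sub_len];
            // All centroids belonging to sub-vector `m`.
            let row = &codebook[m * num_centroids * sub_len..(m + 1) * num_centroids * sub_len];
            // Index of the nearest centroid becomes the PQ code.
            row.chunks_exact(sub_len)
                .enumerate()
                .min_by(|(_, a), (_, b)| l2_sq(sub, a).total_cmp(&l2_sq(sub, b)))
                .map(|(i, _)| i as u8)
                .unwrap()
        })
        .collect()
}
```

The real method operates on Arrow arrays and is async, so a whole column is encoded in one call rather than one vector at a time.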
fn build_distance_table( &self, query: &dyn Array, code: &UInt8Array ) -> Result<ArrayRef>
Build the distance lookup in f32.
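A common way to use PQ codes at query time is asymmetric distance computation: precompute the distance from each query sub-vector to every centroid, then approximate the distance to each stored vector by summing table lookups. The sketch below illustrates that general technique only. It assumes squared L2 distance and a flat [sub_vector_id][pq_code][value] codebook, and the names `build_table` and `distances` are hypothetical; the crate's own method may differ in what it computes and returns.

```rust
/// Build a lookup table where `table[m * num_centroids + c]` is the squared
/// L2 distance between query sub-vector `m` and centroid `c`.
/// Hypothetical helper, not the lance_index API.
fn build_table(query: &[f32], codebook: &[f32], num_bits: u32, num_sub_vectors: usize) -> Vec<f32> {
    let num_centroids = 1usize << num_bits;
    let sub_len = query.len() / num_sub_vectors;
    let mut table = Vec::with_capacity(num_sub_vectors * num_centroids);
    for (m, sub) in query.chunks_exact(sub_len).enumerate() {
        let base = m * num_centroids * sub_len;
        let row = &codebook[base..base + num_centroids * sub_len];
        for centroid in row.chunks_exact(sub_len) {
            let d = sub.iter().zip(centroid).map(|(q, c)| (q - c) * (q - c)).sum::<f32>();
            table.push(d);
        }
    }
    table
}

/// Approximate the distance to every encoded row by summing table lookups.
/// `codes` holds `num_sub_vectors` consecutive u8 codes per row.
fn distances(table: &[f32], codes: &[u8], num_bits: u32, num_sub_vectors: usize) -> Vec<f32> {
    let num_centroids = 1usize << num_bits;
    codes
        .chunks_exact(num_sub_vectors)
        .map(|row| {
            row.iter()
                .enumerate()
                .map(|(m, &c)| table[m * num_centroids + c as usize])
                .sum::<f32>()
        })
        .collect()
}
```

The table costs one pass over the codebook per query, after which each stored vector needs only `num_sub_vectors` lookups and additions.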
fn num_sub_vectors(&self) -> usize
Number of sub-vectors
fn dimension(&self) -> usize
fn codebook_as_fsl(&self) -> FixedSizeListArray
fn use_residual(&self) -> bool
Whether to use residual as input or not.
Auto Trait Implementations§
impl<T> RefUnwindSafe for ProductQuantizerImpl<T>
impl<T> Send for ProductQuantizerImpl<T>
impl<T> Sync for ProductQuantizerImpl<T>
impl<T> Unpin for ProductQuantizerImpl<T>
impl<T> UnwindSafe for ProductQuantizerImpl<T>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.