pub struct ResidualCodec { /* private fields */ }
Residual quantization codec for compressing token embeddings.
The codec learns centroids via k-means clustering, then quantizes the residuals (v - centroid) to a small number of bits per dimension.
§Compression Process
- Find nearest centroid for input vector
- Compute residual = vector - centroid
- Quantize each dimension to nbits using learned bucket boundaries
- Pack quantized values into bytes
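The four steps above can be sketched as a standalone function. This is a minimal illustration, not the crate's implementation: the names `nearest_centroid`, `bucketize`, and `compress_sketch` are hypothetical, and the sketch assumes squared L2 distance for centroid assignment, one set of sorted cutoffs shared by all dimensions, and codes packed low bits first. None of these details is confirmed by the documentation.

```rust
/// Index of the centroid with the smallest squared L2 distance to `v`.
/// `centroids` is flattened as [num_centroids × dim].
fn nearest_centroid(centroids: &[f32], dim: usize, v: &[f32]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, c) in centroids.chunks_exact(dim).enumerate() {
        let dist: f32 = c.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best
}

/// Bucket index of `x`: the number of cutoffs that are <= x.
/// With 2^nbits buckets there are 2^nbits - 1 sorted cutoffs.
fn bucketize(cutoffs: &[f32], x: f32) -> u8 {
    cutoffs.iter().filter(|&&c| x >= c).count() as u8
}

/// Compress `v` into (centroid_id, packed codes); `nbits` must be 2 or 4.
fn compress_sketch(
    centroids: &[f32],
    cutoffs: &[f32],
    dim: usize,
    nbits: u8,
    v: &[f32],
) -> (usize, Vec<u8>) {
    // Step 1: find the nearest centroid.
    let id = nearest_centroid(centroids, dim, v);
    let centroid = &centroids[id * dim..(id + 1) * dim];

    let codes_per_byte = 8 / nbits as usize;
    let mut packed = vec![0u8; (dim + codes_per_byte - 1) / codes_per_byte];
    for (d, (x, c)) in v.iter().zip(centroid).enumerate() {
        // Steps 2 and 3: residual for this dimension, then its bucket code.
        let code = bucketize(cutoffs, x - c);
        // Step 4: pack the code (assumed bit order: low bits first).
        let shift = (d % codes_per_byte) * nbits as usize;
        packed[d / codes_per_byte] |= code << shift;
    }
    (id, packed)
}
```

With centroids [0, 0] and [10, 10], cutoffs [-0.5, 0.0, 0.5], and nbits = 2, the vector [10.6, 10.2] is assigned to centroid 1, and its residuals of roughly 0.6 and 0.2 fall into buckets 3 and 2, which pack into a single byte.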
§Scoring
Score computation avoids full decompression:
q · v ≈ q · c + Σ_d q[d] × bucket_weight[d, code[d]]
Implementations§
impl ResidualCodec
pub fn train(
    embeddings: &[f32],
    dim: usize,
    num_centroids: usize,
    nbits: u8,
    iterations: usize,
) -> Result<Self>
Train a codec from sample embeddings.
§Arguments
- embeddings: Flattened sample embeddings [n × dim]
- dim: Embedding dimension
- num_centroids: Number of k-means centroids
- nbits: Bits per dimension (2 or 4)
- iterations: K-means iterations
§Errors
Returns an error if the training data is insufficient or the parameters are invalid.
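For orientation, the centroid-learning part of training can be sketched as plain Lloyd's k-means over the flattened [n × dim] input. This is an assumption-laden sketch, not the crate's `train`: the helper names `sq_dist` and `kmeans_sketch` are hypothetical, centroids are seeded from the first k samples, empty clusters keep their previous position, and the learning of bucket boundaries and weights is not covered at all.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Run `iterations` rounds of Lloyd's algorithm and return the centroids,
/// flattened as [k × dim]. Hypothetical helper, not the crate's `train`.
fn kmeans_sketch(embeddings: &[f32], dim: usize, k: usize, iterations: usize) -> Vec<f32> {
    assert!(dim > 0 && embeddings.len() % dim == 0, "input must be [n × dim]");
    assert!(k > 0 && embeddings.len() / dim >= k, "need at least k samples");

    // Assumption: seed the centroids with the first k samples.
    let mut centroids = embeddings[..k * dim].to_vec();

    for _ in 0..iterations {
        let mut sums = vec![0.0f32; k * dim];
        let mut counts = vec![0usize; k];

        // Assignment step: add each sample to its nearest centroid's running sum.
        for v in embeddings.chunks_exact(dim) {
            let mut best = 0;
            for c in 1..k {
                if sq_dist(&centroids[c * dim..(c + 1) * dim], v)
                    < sq_dist(&centroids[best * dim..(best + 1) * dim], v)
                {
                    best = c;
                }
            }
            counts[best] += 1;
            for (s, x) in sums[best * dim..(best + 1) * dim].iter_mut().zip(v) {
                *s += x;
            }
        }

        // Update step: move each non-empty centroid to the mean of its samples.
        for c in 0..k {
            if counts[c] > 0 {
                for d in 0..dim {
                    centroids[c * dim + d] = sums[c * dim + d] / counts[c] as f32;
                }
            }
        }
    }
    centroids
}
```

The two assertions at the top mirror the kind of checks that could trigger the error described above (too few samples, or a length that is not a multiple of dim); the real validation rules are not documented here.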
pub fn with_params(
    centroids: Vec<f32>,
    num_centroids: usize,
    dim: usize,
    bucket_cutoffs: Vec<f32>,
    bucket_weights: Vec<f32>,
    nbits: u8,
) -> Self
Create a codec with pre-trained parameters.
§Panics
Panics if dim == 0 (poka-yoke: division-by-zero guard).
pub fn num_centroids(&self) -> usize
Get the number of centroids.
pub fn packed_size(&self) -> usize
Get the packed residual size in bytes.
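As a rough guide to what this value is likely to be, the arithmetic below assumes the packed residual stores exactly nbits per dimension and rounds up to whole bytes. The function name `packed_size_sketch` is hypothetical, and the crate may compute the size differently (for example, by requiring dim to be a multiple of the codes per byte).

```rust
/// Bytes needed to store `dim` codes of `nbits` bits each, rounded up.
/// Assumption: no per-vector header or padding beyond the final byte.
fn packed_size_sketch(dim: usize, nbits: u8) -> usize {
    (dim * nbits as usize + 7) / 8
}
```

Under this assumption, a 128-dimensional embedding takes 32 bytes at 2 bits per dimension and 64 bytes at 4 bits, compared with 512 bytes as raw f32 values.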
pub fn find_nearest_centroid(&self, embedding: &[f32]) -> usize
Find the nearest centroid for a vector.
pub fn compress(&self, embedding: &[f32]) -> (usize, Vec<u8>)
Compress an embedding to (centroid_id, packed_residual).
pub fn decompress_score(
    &self,
    query_token: &[f32],
    centroid_id: usize,
    centroid_score: f32,
    packed_residual: &[u8],
) -> f32
Compute score between query token and compressed document token.
score ≈ q · d = q · c + q · r
§Arguments
- query_token: Query embedding
- centroid_id: Assigned centroid
- centroid_score: Precomputed q · c
- packed_residual: Packed quantized residual
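The decomposition q · d = q · c + q · r can be evaluated directly on the packed codes, without rebuilding the residual vector. The sketch below is illustrative only: `score_sketch` is a hypothetical name, the weights are assumed to be flattened as [dim × 2^nbits] to match the bucket_weight[d, code[d]] indexing in the scoring formula, and codes are assumed to be packed low bits first. If the crate shares one weight table across all dimensions, the lookup would depend on the code alone.

```rust
/// Approximate q · d as centroid_score + Σ_d q[d] × weight[d, code[d]].
/// `bucket_weights` is assumed to be flattened as [dim × 2^nbits].
fn score_sketch(
    query: &[f32],
    centroid_score: f32,
    packed: &[u8],
    bucket_weights: &[f32],
    nbits: u8,
) -> f32 {
    let codes_per_byte = 8 / nbits as usize;
    let num_buckets = 1usize << nbits;
    let mask = (1u8 << nbits) - 1;

    let mut score = centroid_score;
    for (d, q) in query.iter().enumerate() {
        // Extract the code for dimension d (assumed bit order: low bits first).
        let shift = (d % codes_per_byte) * nbits as usize;
        let code = (packed[d / codes_per_byte] >> shift) & mask;
        // Add this dimension's share of q · r via a table lookup.
        score += q * bucket_weights[d * num_buckets + code as usize];
    }
    score
}
```

Because q · c depends only on the query token and the centroid, it can be computed once per centroid and reused for every document token assigned to that centroid, which is presumably why centroid_score is passed in as an argument.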
pub fn centroid_score(&self, query_token: &[f32], centroid_id: usize) -> f32
Compute dot product between query and centroid.
Trait Implementations§
impl Clone for ResidualCodec
fn clone(&self) -> ResidualCodec
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ResidualCodec
impl<'de> Deserialize<'de> for ResidualCodec
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations§
impl Freeze for ResidualCodec
impl RefUnwindSafe for ResidualCodec
impl Send for ResidualCodec
impl Sync for ResidualCodec
impl Unpin for ResidualCodec
impl UnsafeUnpin for ResidualCodec
impl UnwindSafe for ResidualCodec
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.