
Struct PqCodec 

pub struct PqCodec {
    pub dim: usize,
    pub m: usize,
    pub k: usize,
    pub sub_dim: usize,
    /* private fields */
}

Product quantization (PQ) codec with trained codebooks.

Fields§

§dim: usize

Original vector dimensionality.

§m: usize

Number of subvectors (subspaces).

§k: usize

Centroids per subvector (fixed at 256 for u8 encoding).

§sub_dim: usize

Dimensions per subvector: dim / m.

Implementations§

Source§

impl PqCodec

pub fn with_governor(self, governor: Arc<MemoryGovernor>) -> Self

Attach a memory governor to this codec.

Once set, heap-significant operations (train, encode_batch, build_distance_table, decode, to_bytes) will charge the EngineId::Vector budget before allocating and release the reservation when the returned value is dropped (RAII). When no governor is set those operations proceed unconditionally, preserving backward compatibility with callers that do not use the memory governor.

The governor is a runtime concern only — it is not serialized.
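The charge-then-release behaviour described above can be illustrated with a minimal RAII sketch. `Budget`, `Reservation`, and `reserve` are hypothetical stand-ins written for this example; they are not the crate's `MemoryGovernor` API, whose actual methods are not shown on this page.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a memory governor: a byte limit plus a usage counter.
struct Budget {
    limit: usize,
    used: AtomicUsize,
}

/// RAII reservation: gives its bytes back to the budget when dropped.
struct Reservation {
    budget: Arc<Budget>,
    bytes: usize,
}

impl Budget {
    fn new(limit: usize) -> Arc<Budget> {
        Arc::new(Budget { limit, used: AtomicUsize::new(0) })
    }

    fn used(&self) -> usize {
        self.used.load(Ordering::Acquire)
    }
}

/// Try to reserve `bytes`; leaves the counter untouched if the limit would be exceeded.
fn reserve(budget: &Arc<Budget>, bytes: usize) -> Result<Reservation, String> {
    let mut current = budget.used.load(Ordering::Relaxed);
    loop {
        let next = match current.checked_add(bytes) {
            Some(n) if n <= budget.limit => n,
            _ => return Err(format!("budget exhausted: {} bytes requested", bytes)),
        };
        // Retry if another thread changed the counter in the meantime.
        match budget.used.compare_exchange_weak(current, next, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => return Ok(Reservation { budget: Arc::clone(budget), bytes }),
            Err(observed) => current = observed,
        }
    }
}

impl Drop for Reservation {
    fn drop(&mut self) {
        self.budget.used.fetch_sub(self.bytes, Ordering::AcqRel);
    }
}
```

In this sketch a failed reservation has no side effects, and dropping a `Reservation` is the only way bytes return to the budget, which mirrors the "charge before allocating, release on drop" pattern.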

pub fn train( vectors: &[&[f32]], dim: usize, m: usize, k: usize, max_iter: usize, ) -> Self

Train PQ codebooks from a set of training vectors via k-means.

m = number of subvectors (must divide dim evenly).
k = centroids per subvector (typically 256).
max_iter = k-means iterations (20 is usually sufficient).
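As a rough illustration of per-subspace k-means training, here is a minimal sketch. It is not the crate's implementation: the function names, the `codebooks[sub][centroid]` layout, the squared-L2 metric, and the initialisation from the first `k` points are all assumptions made for this example.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `v` (first one wins on ties).
fn nearest(centroids: &[Vec<f32>], v: &[f32]) -> usize {
    let mut best = 0;
    let mut best_d = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = sq_l2(c, v);
        if d < best_d {
            best = i;
            best_d = d;
        }
    }
    best
}

/// Lloyd's k-means on one subspace. Assumes `points` is non-empty and
/// seeds the centroids with the first `k` points (a simplification).
fn kmeans(points: &[&[f32]], k: usize, max_iter: usize) -> Vec<Vec<f32>> {
    let sub_dim = points[0].len();
    let mut centroids: Vec<Vec<f32>> = points.iter().take(k).map(|p| p.to_vec()).collect();
    for _ in 0..max_iter {
        let mut sums = vec![vec![0.0f32; sub_dim]; centroids.len()];
        let mut counts = vec![0usize; centroids.len()];
        // Assignment step: accumulate each point into its nearest centroid.
        for &p in points {
            let c = nearest(&centroids, p);
            counts[c] += 1;
            for (s, x) in sums[c].iter_mut().zip(p.iter()) {
                *s += *x;
            }
        }
        // Update step: move each non-empty centroid to the mean of its points.
        for (c, centroid) in centroids.iter_mut().enumerate() {
            if counts[c] > 0 {
                for (dst, s) in centroid.iter_mut().zip(&sums[c]) {
                    *dst = *s / counts[c] as f32;
                }
            }
        }
    }
    centroids
}

/// Train one codebook per subspace; each centroid has length dim / m.
fn train_codebooks(
    vectors: &[&[f32]],
    dim: usize,
    m: usize,
    k: usize,
    max_iter: usize,
) -> Vec<Vec<Vec<f32>>> {
    assert!(m > 0 && dim % m == 0, "m must divide dim evenly");
    let sub_dim = dim / m;
    (0..m)
        .map(|sub| {
            let slices: Vec<&[f32]> = vectors
                .iter()
                .map(|v| &v[sub * sub_dim..(sub + 1) * sub_dim])
                .collect();
            kmeans(&slices, k, max_iter)
        })
        .collect()
}
```

Each subspace is clustered independently, which is why `m` has to divide `dim` evenly: every vector is cut into `m` equal slices of `dim / m` values.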

pub fn encode(&self, vector: &[f32]) -> Vec<u8>

Encode a vector: for each subvector, find the nearest centroid index.

This is a per-vector hot-path operation. Governor charging is intentionally skipped here to avoid atomic overhead on every candidate during search; use encode_batch for bulk encoding with budget enforcement.
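The nearest-centroid encoding step can be sketched as follows. This is an illustration only: the `codebooks[sub][centroid]` layout and the squared-L2 metric are assumptions for this example, since the codec's codebook storage is private.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Encode `vector` as one centroid index (one byte) per subspace.
/// `codebooks[sub][centroid]` is an assumed layout, not the crate's.
fn encode(codebooks: &[Vec<Vec<f32>>], vector: &[f32]) -> Vec<u8> {
    let m = codebooks.len();
    assert!(m > 0 && vector.len() % m == 0, "m must divide the vector length");
    let sub_dim = vector.len() / m;
    vector
        .chunks_exact(sub_dim)
        .zip(codebooks)
        .map(|(sub, book)| {
            assert!(book.len() <= 256, "a u8 code can address at most 256 centroids");
            let mut best = 0usize;
            let mut best_d = f32::INFINITY;
            for (i, c) in book.iter().enumerate() {
                let d = sq_l2(sub, c);
                if d < best_d {
                    best = i;
                    best_d = d;
                }
            }
            best as u8
        })
        .collect()
}
```

The output has exactly `m` bytes regardless of `dim`, which is where the compression comes from.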

pub fn encode_batch(&self, vectors: &[&[f32]]) -> Result<Vec<u8>, VectorError>

Batch encode all vectors into a contiguous byte array.

Charges m * vectors.len() bytes to the governor budget (if set) before allocating the output buffer. The guard is released at the end of this call — the buffer itself remains alive.
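A minimal sketch of packing codes into one contiguous buffer is shown below. It assumes a row-major layout in which vector `i` occupies bytes `i * m .. (i + 1) * m`; the page does not state the layout, so treat that as an assumption. `encode_one` and `code_at` are hypothetical helpers for this example.

```rust
/// Pack per-vector codes into one contiguous buffer of m * vectors.len() bytes.
/// `encode_one` is a hypothetical stand-in for a single-vector encoder.
fn encode_batch<F>(m: usize, vectors: &[&[f32]], encode_one: F) -> Vec<u8>
where
    F: Fn(&[f32]) -> Vec<u8>,
{
    // One allocation up front: this is the size a budget would be charged for.
    let mut out = Vec::with_capacity(m * vectors.len());
    for &v in vectors {
        let code = encode_one(v);
        assert_eq!(code.len(), m, "each code must be exactly m bytes");
        out.extend_from_slice(&code);
    }
    out
}

/// Borrow the code of vector `i` from a buffer laid out as assumed above.
fn code_at(codes: &[u8], m: usize, i: usize) -> &[u8] {
    &codes[i * m..(i + 1) * m]
}
```

With a fixed stride of `m` bytes, individual codes can be borrowed from the buffer without any per-vector allocation.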

pub fn build_distance_table( &self, query: &[f32], ) -> Result<Vec<Vec<f32>>, VectorError>

Build an asymmetric distance table for a query vector.

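Returns a table in which table[sub][centroid] is the distance from the query’s sub-th subvector to that centroid. Precomputing this table makes distance evaluation O(M) per candidate instead of O(D), where M is the number of subvectors (m) and D is the vector dimensionality (dim).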

Charges m * k * size_of::<f32>() bytes to the governor (if set) before allocating the table.
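The table construction and its memory cost can be sketched like this. The `codebooks[sub][centroid]` layout and the squared-L2 metric are assumptions for this example; the page does not say which distance the codec uses.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// table[sub][centroid] = distance from the query's sub-th slice to that centroid.
fn build_distance_table(codebooks: &[Vec<Vec<f32>>], query: &[f32]) -> Vec<Vec<f32>> {
    let m = codebooks.len();
    assert!(m > 0 && query.len() % m == 0, "m must divide the query length");
    let sub_dim = query.len() / m;
    query
        .chunks_exact(sub_dim)
        .zip(codebooks)
        .map(|(q_sub, book)| book.iter().map(|c| sq_l2(q_sub, c)).collect::<Vec<f32>>())
        .collect()
}

/// Payload size of an m x k table of f32 values, matching the charge formula above.
fn table_bytes(m: usize, k: usize) -> usize {
    m * k * std::mem::size_of::<f32>()
}
```

For example, `m = 2` and `k = 256` gives a payload of 2 * 256 * 4 = 2048 bytes, independent of `dim`.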

pub fn asymmetric_distance(&self, table: &[Vec<f32>], code: &[u8]) -> f32

Compute asymmetric distance using a precomputed distance table.

O(M) per candidate — just M table lookups and additions.
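The lookup-and-add loop is short enough to sketch in full. This standalone function mirrors the documented signature but is written for illustration only and is not taken from the crate's source.

```rust
/// Sum one table entry per subspace: M lookups and M additions.
fn asymmetric_distance(table: &[Vec<f32>], code: &[u8]) -> f32 {
    table
        .iter()
        .zip(code)
        .map(|(row, &c)| row[c as usize])
        .sum()
}
```

No centroid or query data is touched here; all per-dimension work was done once when the table was built.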

pub fn decode(&self, code: &[u8]) -> Result<Vec<f32>, VectorError>

Decode a PQ code back to an approximate FP32 vector.

Charges dim * size_of::<f32>() bytes to the governor (if set) before allocating the output buffer.
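Decoding amounts to concatenating the selected centroid from each subspace. The sketch below assumes a `codebooks[sub][centroid]` layout, which is a choice made for this example rather than the crate's private representation.

```rust
/// Rebuild an approximate vector by concatenating one centroid per subspace.
fn decode(codebooks: &[Vec<Vec<f32>>], code: &[u8]) -> Vec<f32> {
    assert_eq!(codebooks.len(), code.len(), "code must hold one byte per subspace");
    let mut out = Vec::new();
    for (book, &c) in codebooks.iter().zip(code) {
        out.extend_from_slice(&book[c as usize]);
    }
    out
}
```

The result has `dim` values, so the output buffer occupies `dim * size_of::<f32>()` bytes, which is the amount charged above.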

pub fn to_bytes(&self) -> Result<Vec<u8>, VectorError>

Serialize the codec to bytes with a versioned magic header.

Format: [NDPQ\0\0 (6 bytes)][version: u8 = 1][msgpack payload]

Charges the estimated serialized size to the governor (if set) before allocating the output buffer. The estimate is conservative: m * k * sub_dim * size_of::<f32>() + 64 (header + framing overhead).
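The framing and the size estimate can be sketched as follows. The payload is treated as opaque bytes because the msgpack encoding itself is not reproduced here; `frame` and `estimated_size` are hypothetical helper names for this example.

```rust
/// Magic prefix and version byte, as given in the format line above.
const MAGIC: &[u8] = b"NDPQ\0\0";
const VERSION: u8 = 1;

/// Prepend the 6-byte magic and the version byte to an already encoded payload.
fn frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(MAGIC.len() + 1 + payload.len());
    out.extend_from_slice(MAGIC);
    out.push(VERSION);
    out.extend_from_slice(payload);
    out
}

/// Conservative size estimate from the formula above: codebook floats plus 64 bytes.
fn estimated_size(m: usize, k: usize, sub_dim: usize) -> usize {
    m * k * sub_dim * std::mem::size_of::<f32>() + 64
}
```

For instance, `m = 8`, `k = 256`, `sub_dim = 16` gives 8 * 256 * 16 * 4 + 64 = 131136 bytes.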

pub fn from_bytes(bytes: &[u8]) -> Result<Self, VectorError>

Deserialize the codec from bytes produced by Self::to_bytes.

Returns VectorError::InvalidMagic if the header does not match NDPQ\0\0, and VectorError::UnsupportedVersion for unknown versions.
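Header validation along these lines might look like the sketch below. `HeaderError` and `split_header` are hypothetical names used for illustration; in particular, the `Truncated` variant for input that ends before the version byte is an assumption, since the page does not say how the real `VectorError` reports that case.

```rust
/// Hypothetical error type for this sketch; not the crate's VectorError.
#[derive(Debug, PartialEq)]
enum HeaderError {
    InvalidMagic,
    UnsupportedVersion(u8),
    Truncated,
}

const MAGIC: &[u8] = b"NDPQ\0\0";
const VERSION: u8 = 1;

/// Check the magic and version, then return the remaining payload bytes.
fn split_header(bytes: &[u8]) -> Result<&[u8], HeaderError> {
    if !bytes.starts_with(MAGIC) {
        return Err(HeaderError::InvalidMagic);
    }
    match bytes.get(MAGIC.len()) {
        Some(&v) if v == VERSION => Ok(&bytes[MAGIC.len() + 1..]),
        Some(&v) => Err(HeaderError::UnsupportedVersion(v)),
        None => Err(HeaderError::Truncated),
    }
}
```

Checking the magic before the version means unrelated files are rejected as such, rather than being reported as an unknown codec version.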

Trait Implementations§

Source§

impl Clone for PqCodec

Source§

fn clone(&self) -> PqCodec

Returns a duplicate of the value.
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for PqCodec

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl<'de> Deserialize<'de> for PqCodec

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.
Source§

impl<'__msgpack_de> FromMessagePack<'__msgpack_de> for PqCodec

Source§

fn read<R: Read<'__msgpack_de>>(reader: &mut R) -> Result<Self, Error>
where Self: Sized,

Reads the MessagePack representation of this value from the provided reader.
Source§

impl Serialize for PqCodec

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.
Source§

impl ToMessagePack for PqCodec

Source§

fn write<W: Write>(&self, writer: &mut W) -> Result<(), Error>

Writes the MessagePack representation of this value into the provided writer.
Source§

impl VectorCodec for PqCodec

Source§

type Query = PqQuery

Prepared query: precomputed distance table + original FP32 query.

Source§

fn encode(&self, v: &[f32]) -> Self::Quantized

Encode an FP32 vector: one centroid index byte per subspace.

§Panics

UnifiedQuantizedVector::new fails only when the outlier bitmask does not match the provided outlier slice. With outlier_bitmask = 0 and an empty slice this can never happen. The expect is therefore unreachable in practice.

Source§

fn prepare_query(&self, q: &[f32]) -> Self::Query

Prepare the query by precomputing the M×K asymmetric distance table.

The VectorCodec trait does not propagate errors. A PqCodec used via this trait path is created by PqCodec::train, which sets no governor, so build_distance_table always returns Ok here. If a governor is attached and its budget is exhausted, the caller that constructed the codec is responsible for handling the error; this impl panics with a descriptive message so that the budget violation is never silently ignored.

Source§

fn adc_lut(&self, q: &Self::Query) -> Option<AdcLut>

Build the AdcLut from the precomputed distance table for use by SIMD rerank kernels (pshufb / vpermb).

Source§

fn fast_symmetric_distance( &self, q: &Self::Quantized, v: &Self::Quantized, ) -> f32

Symmetric distance between two PQ-encoded vectors.

Both codes are decoded to approximate FP32 vectors via the codebook, then the squared L2 difference is accumulated. This is the correct definition of symmetric PQ distance: each vector is approximated by its nearest centroids, and the distance is computed in FP32.
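The symmetric distance described above can be sketched by comparing the two selected centroids in each subspace, which is equivalent to decoding both codes and taking the squared L2 distance between the reconstructions. The `codebooks[sub][centroid]` layout is an assumption for this example, and plain byte slices stand in for the `PqQuantized` type.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Squared L2 distance between the reconstructions of two PQ codes,
/// accumulated one subspace at a time.
fn symmetric_distance(codebooks: &[Vec<Vec<f32>>], a: &[u8], b: &[u8]) -> f32 {
    codebooks
        .iter()
        .zip(a.iter().zip(b))
        .map(|(book, (&ca, &cb))| sq_l2(&book[ca as usize], &book[cb as usize]))
        .sum()
}
```

Unlike the asymmetric path, both sides carry quantization error here, because neither vector is available in its original FP32 form.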

Source§

fn exact_asymmetric_distance(&self, q: &Self::Query, v: &Self::Quantized) -> f32

Asymmetric ADC distance: precomputed distance table vs stored code.

O(M) per candidate — delegates to PqCodec::asymmetric_distance.

Source§

type Quantized = PqQuantized

The packed quantized form. Must be convertible to a UnifiedQuantizedVector reference via AsRef.
Source§

fn train(&mut self, samples: &[&[f32]]) -> Result<(), CodecError>

Optional: fit the codec’s learned parameters on a set of training vectors.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> ArchivePointee for T

Source§

type ArchivedMetadata = ()

The archived version of the pointer metadata for this type.
Source§

fn pointer_metadata( _: &<T as ArchivePointee>::ArchivedMetadata, ) -> <T as Pointee>::Metadata

Converts some archived metadata to the pointer metadata for itself.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> LayoutRaw for T

Source§

fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>

Returns the layout of the type.
Source§

impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
where T: SharedNiching<N1, N2>, N1: Niching<T>, N2: Niching<T>,

Source§

unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool

Returns whether the given value has been niched.
Source§

fn resolve_niched(out: Place<NichedOption<T, N1>>)

Writes data to out indicating that a T is niched.
Source§

impl<T> Pointee for T

Source§

type Metadata = ()

The metadata type for pointers and references to this type.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

Source§

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
Source§

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).
Source§

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.
Source§

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

Source§

impl<T> FromMessagePackOwned for T
where T: for<'a> FromMessagePack<'a>,