pub struct PQConfig {
    pub dimension: usize,
    pub num_segments: usize,
    pub num_centroids: usize,
    pub distance_metric: DistanceMetric,
    pub training_iterations: usize,
    pub seed: Option<u64>,
}
Configuration for Product Quantization.
§Parameters
- dimension: The dimension of vectors to quantize.
- num_segments: Number of subspaces (M). Must divide dimension evenly.
- num_centroids: Number of centroids per subspace (K). Typically 256 (8 bits per code).
- distance_metric: Distance metric for codebook training and distance computation.
§Memory Usage
- Codebooks: M × K × (D/M) × 4 bytes
- Per-vector codes: M × ceil(log2(K)/8) bytes
For typical settings (D=128, M=8, K=256):
- Codebooks: 8 × 256 × 16 × 4 = 131,072 bytes (128 KB)
- Per-vector: 8 bytes, versus 128 × 4 = 512 bytes for the raw f32 vector (compression ratio: 64x)
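The two formulas above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the crate: the function names `codebook_bytes` and `code_bytes_per_vector` are hypothetical, and it assumes centroids are stored as f32 and that K is at least 1.

```rust
/// Codebook size in bytes: M × K × (D/M) × 4.
/// Hypothetical helper; assumes f32 centroids and that `m` divides `d`.
#[allow(dead_code)]
fn codebook_bytes(d: usize, m: usize, k: usize) -> usize {
    m * k * (d / m) * std::mem::size_of::<f32>()
}

/// Bytes per encoded vector: M × ceil(log2(K) / 8).
/// Hypothetical helper; assumes `k >= 1`.
#[allow(dead_code)]
fn code_bytes_per_vector(m: usize, k: usize) -> usize {
    // ceil(log2(k)) computed with integer arithmetic to avoid float rounding.
    let bits = (usize::BITS - (k - 1).leading_zeros()) as usize;
    // Round the bit count up to whole bytes, then multiply by the segment count.
    m * ((bits + 7) / 8)
}
```

With D=128, M=8, K=256, `codebook_bytes(128, 8, 256)` gives 131,072 and `code_bytes_per_vector(8, 256)` gives 8, matching the figures above.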
Fields§
dimension: usize
Dimension of input vectors.
num_segments: usize
Number of subspaces (segments).
num_centroids: usize
Number of centroids per subspace.
distance_metric: DistanceMetric
Distance metric for training and search.
training_iterations: usize
Number of training iterations for k-means.
seed: Option<u64>
Random seed for reproducible training.
Implementations§
impl PQConfig
pub fn new(dimension: usize, num_segments: usize) -> Self
Create a new PQ configuration.
§Arguments
- dimension: The dimension of vectors to quantize.
- num_segments: Number of subspaces. Must divide dimension evenly.
§Defaults
- num_centroids: 256 (8-bit codes)
- distance_metric: Euclidean
- training_iterations: 25
- seed: None (non-deterministic)
§Panics
Panics if num_segments is 0 or doesn’t divide dimension evenly.
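The constructor contract above (defaults plus the panic conditions) might look roughly like the following. This is a self-contained stand-in that mirrors the documented fields and signatures so it compiles on its own; it is not the crate's actual source. The `DistanceMetric` enum here lists only the one variant named in the defaults, and the panic messages are made up.

```rust
// Stand-in types mirroring the documented API; the real definitions may differ.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DistanceMetric {
    Euclidean, // the only variant named in the docs above
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct PQConfig {
    dimension: usize,
    num_segments: usize,
    num_centroids: usize,
    distance_metric: DistanceMetric,
    training_iterations: usize,
    seed: Option<u64>,
}

#[allow(dead_code)]
impl PQConfig {
    fn new(dimension: usize, num_segments: usize) -> Self {
        // Documented panics: zero segments, or segments not dividing the dimension.
        assert!(num_segments != 0, "num_segments must be non-zero");
        assert!(
            dimension % num_segments == 0,
            "num_segments must divide dimension evenly"
        );
        Self {
            dimension,
            num_segments,
            num_centroids: 256, // 8-bit codes
            distance_metric: DistanceMetric::Euclidean,
            training_iterations: 25,
            seed: None, // non-deterministic
        }
    }

    // Builder-style setter: consumes the config and returns the updated one.
    const fn with_num_centroids(mut self, k: usize) -> Self {
        self.num_centroids = k;
        self
    }

    const fn subspace_dimension(&self) -> usize {
        self.dimension / self.num_segments
    }
}
```

Under these assumptions, `PQConfig::new(128, 8).with_num_centroids(65536)` yields a config with 16-dimensional subspaces, while `PQConfig::new(128, 0)` and `PQConfig::new(100, 8)` both panic.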
pub const fn with_num_centroids(self, k: usize) -> Self
Set the number of centroids per subspace.
Common values:
- 256 (8-bit codes, default)
- 65536 (16-bit codes, more accurate but larger codebooks)
pub const fn with_distance_metric(self, metric: DistanceMetric) -> Self
Set the distance metric.
pub const fn with_training_iterations(self, iterations: usize) -> Self
Set the number of training iterations.
pub const fn subspace_dimension(&self) -> usize
Get the dimension of each subspace.
pub fn bits_per_code(&self) -> usize
Calculate the number of bits per code.
Returns the number of bits needed to represent a centroid index.
pub fn bytes_per_code(&self) -> usize
Calculate bytes per encoded vector.
pub fn validate(&self) -> Result<(), VectorError>
Validate the configuration.
§Errors
Returns an error if:
- dimension is 0
- num_segments is 0 or doesn’t divide dimension
- num_centroids is 0
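The error conditions listed above can be sketched as a standalone check. The `ConfigError` enum below is a hypothetical stand-in for the crate's `VectorError`, whose variants are not shown in this page, and the free function takes the three fields directly rather than `&self`.

```rust
/// Hypothetical stand-in for the crate's `VectorError`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum ConfigError {
    ZeroDimension,
    ZeroSegments,
    SegmentsDoNotDivideDimension,
    ZeroCentroids,
}

/// Checks the three documented conditions, in the order they are listed.
#[allow(dead_code)]
fn validate(
    dimension: usize,
    num_segments: usize,
    num_centroids: usize,
) -> Result<(), ConfigError> {
    if dimension == 0 {
        return Err(ConfigError::ZeroDimension);
    }
    // Test for zero before taking the remainder to avoid a division-by-zero panic.
    if num_segments == 0 {
        return Err(ConfigError::ZeroSegments);
    }
    if dimension % num_segments != 0 {
        return Err(ConfigError::SegmentsDoNotDivideDimension);
    }
    if num_centroids == 0 {
        return Err(ConfigError::ZeroCentroids);
    }
    Ok(())
}
```

Unlike `new`, which panics on a bad segment count, a check of this shape reports the problem as a value, which is useful when the public fields have been modified after construction.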
pub fn compression_ratio(&self) -> f32
Calculate compression ratio compared to full f32 vectors.
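As a rough illustration of the ratio, the sketch below divides the size of a raw f32 vector by the size of its encoded form. It is an assumption that the crate computes the ratio this way (in particular, that codebook storage is excluded); the free function and its `bytes_per_vector` parameter are hypothetical.

```rust
/// Ratio of raw f32 storage to encoded storage for one vector.
/// Hypothetical helper; assumes `bytes_per_vector` is non-zero and
/// that codebook memory is not counted.
#[allow(dead_code)]
fn compression_ratio(dimension: usize, bytes_per_vector: usize) -> f32 {
    let raw_bytes = dimension * std::mem::size_of::<f32>();
    raw_bytes as f32 / bytes_per_vector as f32
}
```

For the typical settings in the Memory Usage section (D=128, 8 bytes per encoded vector), this gives 512 / 8 = 64.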