# iqdb-quantize v0.3.0 — Product Quantization
**The high-compression scheme lands.** v0.3.0 adds product quantization (PQ): each vector splits into `M` subvectors, each subvector gets a learned `K`-centroid codebook, and asymmetric distance computation (ADC) scores a stored code by table lookup instead of reconstruction. At `M = 16, K = 256` a 768-dim `f32` vector compresses from 3072 bytes to 16. This is the primitive `iqdb-ivf` builds IVF-PQ on.
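The storage arithmetic behind that figure can be checked with a few lines of standalone Rust. This is a minimal sketch rather than crate code: the function names are hypothetical, and `codebook_bytes` assumes centroids are stored as `f32`.

```rust
/// Bytes needed to store one raw `f32` vector with `dim` components.
fn raw_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Bytes needed for one PQ code: one centroid index per subvector.
/// Assumes K <= 256, so every index fits in a single `u8`.
fn pq_code_bytes(m: usize) -> usize {
    m * std::mem::size_of::<u8>()
}

/// Ratio of raw storage to PQ code storage for a single vector.
fn compression_ratio(dim: usize, m: usize) -> f64 {
    raw_bytes(dim) as f64 / pq_code_bytes(m) as f64
}

/// One-off cost of the codebooks, shared by every stored vector:
/// M codebooks x K centroids x (dim / M) components = K * dim values.
/// Assumption: centroids are kept as `f32`.
fn codebook_bytes(dim: usize, k: usize) -> usize {
    k * dim * std::mem::size_of::<f32>()
}
```

For `dim = 768, M = 16` this gives 3072 raw bytes, a 16-byte code, and a 192× per-vector ratio. Under the `f32` assumption, the `K = 256` codebooks add a fixed 786,432 bytes that is amortized over the whole collection.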
## What is iqdb-quantize?
The memory-efficiency layer of the iQDB vector database: three quantization schemes behind one `Quantizer` trait. Scalar (SQ8) arrived in 0.2.0, product quantization (PQ) joins it here, and binary quantization (BQ) is planned for 0.4.0.
## What's new in 0.3.0
### `ProductQuantizer`
Build the standard `M = 8, K = 256` shape with `new()`, or pick the geometry and training seed explicitly with `with_config`. Training learns one `K`-centroid codebook per subvector via hand-rolled k-means: k-means++ seeding, Lloyd's iterations (`MAX_ITERS = 25`, relative shift tolerance `1e-4`), `f64` accumulators downcast on commit, and deterministic empty-cluster recovery.
```rust
use iqdb_quantize::{ProductQuantizer, Quantizer};
use iqdb_types::DistanceMetric;

// M = 2 subvectors, K = 4 centroids per codebook, seed = 7.
let mut pq = ProductQuantizer::with_config(2, 4, 7);

// 16 four-dimensional training vectors: [i, i + 1, i + 2, i + 3].
let training: Vec<Vec<f32>> = (0..16)
    .map(|i| {
        let f = i as f32;
        vec![f, f + 1.0, f + 2.0, f + 3.0]
    })
    .collect();
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();
pq.train(&refs).expect("dim divisible by M, K <= 256");

// The code holds one u8 centroid index per subvector.
let code = pq.quantize(&[1.0_f32, 2.0, 3.0, 4.0]).expect("quantize");
assert_eq!(code.n_subvectors(), 2);

// ADC distance between the raw query and the stored code.
let d = pq
    .distance(&[1.0_f32, 2.0, 3.0, 4.0], &code, DistanceMetric::Euclidean)
    .expect("supported metric");
assert!(d.is_finite());
```
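The training loop described above can be sketched in plain Rust for one subspace. This is an illustrative sketch, not the crate's implementation. It omits k-means++ seeding and takes initial centroids as input. It defines "relative shift" as total centroid movement over total centroid norm, which is an assumption. It leaves an empty cluster's centroid unchanged as a stand-in for the crate's recovery rule, which is not reproduced here. All function names are hypothetical.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid nearest to `p`; ties resolve to the lowest index.
fn nearest(p: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_d = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = sq_dist(p, c);
        if d < best_d {
            best = i;
            best_d = d;
        }
    }
    best
}

/// One Lloyd iteration: assign points, then recompute means.
/// Sums are accumulated in f64 and downcast to f32 when committed.
/// Returns the relative centroid shift (assumed definition, see lead-in).
fn lloyd_step(points: &[Vec<f32>], centroids: &mut [Vec<f32>]) -> f64 {
    let k = centroids.len();
    let dim = centroids[0].len();
    let mut sums = vec![vec![0.0_f64; dim]; k];
    let mut counts = vec![0_usize; k];
    for p in points {
        let c = nearest(p, centroids);
        counts[c] += 1;
        for (s, &x) in sums[c].iter_mut().zip(p) {
            *s += f64::from(x);
        }
    }
    let (mut shift, mut scale) = (0.0_f64, 0.0_f64);
    for (c, centroid) in centroids.iter_mut().enumerate() {
        if counts[c] == 0 {
            continue; // placeholder policy: keep the old centroid
        }
        for (old, &s) in centroid.iter_mut().zip(&sums[c]) {
            let new = (s / counts[c] as f64) as f32;
            shift += f64::from(new - *old).powi(2);
            scale += f64::from(new).powi(2);
            *old = new;
        }
    }
    if scale == 0.0 { 0.0 } else { (shift / scale).sqrt() }
}

/// Runs Lloyd iterations until the shift drops to `tol` or `max_iters`
/// is reached. Returns the number of iterations performed.
fn train_subspace(
    points: &[Vec<f32>],
    centroids: &mut [Vec<f32>],
    max_iters: usize,
    tol: f64,
) -> usize {
    for iter in 0..max_iters {
        if lloyd_step(points, centroids) <= tol {
            return iter + 1;
        }
    }
    max_iters
}
```

Calling `train_subspace(&points, &mut centroids, 25, 1e-4)` mirrors the `MAX_ITERS` and tolerance values quoted above. A full trainer would run this once per subvector slice.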
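PQ supports `Euclidean`, `DotProduct`, and `Manhattan`: each decomposes into a per-subvector sum (for `Euclidean`, the sum is over squared subvector distances). `Cosine` and `Hamming` return `IqdbError::InvalidMetric`. `Cosine` is rejected because no global norm is recoverable per subvector; the documented path is to L2-normalize and use `DotProduct`. `Hamming` is rejected because it operates on the wrong code space.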
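The L2-normalize step for the cosine workaround is easy to get subtly wrong around zero vectors, so here is a minimal standalone sketch. `l2_normalize` and `dot` are hypothetical helpers, not crate functions. The sketch assumes normalization is applied to both the training or stored vectors and the query before they reach the quantizer.

```rust
/// Returns an L2-normalized copy of `v`, or `None` when the norm is
/// zero or non-finite (such a vector has no defined direction).
/// The norm is accumulated in f64 to limit rounding error.
fn l2_normalize(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v
        .iter()
        .map(|&x| f64::from(x) * f64::from(x))
        .sum::<f64>()
        .sqrt();
    if norm == 0.0 || !norm.is_finite() {
        return None;
    }
    Some(v.iter().map(|&x| (f64::from(x) / norm) as f32).collect())
}

/// Plain dot product; on unit vectors this equals cosine similarity.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Because a reconstructed PQ vector is generally not exactly unit length, `DotProduct` scores over normalized inputs approximate cosine similarity rather than reproduce it.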
### `PqAdcTables` — batch ADC
`build_query_tables(query, metric)` precomputes the `M × K` query-to-centroid distance table once, then `PqAdcTables::distance` scores any number of `PqCode`s against it — amortizing the table cost across a whole set of codes. This is the path IVF-PQ's intra-cluster scan takes, scoring every code in every probed cluster against a single query.
```rust
use iqdb_quantize::{ProductQuantizer, Quantizer};
use iqdb_types::DistanceMetric;

// Same toy setup as above: M = 2, K = 4, seed = 7.
let mut pq = ProductQuantizer::with_config(2, 4, 7);
let training: Vec<Vec<f32>> = (0..16)
    .map(|i| {
        let f = i as f32;
        vec![f, f + 1.0, f + 2.0, f + 3.0]
    })
    .collect();
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();
pq.train(&refs).expect("ok");

// Build the M × K table once per query.
let query = [1.0_f32, 2.0, 3.0, 4.0];
let tables = pq.build_query_tables(&query, DistanceMetric::Euclidean).expect("supported");

// Score a code by table lookup, then compare against the single-shot path.
let code = pq.quantize(&query).expect("quantize");
let batch = tables.distance(&code).expect("matching M");
let single = pq.distance(&query, &code, DistanceMetric::Euclidean).expect("supported");
assert_eq!(batch.to_bits(), single.to_bits()); // distance() is a thin wrapper over this path
```
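To make the table mechanics concrete, the following standalone sketch builds an `M × K` table of squared subvector distances and scores a code by lookup. It is a sketch under stated assumptions, not the crate's internals: the `Codebooks` layout and all function names are hypothetical, the query length is assumed to be divisible by `M`, and Euclidean distance is taken as the square root of the summed squared distances.

```rust
/// Hypothetical layout: codebooks[m][k] is centroid k of subspace m.
type Codebooks = Vec<Vec<Vec<f32>>>;

/// Precomputes squared distances from each query subvector to every
/// centroid of the matching codebook. Assumes query.len() % M == 0.
fn build_sq_tables(query: &[f32], codebooks: &Codebooks) -> Vec<Vec<f32>> {
    let sub_dim = query.len() / codebooks.len();
    codebooks
        .iter()
        .zip(query.chunks_exact(sub_dim))
        .map(|(book, q_sub)| {
            book.iter()
                .map(|c| {
                    q_sub
                        .iter()
                        .zip(c)
                        .map(|(q, x)| (q - x) * (q - x))
                        .sum::<f32>()
                })
                .collect()
        })
        .collect()
}

/// Scores one code with M table lookups and a single square root.
fn adc_euclidean(tables: &[Vec<f32>], code: &[u8]) -> f32 {
    tables
        .iter()
        .zip(code)
        .map(|(row, &idx)| row[usize::from(idx)])
        .sum::<f32>()
        .sqrt()
}

/// Reconstructs the approximate vector by concatenating the selected
/// centroids; used here only to cross-check the table-based score.
fn reconstruct(codebooks: &Codebooks, code: &[u8]) -> Vec<f32> {
    codebooks
        .iter()
        .zip(code)
        .flat_map(|(book, &idx)| book[usize::from(idx)].iter().copied())
        .collect()
}
```

Building the table costs `K × dim` multiply-adds once per query, after which each stored code costs only `M` lookups and additions. That trade is what makes scanning many codes per query cheap.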
### Determinism, by contract
The same `seed` and the same training data produce byte-identical codebooks and codes on every supported platform. ADC also agrees with reconstruction: for every supported metric it returns the same value as `dequantize` followed by `iqdb_distance::compute`, within floating-point reduction tolerance. Both guarantees are property-tested.
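One common way to get seed-level reproducibility is a small PRNG built only on wrapping integer arithmetic, which behaves identically on every platform. The sketch below uses SplitMix64 purely as an illustration. The release notes do not say which generator the crate uses, so treat the choice, the type name, and `sample_indices` as assumptions.

```rust
/// SplitMix64: a tiny seedable PRNG defined entirely by wrapping
/// 64-bit integer operations, so output depends only on the seed.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Draws `n` indices in `0..bound`, e.g. to pick seeding candidates.
/// Plain modulo is used for brevity and carries a slight bias.
fn sample_indices(seed: u64, n: usize, bound: u64) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..n).map(|_| rng.next_u64() % bound).collect()
}
```

Two calls with the same seed return identical index sequences. Bit-identical codebooks additionally require a fixed floating-point reduction order during training.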
### `PqCode` — owned and immutable
One `u8` centroid index per subvector, produced only by `ProductQuantizer::quantize`, with `dim`, `n_subvectors`, `len`, `is_empty`, and `as_bytes` accessors and no public mutators.
## Breaking changes
**None in this release.** Everything here is additive over 0.2.0, and no existing item changed shape. The crate is still pre-1.0, so API churn remains possible in later releases.
## Verification
PQ adds property tests for distance finiteness across every supported metric and the `pq_adc_matches_dequantize_then_compute` invariant, plus a determinism integration test (`tests/determinism.rs`) asserting byte-identical codes from two equally-seeded quantizers. The full gate runs on the CI matrix on stable and the 1.87 MSRV:
```bash
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
```
MSRV: Rust 1.87.
## What's next
- **v0.4.0 — binary quantization + feature freeze.** The third scheme (BQ, 32× compression, Hamming distance) and the declaration that the public surface is complete.
## Installation
```toml
[dependencies]
iqdb-quantize = "0.3"
```
## Documentation
- [README](https://github.com/jamesgober/iqdb-quantize/blob/main/README.md)
- [API reference](https://github.com/jamesgober/iqdb-quantize/blob/main/docs/API.md)
- [ROADMAP](https://github.com/jamesgober/iqdb-quantize/blob/main/dev/ROADMAP.md)
- [Standards (REPS)](https://github.com/jamesgober/iqdb-quantize/blob/main/REPS.md)
- [CHANGELOG](https://github.com/jamesgober/iqdb-quantize/blob/main/CHANGELOG.md)
---
**Full diff:** [`v0.2.0...v0.3.0`](https://github.com/jamesgober/iqdb-quantize/compare/v0.2.0...v0.3.0).
**Changelog:** [`CHANGELOG.md`](https://github.com/jamesgober/iqdb-quantize/blob/main/CHANGELOG.md).