# iqdb-quantize v1.0.0 — Stable

**The quantization layer is stable.** v1.0.0 commits the public API of `iqdb-quantize` under SemVer for the entire 1.x series: no breaking changes until 2.0. The three schemes every iQDB deployment dials between (scalar, product, and binary, all behind one `Quantizer` trait) are now a fixed point the index crates can build on without churn. Nothing in the public surface has changed since 0.5.0; this release adds the consumer-simulation soak, runnable examples, and the stability commitment.

## What is iqdb-quantize?

The memory-efficiency layer of the iQDB vector database. It compresses `f32` embedding vectors into compact codes that preserve similarity-search quality: a million 768-dim vectors drop from ~3 GB to ~96 MB with binary codes, or to ~16 MB with PQ at `M = 16`. Three schemes share one `Quantizer` trait: scalar (SQ8, ~4×, every metric), product (PQ, up to ~192×, with batch-ADC scoring for IVF-PQ), and binary (BQ, ~32×, Hamming). Distance is asymmetric: the database is compressed while the query stays `f32`. The recommended quality path is to search quantized, then rerank the shortlist with full precision.

```rust
use iqdb_quantize::{Quantizer, ScalarQuantizer};
use iqdb_types::DistanceMetric;

let mut sq = ScalarQuantizer::new();
sq.train(&[&[0.0_f32, 1.0, 2.0][..], &[1.0_f32, 0.0, 1.0][..]]).unwrap();

let code = sq.quantize(&[0.5_f32, 0.5, 1.5]).unwrap();       // 3 bytes from 12
let d = sq.distance(&[0.5_f32, 0.5, 1.5], &code, DistanceMetric::Cosine).unwrap();
assert!(d.is_finite());
```
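For intuition about what sits behind those calls, here is a standalone sketch of per-dimension affine quantization with an asymmetric distance. It is not the crate's implementation: `Sq8Sketch`, its fields, and the min/max calibration rule are assumptions made for illustration, and it uses only `std`.

```rust
/// Hypothetical per-dimension affine quantizer (illustration only).
struct Sq8Sketch {
    min: Vec<f32>,   // per-dimension lower bound seen in training
    scale: Vec<f32>, // per-dimension step size: (max - min) / 255
}

impl Sq8Sketch {
    /// Calibrate from training rows; returns None for empty or ragged input.
    fn train(rows: &[&[f32]]) -> Option<Self> {
        let dim = rows.first()?.len();
        if dim == 0 || rows.iter().any(|r| r.len() != dim) {
            return None;
        }
        let mut min = vec![f32::INFINITY; dim];
        let mut max = vec![f32::NEG_INFINITY; dim];
        for row in rows {
            for (i, &x) in row.iter().enumerate() {
                min[i] = min[i].min(x);
                max[i] = max[i].max(x);
            }
        }
        // A constant dimension gets step 1.0 so quantize never divides by zero.
        let scale: Vec<f32> = min
            .iter()
            .zip(&max)
            .map(|(&lo, &hi)| if hi > lo { (hi - lo) / 255.0 } else { 1.0 })
            .collect();
        Some(Self { min, scale })
    }

    /// One byte per dimension; out-of-range values clamp to 0 or 255.
    fn quantize(&self, v: &[f32]) -> Vec<u8> {
        v.iter()
            .zip(self.min.iter().zip(&self.scale))
            .map(|(&x, (&lo, &s))| ((x - lo) / s).round().clamp(0.0, 255.0) as u8)
            .collect()
    }

    /// Map each byte back to the f32 value it represents.
    fn decode(&self, code: &[u8]) -> Vec<f32> {
        code.iter()
            .zip(self.min.iter().zip(&self.scale))
            .map(|(&c, (&lo, &s))| f32::from(c) * s + lo)
            .collect()
    }

    /// Asymmetric squared Euclidean distance: f32 query vs. decoded code.
    fn distance_sq(&self, query: &[f32], code: &[u8]) -> f32 {
        self.decode(code)
            .iter()
            .zip(query)
            .map(|(d, q)| (d - q) * (d - q))
            .sum()
    }
}
```

The query is never quantized here, which is the asymmetry described above: only the stored side pays the rounding error, at most half a step per dimension.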

## Why 1.0 now

The road here was deliberate, one verified phase at a time:

- **0.2.0 — Scalar quantization (SQ8) + the `Quantizer` trait.** The first scheme behind the trait every quantizer implements, with per-dimension affine calibration and asymmetric distance through `iqdb-distance`.
- **0.3.0 — Product quantization.** k-means codebooks per subvector and the `PqAdcTables` batch-ADC primitive — the path IVF-PQ scores through.
- **0.4.0 — Binary quantization + feature freeze.** The third scheme and the declaration that the public surface is complete.
- **0.5.0 — Recall validation + API freeze.** End-to-end recall measured against full-`f32` baselines, `tracing` instrumentation, the criterion bench harness, and the surface locked.

By 0.5.0 every Definition-of-Done criterion was met. The 0.6–0.9 RC phase exists to soak the API against live consumers. Under the spine-first ordering, the real consumer (`iqdb-ivf`) is not yet published against this surface, so the RC phase's intent was met by a consumer simulation built at the **exact** shape IVF-PQ uses. The crate proceeds to 1.0 on a fully satisfied checklist rather than a calendar, the same path `iqdb-types` and `iqdb-distance` took.

## What 1.0.0 adds

### Consumer-simulation soak

`tests/consumer_simulation.rs` is a mini IVF-PQ index built **only** on the public surface: it partitions a corpus into coarse clusters, stores each member as a `PqCode`, builds the ADC tables once per query with `build_query_tables`, and scans the probed clusters through `PqAdcTables::distance` — exactly the intra-cluster scan `iqdb-ivf` performs. It asserts the contracts the consumer relies on:

- **Batch ADC equals the single-shot path** bit-for-bit, for every code and every supported metric. If `PqAdcTables::distance` ever drifted from `ProductQuantizer::distance`, this fails.
- **The index recovers the right neighbourhood**: PQ cluster purity is ≥ 0.9, a PQ shortlist with `f32` rerank reaches ≥ 0.9 overlap with the exact top-10, and an SQ8 flat index reaches ≥ 0.9 overlap with the exact top-10 directly.
- **The boundary is safe**: foreign code shapes and unsupported metrics return typed errors and never panic.
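
The first contract can be illustrated with a standalone sketch. It is not the crate's code: `PqSketch`, its flat centroid layout, and the method names are hypothetical, and it scores squared Euclidean distance only. The point is that a per-query table holds the same per-subvector terms the single-shot path computes, so summing them in the same order gives a bit-identical result.

```rust
/// Hypothetical PQ codebook: `m` subspaces, each with `k` centroids of
/// `sub_dim` floats, stored flat as [subspace][centroid][component].
struct PqSketch {
    m: usize,
    k: usize, // at most 256 so a centroid index fits in one u8
    sub_dim: usize,
    centroids: Vec<f32>,
}

impl PqSketch {
    fn centroid(&self, sub: usize, idx: usize) -> &[f32] {
        let start = (sub * self.k + idx) * self.sub_dim;
        &self.centroids[start..start + self.sub_dim]
    }

    /// Squared Euclidean distance between two equal-length slices.
    fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
    }

    /// Encode: index of the nearest centroid for each subvector.
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        (0..self.m)
            .map(|sub| {
                let part = &v[sub * self.sub_dim..(sub + 1) * self.sub_dim];
                (0..self.k)
                    .min_by(|&a, &b| {
                        Self::l2_sq(part, self.centroid(sub, a))
                            .total_cmp(&Self::l2_sq(part, self.centroid(sub, b)))
                    })
                    .unwrap_or(0) as u8
            })
            .collect()
    }

    /// Single-shot path: compare the query with each selected centroid.
    fn distance(&self, query: &[f32], code: &[u8]) -> f32 {
        let mut total = 0.0;
        for (sub, &idx) in code.iter().enumerate() {
            let part = &query[sub * self.sub_dim..(sub + 1) * self.sub_dim];
            total += Self::l2_sq(part, self.centroid(sub, idx as usize));
        }
        total
    }

    /// Batch path, step 1: one m x k table per query, built once.
    fn build_tables(&self, query: &[f32]) -> Vec<f32> {
        let mut table = Vec::with_capacity(self.m * self.k);
        for sub in 0..self.m {
            let part = &query[sub * self.sub_dim..(sub + 1) * self.sub_dim];
            for idx in 0..self.k {
                table.push(Self::l2_sq(part, self.centroid(sub, idx)));
            }
        }
        table
    }

    /// Batch path, step 2: m lookups per code, summed in the same order
    /// as `distance`, so the two results match bit for bit.
    fn table_distance(&self, table: &[f32], code: &[u8]) -> f32 {
        let mut total = 0.0;
        for (sub, &idx) in code.iter().enumerate() {
            total += table[sub * self.k + idx as usize];
        }
        total
    }
}
```

Building the table costs `m * k` subvector distances per query, after which each stored code costs only `m` lookups and additions. That is why the table is built once per query and reused across every probed cluster.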

### Runnable examples

Five documented examples (`cargo run --example <name>`):

- `scalar_quantization` — SQ8 train / quantize / asymmetric distance / decode.
- `product_quantization` — PQ with `build_query_tables` batch scoring.
- `binary_quantization` — BQ and the Hamming-only contract.
- `rerank` — the search-quantized-then-rerank quality path, matching the exact answer.
- `compression` — the three schemes' code sizes side by side.
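
As a rough picture of the search-quantized-then-rerank path, the sketch below shortlists by Hamming distance on sign-bit codes and reranks the shortlist with exact `f32` distance. It is a standalone illustration, not the shipped `rerank` example: every function name is hypothetical, and it binarizes the query for the shortlist step as a simplification.

```rust
/// Sign-bit binary code: bit i is 1 when component i is positive.
fn binarize(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Number of differing bits between two codes of equal length.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Exact squared Euclidean distance on the original f32 vectors.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Shortlist by Hamming distance on the codes, then rerank the shortlist
/// by exact f32 distance. Returns row indices, nearest first.
fn search_then_rerank(
    corpus: &[Vec<f32>],
    codes: &[Vec<u64>],
    query: &[f32],
    shortlist: usize,
    top_k: usize,
) -> Vec<usize> {
    let q_code = binarize(query);
    let mut ids: Vec<usize> = (0..codes.len()).collect();
    // Cheap pass over every code.
    ids.sort_by_key(|&i| hamming(&codes[i], &q_code));
    ids.truncate(shortlist);
    // Expensive pass over the shortlist only.
    ids.sort_by(|&a, &b| l2_sq(&corpus[a], query).total_cmp(&l2_sq(&corpus[b], query)));
    ids.truncate(top_k);
    ids
}
```

The shortlist size is the quality dial: a larger shortlist costs more exact distance computations but leaves less room for the coarse codes to drop a true neighbour.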

## The 1.x compatibility promise

- The public surface recorded in `dev/ROADMAP.md` is frozen until 2.0: the `Quantizer` trait, the three quantizers (`ScalarQuantizer`, `BinaryQuantizer`, `ProductQuantizer`), the three code types (`Sq8Code`, `BqCode`, `PqCode`), the `PqAdcTables` batch-ADC primitive, and `VERSION`.
- Additive, non-breaking changes remain allowed within 1.x.
- A call with a `DistanceMetric` a scheme does not support returns `IqdbError::InvalidMetric` rather than panicking — the forward-compatible handling for the `#[non_exhaustive]` enum.
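
The last point follows a common Rust pattern, sketched below with stand-in types. `Metric`, `SketchError`, and `check_binary_metric` are hypothetical names, not the crate's `DistanceMetric` or `IqdbError` definitions, whose exact shapes are not shown in these notes.

```rust
/// Hypothetical stand-in for a metric enum that may grow in later releases.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Metric {
    Euclidean,
    DotProduct,
    Hamming,
}

/// Hypothetical stand-in for a typed error.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    InvalidMetric(Metric),
}

/// In this sketch a binary scheme accepts Hamming only.
pub fn check_binary_metric(metric: Metric) -> Result<(), SketchError> {
    match metric {
        Metric::Hamming => Ok(()),
        // Wildcard arm: rejects today's unsupported metrics and any
        // variant added later, without panicking.
        other => Err(SketchError::InvalidMetric(other)),
    }
}
```

For code in a downstream crate, `#[non_exhaustive]` makes the wildcard arm mandatory, so a new metric variant in a 1.x release reaches the error branch instead of breaking the build.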

## Breaking changes

**None.** No public API changed from 0.5.0; 1.0.0 is the SemVer stability commitment.

## Performance

Code sizes are exact and encoding is deterministic: the same `seed` and data yield byte-identical PQ codes on every platform. Code sizes at 768 dims:

| Scheme | Code (768-dim) | Compression | Metrics |
|---|---|---|---|
| SQ8 | 768 bytes | 4× | every metric |
| BQ | 96 bytes | 32× | Hamming |
| PQ (`M = 16`) | 16 bytes | 192× | Euclidean / DotProduct / Manhattan |
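
The figures in the table follow from simple arithmetic, sketched here with hypothetical helper names. The PQ line assumes one byte per subvector index, which is consistent with the 16-byte code at `M = 16`.

```rust
/// Bytes for one uncompressed f32 vector.
fn raw_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// SQ8: one byte per dimension.
fn sq8_bytes(dim: usize) -> usize {
    dim
}

/// BQ: one bit per dimension, rounded up to whole bytes.
fn bq_bytes(dim: usize) -> usize {
    (dim + 7) / 8
}

/// PQ: one centroid index per subvector (assumes one byte per index).
fn pq_bytes(m: usize) -> usize {
    m
}

/// Compression ratio relative to the raw f32 vector.
fn ratio(dim: usize, code_bytes: usize) -> f64 {
    raw_bytes(dim) as f64 / code_bytes as f64
}
```

At 768 dims a raw vector is 3,072 bytes, so a million of them take about 3 GB; the same million take about 768 MB as SQ8, 96 MB as BQ, and 16 MB as PQ at `M = 16`.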

Per-vector latency, benchmarked on Windows x86_64 at 768 dims (criterion medians):

| Operation | Median |
|---|---|
| SQ8 quantize | ~1.56 µs |
| SQ8 asymmetric Cosine distance | ~0.95 µs |
| BQ quantize | ~0.64 µs |
| BQ Hamming distance | ~0.68 µs |

The `f32` distance kernels that SQ8 and PQ delegate to are SIMD-accelerated (AVX2 / NEON) inside `iqdb-distance`, so quantized search inherits them without a feature flag in this crate.

## Verification

```bash
cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
```

All green. **119** tests across unit, edge-case, property (`proptest`), recall, determinism, tracing, smoke, consumer-simulation, and doctest suites, plus five runnable examples. `cargo deny check` and `cargo audit` are clean. There is **zero `unsafe`** in the crate. The `loom` Definition-of-Done item is N/A by design — the quantizers own their calibration by value with no interior mutability and no lock-free path, so there is no concurrent protocol to model (recorded in `dev/ROADMAP.md`).

## What's next

`iqdb-quantize` is done. Its consumer builds on it: `iqdb-ivf` scores in-cluster codes through this stable 1.x surface (`PqAdcTables`). Anything that consumer surfaces as a genuinely useful addition will ship as an additive release within 1.x.

## Installation

```toml
[dependencies]
iqdb-quantize = "1.0"
```

MSRV: Rust 1.87.

## Documentation

- [README](https://github.com/jamesgober/iqdb-quantize/blob/main/README.md)
- [API reference](https://github.com/jamesgober/iqdb-quantize/blob/main/docs/API.md)
- [ROADMAP (frozen API record)](https://github.com/jamesgober/iqdb-quantize/blob/main/dev/ROADMAP.md)
- [Standards (REPS)](https://github.com/jamesgober/iqdb-quantize/blob/main/REPS.md)
- [CHANGELOG](https://github.com/jamesgober/iqdb-quantize/blob/main/CHANGELOG.md)
- [docs.rs/iqdb-quantize](https://docs.rs/iqdb-quantize)

---

**Full diff:** [`v0.5.0...v1.0.0`](https://github.com/jamesgober/iqdb-quantize/compare/v0.5.0...v1.0.0).
**Changelog:** [`CHANGELOG.md`](https://github.com/jamesgober/iqdb-quantize/blob/main/CHANGELOG.md).