BitPolar


Near-optimal vector quantization with zero training overhead — compress embeddings to 3-8 bits with provably unbiased inner products and no calibration data required.

Implements TurboQuant (ICLR 2026), PolarQuant (AISTATS 2026), and QJL (AAAI 2025), three quantization schemes from Google Research.

Key Properties

  • Data-oblivious — no training, no codebooks, no calibration data
  • Deterministic — fully defined by 4 integers: (dimension, bits, projections, seed)
  • Provably unbiased — inner product estimates satisfy E[estimate] = exact at 3+ bits
  • Near-optimal — distortion within ~2.7x of the Shannon rate-distortion limit
  • Instant indexing — vectors compress on arrival, 600x faster than Product Quantization
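The "fully defined by 4 integers" property means no codebook ever has to be stored or shipped: rebuild the quantizer anywhere from the same integers and you get byte-identical codes. A self-contained toy illustrating this (this is not BitPolar's code; `TinyQuantizer` and `xorshift` are invented here purely to show the idea):

```rust
// Toy quantizer fully determined by (dimension, bits, projections, seed).
// Illustrative only, not BitPolar's implementation.

struct TinyQuantizer {
    dim: usize,
    #[allow(dead_code)] // would select codeword depth in a real scheme
    bits: u32,
    projections: usize,
    seed: u64,
}

fn xorshift(state: &mut u64) -> u64 {
    // Deterministic PRNG: the same seed always yields the same stream.
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

impl TinyQuantizer {
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        assert_eq!(v.len(), self.dim);
        let mut state = self.seed | 1;
        (0..self.projections)
            .map(|_| {
                // One pseudo-random projection per output bit,
                // regenerated on the fly from the seed.
                let dot: f32 = v
                    .iter()
                    .map(|x| {
                        let r = (xorshift(&mut state) as f32 / u64::MAX as f32) - 0.5;
                        x * r
                    })
                    .sum();
                (dot >= 0.0) as u8
            })
            .collect()
    }
}

fn main() {
    let v = vec![0.1_f32, -0.4, 0.7, 0.2];
    let a = TinyQuantizer { dim: 4, bits: 1, projections: 8, seed: 42 };
    let b = TinyQuantizer { dim: 4, bits: 1, projections: 8, seed: 42 };
    // Same four integers -> byte-identical codes, no stored codebook.
    assert_eq!(a.encode(&v), b.encode(&v));
}
```

The real quantizer works the same way in spirit: the rotation and projection matrices are regenerated deterministically from the seed rather than persisted.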

Quick Start

use bitpolar::TurboQuantizer;
use bitpolar::traits::VectorQuantizer;

// Create quantizer from 4 integers — no training needed
let q = TurboQuantizer::new(128, 4, 32, 42).unwrap();

// Encode a vector
let vector = vec![0.1_f32; 128];
let code = q.encode(&vector).unwrap();

// Estimate inner product without decompression
let query = vec![0.05_f32; 128];
let score = q.inner_product_estimate(&code, &query).unwrap();

// Decode back to approximate vector
let reconstructed = q.decode(&code);

API Overview

Type                Description
TurboQuantizer      Two-stage quantizer (Polar + QJL) — the primary API
PolarQuantizer      Single-stage polar-coordinate encoding
QjlQuantizer        1-bit Johnson-Lindenstrauss sketching
KvCacheCompressor   Transformer KV cache compression
MultiHeadKvCache    Multi-head attention KV cache
DistortionTracker   Online quality monitoring (EMA of MSE and bias)
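Online quality monitoring of the DistortionTracker kind boils down to exponential moving averages over (exact, estimate) pairs. A standalone sketch of that idea (illustrative; `EmaTracker` is not BitPolar's API):

```rust
// Toy online quality tracker: EMAs of squared error and signed bias
// between exact and estimated inner products. Not the crate's code.
struct EmaTracker {
    alpha: f64, // smoothing factor in (0, 1]; larger = more reactive
    mse: f64,
    bias: f64,
}

impl EmaTracker {
    fn new(alpha: f64) -> Self {
        Self { alpha, mse: 0.0, bias: 0.0 }
    }

    fn observe(&mut self, exact: f64, estimate: f64) {
        let err = estimate - exact;
        // EMA update: new = (1 - alpha) * old + alpha * sample.
        self.mse = (1.0 - self.alpha) * self.mse + self.alpha * err * err;
        self.bias = (1.0 - self.alpha) * self.bias + self.alpha * err;
    }
}

fn main() {
    let mut t = EmaTracker::new(0.1);
    // Symmetric errors: MSE accumulates, bias hovers near zero,
    // which is the signature of an unbiased estimator.
    for err in [0.1, -0.1, 0.1, -0.1] {
        t.observe(1.0, 1.0 + err);
    }
    assert!(t.mse > 0.0);
    assert!(t.bias.abs() < 0.01);
    println!("mse = {:.4}, bias = {:.4}", t.mse, t.bias);
}
```

A tracker like this is how you would verify in production that the unbiasedness guarantee holds on your actual data distribution: bias drifting away from zero signals a problem.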

How It Works

Input f32 vector
    |
    v
[Random Rotation]     Haar-distributed orthogonal matrix (QR of Gaussian)
    |                  Spreads energy uniformly across coordinates
    v
[PolarQuant]          Groups d dims into d/2 pairs -> polar coords
 (Stage 1)            Radii: lossless f32 | Angles: b-bit quantized
    |
    v
[QJL Residual]        Sketches reconstruction error
 (Stage 2)            1 sign bit per projection -> unbiased correction
    |
    v
TurboCode { polar: PolarCode, residual: QjlSketch }

Inner product estimation combines both stages: <v, q> ~ IP_polar(code, q) + IP_qjl(residual_sketch, q)
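The Stage-1 idea, pair up coordinates, keep each pair's radius exactly, quantize only its angle, can be sketched in a few lines (this is a simplified illustration, not BitPolar's implementation, which adds the rotation and the QJL residual on top):

```rust
// Illustrative Stage-1 sketch: d dims -> d/2 (radius, angle) pairs,
// radius kept as f32, angle quantized to b bits. Not the crate's code.

fn encode_pairs(v: &[f32], bits: u32) -> Vec<(f32, u32)> {
    let levels = 1u32 << bits;
    v.chunks_exact(2)
        .map(|p| {
            let (x, y) = (p[0], p[1]);
            let r = (x * x + y * y).sqrt(); // radius: stored lossless
            let theta = y.atan2(x);         // angle in (-pi, pi]
            // Map the angle onto one of 2^b uniform bins.
            let t = (theta + std::f32::consts::PI) / (2.0 * std::f32::consts::PI);
            let q = ((t * levels as f32) as u32).min(levels - 1);
            (r, q)
        })
        .collect()
}

fn decode_pairs(code: &[(f32, u32)], bits: u32) -> Vec<f32> {
    let levels = 1u32 << bits;
    code.iter()
        .flat_map(|&(r, q)| {
            // Reconstruct at the bin centre; error is bounded by half a bin.
            let theta = (q as f32 + 0.5) / levels as f32
                * 2.0 * std::f32::consts::PI
                - std::f32::consts::PI;
            [r * theta.cos(), r * theta.sin()]
        })
        .collect()
}

fn main() {
    let v = vec![0.3_f32, -0.5, 0.8, 0.1];
    let rec = decode_pairs(&encode_pairs(&v, 4), 4);
    // With 4 angle bits the angular error is at most pi/16 per pair,
    // so each coordinate lands close to the input.
    for (a, b) in v.iter().zip(&rec) {
        assert!((a - b).abs() < 0.2);
    }
    println!("{:?}", rec);
}
```

Because the radius is exact, the quantization error lives entirely on the circle of each pair, which is what makes the residual small enough for a 1-bit QJL correction in Stage 2.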

Parameter Selection

Use Case                Bits   Projections   Notes
Semantic search         4-8    dim/4         Best accuracy for retrieval
KV cache                3-6    dim/8         Trades memory for attention quality
Maximum compression     3      dim/16        Still provably unbiased
Lightweight similarity  --     dim/4         QJL standalone (1-bit sketches)
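The standalone 1-bit QJL row relies on a classical identity: for a Gaussian vector g, E[sign(g·v)·(g·q)] = sqrt(2/pi)·⟨v, q⟩/‖v‖, so one sign bit per projection of v yields an unbiased inner-product estimate after rescaling. A self-contained demonstration (the function names and the PRNG are invented here; this is not the crate's QjlQuantizer):

```rust
// Demonstrates the 1-bit sign-sketch estimator behind QJL-style sketching.
// Illustrative only; not BitPolar's implementation.

fn splitmix(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E3779B97F4A7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

fn gaussian(state: &mut u64) -> f64 {
    // Box-Muller from two uniforms in (0, 1).
    let u1 = (splitmix(state) >> 11) as f64 / (1u64 << 53) as f64 + 1e-12;
    let u2 = (splitmix(state) >> 11) as f64 / (1u64 << 53) as f64;
    (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
}

fn qjl_estimate(v: &[f64], q: &[f64], m: usize, seed: u64) -> f64 {
    // E[sign(g.v) * (g.q)] = sqrt(2/pi) * <v, q> / ||v||, so rescaling by
    // ||v|| * sqrt(pi/2) makes the average an unbiased estimate of <v, q>,
    // using only one stored sign bit per projection of v.
    let norm_v = v.iter().map(|x| x * x).sum::<f64>().sqrt();
    let mut state = seed;
    let mut acc = 0.0;
    for _ in 0..m {
        let g: Vec<f64> = (0..v.len()).map(|_| gaussian(&mut state)).collect();
        let gv: f64 = g.iter().zip(v).map(|(a, b)| a * b).sum();
        let gq: f64 = g.iter().zip(q).map(|(a, b)| a * b).sum();
        acc += gv.signum() * gq;
    }
    norm_v * (std::f64::consts::PI / 2.0).sqrt() * acc / m as f64
}

fn main() {
    let v = vec![0.6, 0.8, 0.0, 0.0];
    let q = vec![0.8, 0.6, 0.0, 0.0];
    let est = qjl_estimate(&v, &q, 50_000, 7);
    // Exact inner product is 0.96; the sign sketch should land close.
    assert!((est - 0.96).abs() < 0.05);
    println!("estimate = {est:.3}");
}
```

The variance shrinks as 1/m, which is why the projection counts in the table scale with the dimension rather than being fixed constants.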

Feature Flags

Feature          Default   Description
std              Yes       Standard library (nalgebra QR decomposition)
serde-support    Yes       Serde serialization for all types
simd             No        Hand-tuned NEON/AVX2 kernels
parallel         No        Parallel batch operations via rayon
tracing-support  No        OpenTelemetry-compatible instrumentation

Performance

Run benchmarks:

cargo bench

Run examples:

cargo run --example vector_search
cargo run --example kv_cache

Traits

BitPolar exposes composable traits for ecosystem integration:

  • VectorQuantizer — core encode/decode/IP/L2 interface
  • BatchQuantizer — parallel batch operations (behind parallel feature)
  • RotationStrategy — pluggable rotation (QR, Walsh-Hadamard, identity)
  • SerializableCode — compact binary serialization
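Of the three rotation options, Walsh-Hadamard is the interesting one: it is an orthogonal map computable in O(d log d) with no stored matrix. A standalone sketch of the transform itself (this README does not show the RotationStrategy trait's signature, so this is plain functions, not an impl of the actual trait):

```rust
// Fast Walsh-Hadamard transform: an in-place orthonormal rotation with no
// stored matrix. Illustrates the Walsh-Hadamard RotationStrategy option;
// not the crate's code.

fn fwht(v: &mut [f32]) {
    let n = v.len();
    assert!(n.is_power_of_two());
    let mut h = 1;
    while h < n {
        for i in (0..n).step_by(h * 2) {
            for j in i..i + h {
                // Butterfly: combine entries h apart.
                let (x, y) = (v[j], v[j + h]);
                v[j] = x + y;
                v[j + h] = x - y;
            }
        }
        h *= 2;
    }
    // Normalize so the transform is orthonormal (and its own inverse).
    let scale = 1.0 / (n as f32).sqrt();
    for x in v.iter_mut() {
        *x *= scale;
    }
}

fn main() {
    let original = vec![0.5_f32, -1.0, 2.0, 0.25];
    let mut v = original.clone();
    fwht(&mut v); // rotate
    fwht(&mut v); // rotating again undoes it: normalized H squares to I
    for (a, b) in original.iter().zip(&v) {
        assert!((a - b).abs() < 1e-6);
    }
    println!("round-trip ok: {:?}", v);
}
```

In practice such a transform is combined with random per-coordinate sign flips to randomize the rotation; the QR-of-Gaussian option trades the O(d log d) speed for a fully Haar-distributed rotation.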

References

Contributing

Contributions are welcome! See CONTRIBUTING.md for development setup, coding standards, commit message conventions, and how to add a new quantization strategy.

License

Licensed under: