Skip to main content

Crate bitpolar

Crate bitpolar 

Source
Expand description

§bitpolar

Zero-overhead vector quantization for semantic search and KV cache compression.

Implements three algorithms from Google Research:

  • TurboQuant (ICLR 2026) — two-stage composition for near-optimal compression
  • PolarQuant (AISTATS 2026) — polar coordinate encoding with lossless radii
  • QJL (AAAI 2025) — 1-bit Johnson-Lindenstrauss sketching

§Key Properties

  • Data-oblivious: No training, no codebooks, no calibration data
  • Deterministic: Fully defined by 4 integers: (dimension, bits, projections, seed)
  • Provably unbiased: Inner product estimates are unbiased at 3+ bits
  • Near-optimal: Within ~2.7x of Shannon rate-distortion limit
  • Instant indexing: Vectors compress on arrival — no offline training

§Quick Start

use bitpolar::TurboQuantizer;
use bitpolar::traits::VectorQuantizer;

// Create quantizer from 4 integers — no training needed
let q = TurboQuantizer::new(128, 4, 32, 42).unwrap();

// Encode a vector
let vector = vec![0.1_f32; 128];
let code = q.encode(&vector).unwrap();

// Estimate inner product without decompression
let query = vec![0.05_f32; 128];
let score = q.inner_product_estimate(&code, &query).unwrap();
println!("Estimated IP: {score}");

// Decode back to approximate vector
let reconstructed = q.decode(&code);
assert_eq!(reconstructed.len(), 128);

§Architecture

Input f32 vector
    │
    ▼
┌─────────────────┐
│ Random Rotation  │  Haar-distributed orthogonal matrix (QR of Gaussian)
│ (StoredRotation) │  Spreads energy uniformly across coordinates
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   PolarQuant     │  Groups d dims into d/2 pairs → polar coords
│  (Stage 1)       │  Radii: lossless f32 | Angles: b-bit quantized
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   QJL Residual   │  Sketches reconstruction error
│  (Stage 2)       │  1 sign bit per projection → unbiased correction
└────────┬────────┘
         │
         ▼
TurboCode { polar: PolarCode, residual_sketch: QjlSketch }

§Feature Flags

FeatureDefaultDescription
stdYesStandard library (nalgebra QR decomposition)
serde-supportYesSerde serialization for all types
simdNoHand-tuned NEON/AVX2 kernels
parallelNoParallel batch operations via rayon
tracing-supportNoOpenTelemetry-compatible instrumentation
ffiNoC FFI exports

Re-exports§

pub use error::Result;
pub use error::TurboQuantError;
pub use polar::PolarCode;
pub use polar::PolarQuantizer;
pub use qjl::QjlQuantizer;
pub use qjl::QjlSketch;
pub use turbo::TurboCode;
pub use turbo::TurboQuantizer;
pub use kv_cache::KvCacheCompressor;
pub use kv_cache::KvCacheConfig;
pub use kv_cache::MultiHeadKvCache;
pub use distortion::DistortionTracker;
pub use stats::BatchStats;
pub use stats::DistortionMetrics;
pub use rotation::StoredRotation;
pub use wht::WhtRotation;
pub use tiered::Tier;
pub use tiered::TieredCode;
pub use tiered::TieredQuantization;
pub use resilient::ResilientCode;
pub use resilient::ResilientQuantizer;
pub use search::OversampledSearch;

Modules§

adaptive
Adaptive per-vector bit-width selection with promote/demote Adaptive bit-width quantization.
distortion
Online distortion tracking for quality monitoring Online distortion tracking for quality monitoring.
error
Error types — all public APIs return Result<T, TurboQuantError> Error types for bitpolar.
kv_cache
KV cache compressor for transformer attention KV cache compressor for transformer attention.
metrics
Prometheus-compatible metrics export for monitoring Prometheus-compatible metrics export for BitPolar operations.
polar
PolarQuant: polar coordinate vector encoding (Stage 1) PolarQuant: polar coordinate vector encoding (Stage 1 of TurboQuant).
qjl
Quantized Johnson-Lindenstrauss 1-bit sketching (Stage 2) Quantized Johnson-Lindenstrauss (QJL) 1-bit sketching.
resilient
Resilient quantization with automatic primary→fallback strategy Resilient quantization with automatic fallback.
rotation
Haar-distributed orthogonal rotation matrix (O(d²) memory) Haar-distributed orthogonal rotation matrix.
search
Oversampled approximate nearest-neighbor search with exact re-ranking Oversampled approximate nearest-neighbor search.
stats
Compression statistics and quality metrics Compression statistics and quality metrics.
tiered
Tiered quantization: hot, warm, and cold storage tiers Tiered quantization: hot, warm, and cold storage tiers.
traits
Core traits for ecosystem integration: VectorQuantizer, BatchQuantizer, etc. Core traits for ecosystem integration.
turbo
TurboQuantizer: two-stage composition (Polar + QJL) TurboQuantizer: two-stage composition of PolarQuant + QJL residual.
wht
Walsh-Hadamard Transform rotation (O(d) memory, O(d log d) time) Walsh-Hadamard Transform (WHT) rotation strategy.

Constants§

COMPACT_FORMAT_VERSION
Compact binary format version for all serialized codes.