//! # turbo-quant
//!
//! Experimental Rust implementation of the **TurboQuant**, **PolarQuant**, and
//! **QJL** families of profile-defined vector quantization algorithms.
//!
//! The crate creates derived compressed sidecars for high-dimensional vectors.
//! Quality is workload-dependent and must be measured with exact fallback gates.
//!
//! ## Key Properties
//!
//! - **Data-oblivious codec construction**: no k-means or trained codebook is
//! built inside the crate. Retrieval quality still depends on the deployment
//! distribution, filters, and workload-specific benchmark gates.
//! - **Deterministic**: identical `(dim, bits, projections, seed)` values always
//! produce the same quantizer. State can be fully reconstructed from these
//! four integers.
//! - **Measured quality**: inner product estimates are approximate and
//! retrieval deployments still need recall/rank gates.
//! - **Instant indexing**: unlike Product Quantization, there is no offline
//! training phase. Vectors can be indexed as they arrive.
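//!
//! The recall gate mentioned above can be sketched as follows. This is a
//! minimal, self-contained illustration and not part of this crate's API:
//! `top_k`, `recall_at_k`, and `passes_gate` are hypothetical helper names,
//! and the threshold is something each deployment would choose for itself.
//!
//! ```rust
//! /// Indices of the `k` highest scores, best first (ties go to the lower index).
//! fn top_k(scores: &[f32], k: usize) -> Vec<usize> {
//!     let mut idx: Vec<usize> = (0..scores.len()).collect();
//!     idx.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]).then(a.cmp(&b)));
//!     idx.truncate(k);
//!     idx
//! }
//!
//! /// Fraction of the exact top-k that also appears in the approximate top-k.
//! fn recall_at_k(exact: &[f32], approx: &[f32], k: usize) -> f32 {
//!     let truth = top_k(exact, k);
//!     if truth.is_empty() {
//!         return 1.0; // nothing to retrieve, so nothing was missed
//!     }
//!     let found = top_k(approx, k);
//!     let hits = truth.iter().filter(|i| found.contains(i)).count();
//!     hits as f32 / truth.len() as f32
//! }
//!
//! /// Hypothetical gate: `false` means "fall back to exact scoring".
//! fn passes_gate(exact: &[f32], approx: &[f32], k: usize, min_recall: f32) -> bool {
//!     recall_at_k(exact, approx, k) >= min_recall
//! }
//! ```
//!
//! In practice `exact` would hold true inner products for a sample of
//! queries and `approx` the corresponding compressed-code estimates.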
//!
//! ## Quick Start
//!
//! ```rust
//! use turbo_quant::{TurboQuantizer, PolarQuantizer};
//!
//! // Compress embeddings. The example uses a small dimension; production
//! // embeddings are typically much larger (e.g. 1536 for OpenAI models).
//! let dim = 64; // e.g. 1536 in production
//! let q = TurboQuantizer::new(dim, 8, 32, /* seed */ 42).unwrap();
//!
//! let database_vector: Vec<f32> = vec![0.1; dim]; // your embedding here
//! let query_vector: Vec<f32> = vec![0.1; dim]; // your query here
//!
//! // Create a compressed sidecar for the database vector.
//! let code = q.encode(&database_vector).unwrap();
//!
//! // At query time: estimate inner product without decompressing.
//! let score = q.inner_product_estimate(&code, &query_vector).unwrap();
//!
//! // Or just use PolarQuant for a simpler single-stage compressor.
//! let pq = PolarQuantizer::new(dim, 8, 42).unwrap();
//! let polar_code = pq.encode(&database_vector).unwrap();
//! let polar_score = pq.inner_product_estimate(&polar_code, &query_vector).unwrap();
//! ```
//!
//! ## Choosing Parameters
//!
//! | Use case | Recommended bits | Recommended projections |
//! |---|---|---|
//! | Semantic search (recall@10) | 8 | dim / 4 |
//! | KV cache compression | 4–6 | dim / 8 |
//! | Maximum compression | 3 | dim / 16 |
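//!
//! The table can be encoded as a small lookup, shown here as a sketch. The
//! `UseCase` enum and `suggested_params` function are hypothetical names used
//! only for illustration; they are not exported by this crate. For KV cache
//! compression the table gives a range of 4–6 bits, and this sketch arbitrarily
//! picks the upper end.
//!
//! ```rust
//! /// Use cases from the parameter table (hypothetical type).
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! enum UseCase {
//!     SemanticSearch,
//!     KvCache,
//!     MaxCompression,
//! }
//!
//! /// Returns `(bits, projections)` as a starting point for benchmarking.
//! fn suggested_params(use_case: UseCase, dim: usize) -> (u8, usize) {
//!     let (bits, divisor) = match use_case {
//!         UseCase::SemanticSearch => (8, 4),
//!         UseCase::KvCache => (6, 8), // table says 4–6 bits; 6 chosen here
//!         UseCase::MaxCompression => (3, 16),
//!     };
//!     // Keep at least one projection for very small dimensions.
//!     (bits, (dim / divisor).max(1))
//! }
//! ```
//!
//! For `dim = 1536`, the semantic search row yields 8 bits and 384
//! projections. These are starting points only; measured recall on the
//! target workload should decide the final values.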
//!
//! ## References
//!
//! - TurboQuant-style two-stage polar plus residual sketch compression.
//! - Polar-coordinate quantization after seeded rotation.
//! - Quantized Johnson-Lindenstrauss sign-projection sketches.
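//!
//! To make the sign-projection idea concrete, here is a minimal standalone
//! sketch of a generic 1-bit Johnson-Lindenstrauss estimator. It is an
//! illustration under stated assumptions, not this crate's implementation:
//! the SplitMix64 generator, the Box-Muller sampling, and every function name
//! below are choices made for the example. It also shows how a single seed
//! can fully determine the projection. `dim` must be non-zero.
//!
//! ```rust
//! /// SplitMix64 step: a small deterministic generator (illustrative choice).
//! fn splitmix64(state: &mut u64) -> u64 {
//!     *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
//!     let mut z = *state;
//!     z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
//!     z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
//!     z ^ (z >> 31)
//! }
//!
//! /// Uniform sample in (0, 1], so that `ln` below is always finite.
//! fn uniform(state: &mut u64) -> f64 {
//!     ((splitmix64(state) >> 11) as f64 + 1.0) / (1u64 << 53) as f64
//! }
//!
//! /// Standard normal sample via the Box-Muller transform.
//! fn gaussian(state: &mut u64) -> f64 {
//!     let (u1, u2) = (uniform(state), uniform(state));
//!     (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
//! }
//!
//! /// Row-major `m x dim` Gaussian matrix, fully determined by `seed`.
//! fn projection(m: usize, dim: usize, seed: u64) -> Vec<f64> {
//!     let mut state = seed;
//!     (0..m * dim).map(|_| gaussian(&mut state)).collect()
//! }
//!
//! fn dot(a: &[f64], b: &[f64]) -> f64 {
//!     a.iter().zip(b).map(|(x, y)| x * y).sum()
//! }
//!
//! /// Database side: keep one sign bit per projection row, plus the norm of `x`.
//! fn sketch(proj: &[f64], dim: usize, x: &[f64]) -> (Vec<bool>, f64) {
//!     let signs: Vec<bool> = proj.chunks(dim).map(|row| dot(row, x) >= 0.0).collect();
//!     (signs, dot(x, x).sqrt())
//! }
//!
//! /// Query side: estimate <x, y> from the sign sketch of `x` and the
//! /// full-precision query `y`. For Gaussian rows `s`,
//! /// E[sign(s.x) * (s.y)] = sqrt(2/pi) * <x, y> / ||x||, hence the rescaling.
//! fn estimate(proj: &[f64], dim: usize, signs: &[bool], norm: f64, y: &[f64]) -> f64 {
//!     let sum: f64 = proj
//!         .chunks(dim)
//!         .zip(signs)
//!         .map(|(row, &s)| if s { dot(row, y) } else { -dot(row, y) })
//!         .sum();
//!     norm * (std::f64::consts::PI / 2.0).sqrt() * sum / signs.len() as f64
//! }
//! ```
//!
//! The estimate is unbiased in expectation over the Gaussian projection, and
//! its variance shrinks as the number of projection rows grows, which is why
//! the projection count is a tunable parameter.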
pub use ByteAccountingV1;
pub use ScalarCodebook;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;