//! Core traits for ecosystem integration.
//!
//! These traits define the public contract that all quantizers implement.
//! Consumers (vector databases, inference engines) program against these
//! traits rather than concrete types, enabling pluggable quantization.
//!
//! # Trait Hierarchy
//!
//! ```text
//! VectorQuantizer — core encode/decode/estimate interface
//! └── BatchQuantizer — parallel batch operations (rayon feature)
//!
//! RotationStrategy — pluggable rotation (QR, WHT, identity)
//! SerializableCode — compact binary serialization (serde feature)
//! ```

use crate::Result;

/// Core trait for any vector quantizer that produces compressed codes.
///
/// All quantizers in this crate implement this trait, providing a uniform
/// interface for vector databases and inference engines to consume.
///
/// # Thread Safety
///
/// All implementations are `Send + Sync`: quantizers are immutable after
/// construction and safe to share across threads, for example via
/// `Arc<dyn VectorQuantizer<Code = C>>` for a concrete code type `C`.
///
/// # Example
///
/// ```rust
/// use bitpolar::traits::VectorQuantizer;
/// use bitpolar::TurboQuantizer;
///
/// fn search<Q: VectorQuantizer>(quantizer: &Q, codes: &[Q::Code], query: &[f32]) -> Vec<f32> {
///     codes.iter()
///         .map(|code| quantizer.inner_product_estimate(code, query).unwrap_or(f32::MIN))
///         .collect()
/// }
/// ```
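/**
A minimal, self-contained sketch of the pattern above, not this crate's implementation. `QuantizerSketch`, `SignQuantizer`, `SignCode`, `search_shared`, and the `String` error type are hypothetical stand-ins; only the `Code` associated type and the `inner_product_estimate(code, query)` call shape are taken from the example above. It shows a toy sign-bit quantizer shared across threads behind an `Arc`.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the crate's trait; names and error type are assumptions.
trait QuantizerSketch: Send + Sync {
    type Code;
    fn encode(&self, vector: &[f32]) -> Result<Self::Code, String>;
    fn inner_product_estimate(&self, code: &Self::Code, query: &[f32]) -> Result<f32, String>;
}

// Toy quantizer: one sign bit per coordinate plus the mean absolute value.
struct SignQuantizer {
    dim: usize,
}

struct SignCode {
    signs: Vec<bool>,
    scale: f32,
}

impl QuantizerSketch for SignQuantizer {
    type Code = SignCode;

    fn encode(&self, vector: &[f32]) -> Result<SignCode, String> {
        if vector.len() != self.dim {
            return Err(format!("expected {} dims, got {}", self.dim, vector.len()));
        }
        let scale = vector.iter().map(|x| x.abs()).sum::<f32>() / self.dim.max(1) as f32;
        let signs = vector.iter().map(|x| *x >= 0.0).collect();
        Ok(SignCode { signs, scale })
    }

    fn inner_product_estimate(&self, code: &SignCode, query: &[f32]) -> Result<f32, String> {
        if query.len() != self.dim {
            return Err(format!("expected {} dims, got {}", self.dim, query.len()));
        }
        // Reconstruct each coordinate as +scale or -scale and dot it with the query.
        let dot: f32 = code
            .signs
            .iter()
            .zip(query)
            .map(|(&positive, &q)| if positive { code.scale * q } else { -code.scale * q })
            .sum();
        Ok(dot)
    }
}

// Score the same codes against several queries, one thread per query,
// sharing the quantizer and the codes behind `Arc`.
fn search_shared(
    quantizer: Arc<dyn QuantizerSketch<Code = SignCode>>,
    codes: Arc<Vec<SignCode>>,
    queries: Vec<Vec<f32>>,
) -> Vec<Vec<f32>> {
    let handles: Vec<_> = queries
        .into_iter()
        .map(|query| {
            let (q, c) = (Arc::clone(&quantizer), Arc::clone(&codes));
            thread::spawn(move || {
                c.iter()
                    .map(|code| q.inner_product_estimate(code, &query).unwrap_or(f32::MIN))
                    .collect::<Vec<f32>>()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Because the sketched trait has an associated type, the trait object has to name it (`dyn QuantizerSketch<Code = SignCode>`). Generic code that takes `Q: VectorQuantizer` directly, like `search` above, avoids that.
*/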
/// Trait for batch operations on quantized vectors.
///
/// Implementations use parallel processing (rayon) when the `parallel`
/// feature is enabled, falling back to sequential processing otherwise.
///
/// Batch operations amortize per-call overhead and are essential for
/// high-throughput Python/FFI bindings where per-vector FFI calls are expensive.
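/**
One way the feature-gated fallback described above could look, as a hedged sketch rather than the crate's actual code. `encode_one`, `check_shape`, and `encode_batch` are hypothetical names, and the flat row-major `&[f32]` buffer is an assumption chosen because it lets a binding pass a whole batch in a single call. The `parallel` branch assumes rayon is a dependency whenever that feature is enabled.

```rust
// Hypothetical per-vector encoder: packs one sign bit per coordinate.
fn encode_one(vector: &[f32]) -> Vec<u8> {
    let mut bytes = vec![0u8; (vector.len() + 7) / 8];
    for (i, x) in vector.iter().enumerate() {
        if *x >= 0.0 {
            bytes[i / 8] |= 1 << (i % 8);
        }
    }
    bytes
}

// Validate the flat buffer once for the whole batch instead of once per vector.
fn check_shape(flat: &[f32], dim: usize) -> Result<(), String> {
    if dim == 0 || flat.len() % dim != 0 {
        return Err(format!(
            "buffer of {} floats is not a multiple of dim {}",
            flat.len(),
            dim
        ));
    }
    Ok(())
}

// Parallel path: compiled only when the `parallel` feature is enabled.
#[cfg(feature = "parallel")]
fn encode_batch(flat: &[f32], dim: usize) -> Result<Vec<Vec<u8>>, String> {
    use rayon::prelude::*;
    check_shape(flat, dim)?;
    Ok(flat.par_chunks(dim).map(encode_one).collect())
}

// Sequential fallback with the same signature.
#[cfg(not(feature = "parallel"))]
fn encode_batch(flat: &[f32], dim: usize) -> Result<Vec<Vec<u8>>, String> {
    check_shape(flat, dim)?;
    Ok(flat.chunks(dim).map(encode_one).collect())
}
```

Callers see one `encode_batch` signature either way, so enabling the feature changes throughput but not the API.
*/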
/// Trait for pluggable rotation strategies.
///
/// The default implementation uses Haar-distributed QR rotation, which costs O(d^2) per vector.
/// Alternative strategies include:
/// - **Walsh-Hadamard Transform (WHT):** O(d log d), used by llama.cpp
/// - **Identity rotation:** O(1), for testing or pre-whitened data
///
/// # Example
///
/// ```rust,ignore
/// struct WalshHadamardRotation { dim: usize }
///
/// impl RotationStrategy for WalshHadamardRotation {
///     fn rotate(&self, vector: &[f32]) -> Vec<f32> {
///         // O(d log d) butterfly implementation
///         todo!()
///     }
///     fn rotate_inverse(&self, vector: &[f32]) -> Vec<f32> {
///         // WHT is its own inverse (up to scaling)
///         todo!()
///     }
///     fn dim(&self) -> usize { self.dim }
/// }
/// ```
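/**
The butterfly left as `todo!()` above can be filled in roughly as follows. This is a sketch under stated assumptions: `RotationSketch` is a hypothetical stand-in that mirrors only the three methods used in the example above, `WalshHadamard::new` and `Identity` are illustrative names, and the dimension is assumed to be a power of two. Practical variants often apply random sign flips before the transform; that step is omitted here.

```rust
// Hypothetical stand-in mirroring the three methods used in the example above.
trait RotationSketch {
    fn rotate(&self, vector: &[f32]) -> Vec<f32>;
    fn rotate_inverse(&self, vector: &[f32]) -> Vec<f32>;
    fn dim(&self) -> usize;
}

struct WalshHadamard {
    dim: usize,
}

impl WalshHadamard {
    // The butterfly below needs a power-of-two dimension.
    fn new(dim: usize) -> Option<Self> {
        if dim.is_power_of_two() {
            Some(Self { dim })
        } else {
            None
        }
    }
}

impl RotationSketch for WalshHadamard {
    fn rotate(&self, vector: &[f32]) -> Vec<f32> {
        assert_eq!(vector.len(), self.dim, "dimension mismatch");
        let mut out = vector.to_vec();
        let mut h = 1;
        while h < self.dim {
            // Combine neighbouring blocks of width h into blocks of width 2h.
            for start in (0..self.dim).step_by(2 * h) {
                for j in start..start + h {
                    let (a, b) = (out[j], out[j + h]);
                    out[j] = a + b;
                    out[j + h] = a - b;
                }
            }
            h *= 2;
        }
        // Scaling by 1/sqrt(d) makes the transform orthonormal.
        let scale = 1.0 / (self.dim as f32).sqrt();
        out.iter_mut().for_each(|x| *x *= scale);
        out
    }

    fn rotate_inverse(&self, vector: &[f32]) -> Vec<f32> {
        // With the orthonormal scaling, the transform is its own inverse.
        self.rotate(vector)
    }

    fn dim(&self) -> usize {
        self.dim
    }
}

// Identity rotation: returns the input unchanged (copied into a new Vec).
struct Identity {
    dim: usize,
}

impl RotationSketch for Identity {
    fn rotate(&self, vector: &[f32]) -> Vec<f32> {
        vector.to_vec()
    }
    fn rotate_inverse(&self, vector: &[f32]) -> Vec<f32> {
        vector.to_vec()
    }
    fn dim(&self) -> usize {
        self.dim
    }
}
```

Folding the `1/sqrt(d)` factor into `rotate` is a design choice: it keeps vector norms unchanged and lets `rotate_inverse` reuse `rotate` without a separate scaling step.
*/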
/// Trait for compact binary serialization of compressed codes.
///
/// JSON serialization (via serde) is available for debugging and interchange,
/// but compact binary is 3-10x smaller and essential for database storage.
///
/// All compact binary formats include a 1-byte version header for forward
/// compatibility.
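/**
To make the version header concrete, here is a hedged sketch of one possible compact layout. Only the leading 1-byte version comes from the description above; the `SignBits` type, the method names `to_compact_bytes` / `from_compact_bytes`, the little-endian `u32` dimension field, and the `String` error type are all assumptions made for illustration.

```rust
// Assumed layout: [version: u8][dim: u32 little-endian][packed sign bits].
const FORMAT_VERSION: u8 = 1;

#[derive(Debug, PartialEq)]
struct SignBits {
    dim: u32,
    bits: Vec<u8>,
}

impl SignBits {
    fn to_compact_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(1 + 4 + self.bits.len());
        out.push(FORMAT_VERSION);
        out.extend_from_slice(&self.dim.to_le_bytes());
        out.extend_from_slice(&self.bits);
        out
    }

    fn from_compact_bytes(bytes: &[u8]) -> Result<Self, String> {
        let (&version, rest) = bytes
            .split_first()
            .ok_or_else(|| "empty input".to_string())?;
        // Reject unknown versions instead of guessing at their layout.
        if version != FORMAT_VERSION {
            return Err(format!("unsupported format version {}", version));
        }
        if rest.len() < 4 {
            return Err("truncated header".to_string());
        }
        let (dim_bytes, bits) = rest.split_at(4);
        let dim = u32::from_le_bytes([dim_bytes[0], dim_bytes[1], dim_bytes[2], dim_bytes[3]]);
        // The payload must hold exactly ceil(dim / 8) bytes.
        let expected = (dim as usize + 7) / 8;
        if bits.len() != expected {
            return Err(format!("expected {} payload bytes, got {}", expected, bits.len()));
        }
        Ok(SignBits { dim, bits: bits.to_vec() })
    }
}
```

Checking the version byte first means a reader can fail cleanly on data written by a newer format, or dispatch to an older decoder, before it touches the rest of the payload.
*/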