SIMD-accelerated vector similarity primitives.
Fast building blocks for embedding similarity with automatic hardware dispatch.
§Which Function Should I Use?
| Task | Function | Notes |
|---|---|---|
| Similarity (normalized) | cosine | Most embeddings are normalized |
| Similarity (raw) | dot | When norms are already known (e.g. unit-length vectors) |
| Distance (L2) | l2_distance | For k-NN, clustering |
| Token-level matching | maxsim | ColBERT-style late interaction |
| Sparse vectors | sparse_dot | BM25 scores, SPLADE |
| INT8 embeddings | dot_u8 | Quantized vector search |
| Binary embeddings | hamming_distance | Byte-packed bit vectors |
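To make the first two rows concrete, here is a minimal scalar sketch of how cosine relates to the dot product. It is plain Rust, not the crate's SIMD code, and the `_ref` names and the zero-norm handling are assumptions made for illustration. Cosine is the dot product divided by both norms, so the two coincide once the inputs are unit-length.

```rust
/// Reference dot product (scalar; hypothetical helper, not the crate's kernel).
fn dot_ref(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) norm, computed as sqrt(a . a).
fn norm_ref(a: &[f32]) -> f32 {
    dot_ref(a, a).sqrt()
}

/// Cosine similarity: dot product divided by the product of the norms.
/// Returns 0.0 when either vector has zero norm (an assumption of this sketch).
fn cosine_ref(a: &[f32], b: &[f32]) -> f32 {
    let denom = norm_ref(a) * norm_ref(b);
    if denom == 0.0 { 0.0 } else { dot_ref(a, b) / denom }
}

/// Return a unit-length copy of `a` (returned unchanged if the norm is zero).
fn normalized_ref(a: &[f32]) -> Vec<f32> {
    let n = norm_ref(a);
    if n == 0.0 { a.to_vec() } else { a.iter().map(|x| x / n).collect() }
}
```

If embeddings are normalized once at indexing time, the plain dot product then gives the same ranking as cosine without recomputing norms per comparison.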
§SIMD Dispatch
All functions automatically dispatch to the fastest available instruction set:
| Architecture | Instructions | Detection |
|---|---|---|
| x86_64 | AVX-512F | Runtime |
| x86_64 | AVX2 + FMA | Runtime |
| aarch64 | NEON | Always available |
| Other | Portable | LLVM auto-vectorizes |
Vectors with fewer than 16 dimensions use the portable path, because SIMD setup overhead outweighs the gain at that size.
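The dispatch pattern described above can be sketched roughly as follows. This is an illustration, not the crate's actual implementation: the function names and the `MIN_SIMD_DIM` constant are hypothetical, and only an AVX2 + FMA path is shown. It uses the standard `is_x86_feature_detected!` macro and falls back to a scalar loop on short inputs or other architectures.

```rust
/// Portable fallback: a scalar loop that LLVM may auto-vectorize.
fn dot_portable(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// AVX2 + FMA kernel (hypothetical name). Callers must first verify that
/// the CPU supports both features.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2,fma")]
unsafe fn dot_avx2_fma(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;
    let n = a.len().min(b.len());
    let chunks = n / 8; // 8 f32 lanes per 256-bit register
    let mut lanes = [0.0_f32; 8];
    unsafe {
        let mut acc = _mm256_setzero_ps();
        for i in 0..chunks {
            // Unaligned loads stay in bounds because i * 8 + 8 <= n.
            let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
            let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
            acc = _mm256_fmadd_ps(va, vb, acc); // acc += va * vb
        }
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    }
    // Horizontal sum of the 8 lanes, then the scalar tail.
    let mut sum: f32 = lanes.iter().sum();
    for i in chunks * 8..n {
        sum += a[i] * b[i];
    }
    sum
}

/// Choose the fastest available path at runtime.
fn dot_dispatch(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        // Assumed threshold, mirroring the 16-dimension cutoff described above.
        const MIN_SIMD_DIM: usize = 16;
        if a.len().min(b.len()) >= MIN_SIMD_DIM
            && is_x86_feature_detected!("avx2")
            && is_x86_feature_detected!("fma")
        {
            // Safe to call: both required CPU features were just detected.
            return unsafe { dot_avx2_fma(a, b) };
        }
    }
    dot_portable(a, b)
}
```

The feature check runs on every call in this sketch; a production dispatcher would more likely cache the decision, for example in a function pointer resolved once.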
§Historical Context
The inner product (dot product) dates to Grassmann’s 1844 “Ausdehnungslehre” and Hamilton’s quaternions, formalized in Gibbs and Heaviside’s vector calculus (~1880s). Modern embedding similarity (Word2Vec 2013, BERT 2018) relies on inner products in high-dimensional spaces where SIMD acceleration is essential.
ColBERT’s MaxSim (Khattab & Zaharia, 2020) extends this to token-level late interaction, requiring O(|Q| × |D|) inner products per query-document pair, where |Q| and |D| are the numbers of query and document tokens.
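As a rough illustration of where the O(|Q| × |D|) cost comes from, here is a scalar reference sketch of MaxSim. The name `maxsim_ref`, the `Vec<f32>`-per-token layout, and the empty-input behavior are assumptions for this example and may not match the crate's `maxsim` signature.

```rust
/// Reference dot product (scalar; hypothetical helper).
fn dot_ref(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim: for each query token, take the maximum dot product over all
/// document tokens, then sum those maxima over the query tokens.
/// Returns 0.0 if either side has no tokens (an assumption of this sketch).
fn maxsim_ref(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    if doc.is_empty() {
        return 0.0;
    }
    query
        .iter()
        .map(|q| {
            // One inner product per (query token, doc token) pair.
            doc.iter()
                .map(|d| dot_ref(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}
```

The nested iteration makes the pairwise cost explicit: every query token is compared against every document token, which is why a fast inner-product kernel matters here.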
§Example
use innr::{dot, cosine, norm};
let a = [1.0_f32, 0.0, 0.0];
let b = [0.707, 0.707, 0.0];
// Dot product
let d = dot(&a, &b);
assert!((d - 0.707).abs() < 0.01);
// Cosine similarity (normalized dot product)
let c = cosine(&a, &b);
assert!((c - 0.707).abs() < 0.01);
// L2 norm
let n = norm(&a);
assert!((n - 1.0).abs() < 1e-6);
§References
- Gibbs, J.W. (1881). “Elements of Vector Analysis”
- Mikolov et al. (2013). “Efficient Estimation of Word Representations in Vector Space” (Word2Vec)
- Khattab & Zaharia (2020). “ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT”
Re-exports§
pub use dense::angular_distance;
pub use dense::cosine;
pub use dense::dot;
pub use dense::l1_distance;
pub use dense::l2_distance;
pub use dense::l2_distance_squared;
pub use dense::matryoshka_cosine;
pub use dense::matryoshka_dot;
pub use dense::norm;
pub use dense::normalize;
pub use binary::binary_dot;
pub use binary::binary_hamming;
pub use binary::binary_jaccard;
pub use binary::encode_binary;
pub use binary::PackedBinary;
pub use fast_math::fast_cosine;
pub use fast_math::fast_cosine_dispatch;
pub use fast_math::fast_rsqrt;
pub use fast_math::fast_rsqrt_precise;
pub use quant::dot_u8;
pub use quant::hamming_distance;
pub use topk::TopK;
Modules§
- batch
- Batch vector operations with columnar (PDX-style) data layout.
- binary
- SIMD-accelerated binary (1-bit) quantization: encode, Hamming distance, dot product, Jaccard.
- dense
- SIMD-accelerated dense vector primitives: dot, cosine, norm, L2/L1 distance, matryoshka.
- dense_f64
- Portable f64 vector primitives for higher-precision consumers (scientific computing, PageRank-style accumulation, statistical reductions). Mirrors the f32 API in dense; portable only, with SIMD acceleration as a follow-up.
- fast_math
- Fast math operations using hardware-aware approximations (rsqrt, Newton-Raphson iteration).
- quant
- Integer quantization primitives: u8 dot product and Hamming distance.
- scalar
- Scalar quantization (uint8) for memory-efficient asymmetric similarity search.
- ternary
- SIMD-accelerated ternary quantization (1.58-bit) for ultra-compressed embeddings.
- topk
- Fixed-capacity top-K nearest neighbor tracker for ANN inner-loop use.
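To illustrate the idea behind the binary module, the following is a minimal sketch of 1-bit quantization and Hamming distance over byte-packed bits. The names `encode_sign_bits` and `hamming_bytes`, the "positive means 1" rule, and the LSB-first bit order are assumptions for this example; the crate's `encode_binary` and `PackedBinary` may use a different convention.

```rust
/// Pack one bit per dimension: the bit is 1 where the value is strictly
/// positive. Bits are stored LSB-first within each byte (assumed layout).
fn encode_sign_bits(v: &[f32]) -> Vec<u8> {
    let mut out = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            out[i / 8] |= 1u8 << (i % 8);
        }
    }
    out
}

/// Hamming distance: number of differing bits, via XOR and popcount.
/// Compares only the common prefix if the slices differ in length.
fn hamming_bytes(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Each dimension shrinks from 32 bits to 1, and the distance reduces to XOR plus popcount, both of which map well onto wide registers.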
Functions§
- maxsim
- MaxSim: sum over query tokens of max dot product with any doc token.
- maxsim_cosine
- MaxSim with cosine similarity instead of dot product.
- sparse_dot
- Sparse dot product for sorted index arrays.
- sparse_maxsim
- Sparse MaxSim (SPLADE-style) scoring.
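A sparse dot product over sorted index arrays is typically computed with a two-pointer merge. The sketch below shows that technique under stated assumptions: the name `sparse_dot_ref`, the `u32` index type, and the four-slice parameter layout are hypothetical and not taken from the crate's `sparse_dot` signature, and indices are assumed to be strictly ascending within each vector.

```rust
use std::cmp::Ordering;

/// Sparse dot product via a two-pointer merge.
/// Each vector is given as parallel (indices, values) slices, with indices
/// sorted in strictly ascending order (an assumption of this sketch).
fn sparse_dot_ref(a_idx: &[u32], a_val: &[f32], b_idx: &[u32], b_val: &[f32]) -> f32 {
    debug_assert_eq!(a_idx.len(), a_val.len());
    debug_assert_eq!(b_idx.len(), b_val.len());
    let (mut i, mut j, mut sum) = (0, 0, 0.0_f32);
    while i < a_idx.len() && j < b_idx.len() {
        match a_idx[i].cmp(&b_idx[j]) {
            // Advance whichever side has the smaller index.
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            // Matching index: accumulate the product and advance both.
            Ordering::Equal => {
                sum += a_val[i] * b_val[j];
                i += 1;
                j += 1;
            }
        }
    }
    sum
}
```

The merge visits each entry at most once, so the cost is linear in the total number of non-zeros rather than in the vocabulary size.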