High-performance MaxSim scoring for late-interaction (ColBERT) workflows.
This module provides optimized CPU implementations of MaxSim scoring using:
- SIMD instructions (AVX2 on x86_64, NEON on ARM) for fast max reduction
- BLAS-accelerated matrix multiplication via ndarray (when the `accelerate` or `openblas` feature is enabled)
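For orientation, MaxSim takes each query token embedding, finds its largest dot product against any document token embedding, and sums those maxima. The scalar sketch below illustrates that definition only; the name `maxsim_reference` and its slice-of-`Vec` signature are hypothetical and are not this module's API, which works on ndarray types and uses SIMD and BLAS instead of plain loops.

```rust
/// Scalar reference for MaxSim (hypothetical helper, not part of this module).
/// `query` and `doc` hold one embedding per token; all embeddings are assumed
/// to have the same dimension.
fn maxsim_reference(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    // Assumption: an empty document scores 0.0 rather than negative infinity.
    if doc.is_empty() {
        return 0.0;
    }
    query
        .iter()
        .map(|q| {
            // Max over document tokens of the dot product with this query token.
            doc.iter()
                .map(|d| q.iter().zip(d.iter()).map(|(a, b)| a * b).sum::<f32>())
                .fold(f32::NEG_INFINITY, f32::max)
        })
        // Sum of the per-query-token maxima.
        .sum()
}
```

In an optimized version, the inner dot products would typically be computed as one matrix product and the `fold` would become a vectorized max reduction.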
§Credits
The SIMD optimization techniques in this module are adapted from maxsim-cpu, which provides high-performance MaxSim computation for ColBERT-style late-interaction models.
§Platform Support
- macOS ARM: Uses NEON SIMD + Apple Accelerate (with the `accelerate` feature)
- Linux x86_64: Uses AVX2 SIMD + OpenBLAS (with the `openblas` feature)
- Other platforms: Falls back to scalar operations
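The per-platform selection listed above can be expressed in Rust with `cfg` attributes. The sketch below is one possible shape, not this module's actual dispatch code: `simd_backend` and `max_scalar` are hypothetical names, and the use of runtime AVX2 detection on x86_64 is an assumption (the module may instead decide at compile time).

```rust
/// Name of the SIMD path that would be chosen (hypothetical helper).
#[cfg(target_arch = "x86_64")]
fn simd_backend() -> &'static str {
    // Assumption: AVX2 is checked at runtime; older CPUs use the scalar path.
    if std::arch::is_x86_feature_detected!("avx2") {
        "avx2"
    } else {
        "scalar"
    }
}

/// NEON is a baseline feature on aarch64 targets.
#[cfg(target_arch = "aarch64")]
fn simd_backend() -> &'static str {
    "neon"
}

/// Every other architecture falls back to scalar operations.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
fn simd_backend() -> &'static str {
    "scalar"
}

/// Scalar fallback for the max reduction that the SIMD paths accelerate.
fn max_scalar(values: &[f32]) -> f32 {
    values.iter().copied().fold(f32::NEG_INFINITY, f32::max)
}
```

Keeping one function per target behind `cfg` means only the code valid for the current architecture is compiled, while callers see a single name.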
§Functions
- `assign_to_centroids` - Assign embeddings to their nearest centroids using batched GEMM.
- `maxsim_score` - Compute MaxSim score for a single query-document pair.
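To make the centroid assignment concrete, here is a minimal scalar sketch. It is not the signature or implementation of `assign_to_centroids`: the name `assign_reference` is hypothetical, explicit loops stand in for the batched GEMM, and "nearest" is assumed to mean the highest dot product, which matches nearest-by-cosine only when the vectors are L2-normalized.

```rust
/// For each embedding, return the index of the centroid with the highest dot
/// product (hypothetical reference helper; assumes at least one centroid).
fn assign_reference(embeddings: &[Vec<f32>], centroids: &[Vec<f32>]) -> Vec<usize> {
    embeddings
        .iter()
        .map(|e| {
            let mut best = 0usize;
            let mut best_score = f32::NEG_INFINITY;
            for (i, c) in centroids.iter().enumerate() {
                // One entry of the embeddings x centroids^T score matrix.
                let score: f32 = e.iter().zip(c.iter()).map(|(a, b)| a * b).sum();
                if score > best_score {
                    best_score = score;
                    best = i;
                }
            }
            best
        })
        .collect()
}
```

A GEMM-based version would compute all scores for a batch of embeddings in a single matrix product and then take a row-wise argmax, which is where BLAS acceleration pays off.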