§Bitmask SIMD Kernels - Vectorised High-Performance Bitmask Operations
SIMD-accelerated implementations of bitmask operations using portable vectorisation with std::simd.
These kernels provide optimal performance for large bitmask operations through
SIMD-parallel processing of multiple 64-bit words simultaneously.
§Overview
This module contains vectorised implementations of all bitmask operations. It uses configurable SIMD lane counts to adapt to different CPU architectures whilst maintaining code portability.
We do not check for SIMD alignment here because it is guaranteed by the Bitmask as it is backed by Minarrow’s Vec64.
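As a rough illustration of why no runtime alignment check is needed, the sketch below uses a hypothetical 64-byte-aligned block type as a stand-in for the storage behind Vec64. AlignedBlock and is_simd_aligned are illustrative names only and are not Minarrow's actual types or API; the point is simply that when the element type carries the alignment, every allocation of it is aligned by construction.

```rust
/// Hypothetical stand-in for 64-byte-aligned backing storage.
/// This is NOT Minarrow's Vec64; it only illustrates the guarantee.
#[repr(C, align(64))]
#[derive(Clone, Copy, Default)]
pub struct AlignedBlock(pub [u64; 8]);

/// Returns true if the slice starts on a 64-byte boundary.
/// For AlignedBlock this holds by construction, because the
/// allocator must honour the type's declared alignment.
pub fn is_simd_aligned(blocks: &[AlignedBlock]) -> bool {
    blocks.as_ptr() as usize % 64 == 0
}
```

Because the guarantee lives in the type, a kernel that accepts such storage can issue aligned loads without re-checking the pointer on every call.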
§Architecture Principles
- Portable SIMD: Uses std::simd for cross-platform vectorisation without target-specific code
- Configurable lanes: Lane counts determined at build time for optimal performance per architecture
- Hybrid processing: SIMD inner loops with scalar tail handling for non-aligned lengths
- Low-cost abstraction: Bitmask is a light-weight structure over a Vec64. See Minarrow for details and benchmarks demonstrating very low abstraction cost.
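The hybrid-processing principle above can be sketched in stable Rust. This is a scalar illustration under stated assumptions, not the module's implementation: LANES and binop_chunked are hypothetical names, and the per-chunk inner loop stands in for what would be a single std::simd vector operation (std::simd is feature-gated on nightly at the time of writing).

```rust
/// Hypothetical lane count; the real kernels select this at build time.
const LANES: usize = 4;

/// Applies `op` word by word to `lhs` and `rhs`, writing into `out`.
/// Full LANES-sized chunks stand in for the SIMD inner loop; any
/// leftover words are handled by a scalar tail loop.
pub fn binop_chunked(
    lhs: &[u64],
    rhs: &[u64],
    out: &mut [u64],
    op: impl Fn(u64, u64) -> u64,
) {
    assert_eq!(lhs.len(), rhs.len());
    assert_eq!(lhs.len(), out.len());

    let mut l_chunks = lhs.chunks_exact(LANES);
    let mut r_chunks = rhs.chunks_exact(LANES);
    let mut o_chunks = out.chunks_exact_mut(LANES);

    // "Vector" body: one iteration per group of LANES words.
    for ((l, r), o) in (&mut l_chunks).zip(&mut r_chunks).zip(&mut o_chunks) {
        for i in 0..LANES {
            o[i] = op(l[i], r[i]);
        }
    }

    // Scalar tail: the words that did not fill a whole chunk.
    let l_tail = l_chunks.remainder();
    let r_tail = r_chunks.remainder();
    let o_tail = o_chunks.into_remainder();
    for ((l, r), o) in l_tail.iter().zip(r_tail).zip(o_tail.iter_mut()) {
        *o = op(*l, *r);
    }
}
```

Calling it with six words and the closure |x, y| x & y processes one chunk of four words in the body and the remaining two in the tail. Passing the operation as a closure mirrors how a single generic kernel could serve AND, OR and XOR.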
§Memory Access Patterns
- Vectorised loads process multiple words per memory operation
- Sequential access patterns optimise cache utilisation
- Aligned access where possible for maximum performance
- Streaming patterns for large bitmask operations
§Specialised Algorithms
§Population Count (Popcount)
Uses SIMD reduction for optimal performance:
    let counts = simd_vector.count_ones();
    total += counts.reduce_sum() as usize;
§Equality Testing
Leverages SIMD comparison operations:
    let eq_mask = vector_a.simd_eq(vector_b);
    if !eq_mask.all() { return false; }
§Functions
- all_eq_mask_simd: Vectorised equality test across entire bitmask windows with early termination optimisation.
- all_false_mask_simd: Returns true if all bits in the mask are unset (0).
- all_ne_mask_simd: Tests if all corresponding bits between two bitmask windows are different.
- all_true_mask_simd: Returns true if all bits in the mask are set (1).
- and_masks_simd: Performs vectorised bitwise AND operation between two bitmask windows.
- bitmask_binop_simd: Primitive bit ops. Performs vectorised bitwise binary operations (AND/OR/XOR) with configurable lane counts.
- bitmask_unop_simd: Performs vectorised bitwise unary operations (NOT) with configurable lane counts.
- eq_mask_simd: Produces a bitmask where each output bit is 1 iff the corresponding bits of a and b are equal.
- in_mask_simd: Bitwise “in” for boolean bitmasks: each output bit is true if the lhs bit is in the set of bits in rhs.
- ne_mask_simd: Performs vectorised bitwise inequality comparison between two bitmask windows.
- not_in_mask_simd: Performs vectorised bitwise “not in” membership test for boolean bitmasks.
- not_mask_simd: Performs vectorised bitwise NOT operation on a bitmask window.
- or_masks_simd: Performs vectorised bitwise OR operation between two bitmask windows.
- popcount_mask_simd: Vectorised population count (number of set bits) with SIMD reduction for optimal performance.
- xor_masks_simd: Performs vectorised bitwise XOR operation between two bitmask windows.
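To make the semantics described in the list above concrete, here is a minimal scalar reference sketch of population count, whole-mask equality with early exit, and the eq/ne masks. It is not the module's code: the function names (popcount_ref, all_eq_ref, eq_mask_ref, ne_mask_ref), the plain &[u64] plus bit-length signatures, and the least-significant-bit-first ordering within each word are all assumptions made for illustration. The real kernels take bitmask windows and use std::simd.

```rust
/// Mask selecting the low `bits` bits of a word (`bits` in 1..=64).
fn low_bits(bits: usize) -> u64 {
    if bits == 64 { u64::MAX } else { (1u64 << bits) - 1 }
}

/// Counts set bits among the first `len` bits of `words`
/// (assumes LSB-first bit order within each word).
pub fn popcount_ref(words: &[u64], len: usize) -> usize {
    let (full, rem) = (len / 64, len % 64);
    let mut total: usize = words[..full]
        .iter()
        .map(|w| w.count_ones() as usize)
        .sum();
    if rem != 0 {
        // Ignore bits past `len` in the final partial word.
        total += (words[full] & low_bits(rem)).count_ones() as usize;
    }
    total
}

/// True if the first `len` bits of `a` and `b` agree.
/// Stops at the first differing full word.
pub fn all_eq_ref(a: &[u64], b: &[u64], len: usize) -> bool {
    let (full, rem) = (len / 64, len % 64);
    if a[..full].iter().zip(&b[..full]).any(|(x, y)| x != y) {
        return false;
    }
    rem == 0 || ((a[full] ^ b[full]) & low_bits(rem)) == 0
}

/// Output bit is 1 where the inputs agree (XNOR); trailing bits cleared.
pub fn eq_mask_ref(a: &[u64], b: &[u64], len: usize) -> Vec<u64> {
    let n_words = (len + 63) / 64;
    let mut out: Vec<u64> = a[..n_words]
        .iter()
        .zip(&b[..n_words])
        .map(|(x, y)| !(x ^ y))
        .collect();
    if len % 64 != 0 {
        out[n_words - 1] &= low_bits(len % 64);
    }
    out
}

/// Output bit is 1 where the inputs differ (XOR); trailing bits cleared.
pub fn ne_mask_ref(a: &[u64], b: &[u64], len: usize) -> Vec<u64> {
    let n_words = (len + 63) / 64;
    let mut out: Vec<u64> = a[..n_words]
        .iter()
        .zip(&b[..n_words])
        .map(|(x, y)| x ^ y)
        .collect();
    if len % 64 != 0 {
        out[n_words - 1] &= low_bits(len % 64);
    }
    out
}
```

Masking the final partial word matters in every function here: bits beyond the logical length may hold arbitrary values and must not affect the count or the comparison.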