SIMD-accelerated filter kernels returning u64 bitmasks.
Each kernel compares a column slice against a scalar and returns a
packed Vec<u64> where bit i is set iff the predicate holds for
element i. One u64 word covers 64 rows.
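To make the bitmask layout concrete, here is a minimal scalar sketch of a greater-than kernel over u32 values. The function name filter_gt_u32_scalar and the bit order (row i maps to bit i % 64 of word i / 64, least significant bit first) are assumptions for illustration, not the crate's confirmed API or layout.

```rust
/// Hypothetical scalar kernel (name and bit order are assumptions):
/// sets bit `i` of the result when `values[i] > threshold`.
pub fn filter_gt_u32_scalar(values: &[u32], threshold: u32) -> Vec<u64> {
    // One u64 word covers 64 rows, so round the row count up to whole words.
    let mut mask = vec![0u64; (values.len() + 63) / 64];
    for (i, &v) in values.iter().enumerate() {
        if v > threshold {
            // Row i lives in word i / 64, at bit position i % 64 (LSB first).
            mask[i / 64] |= 1u64 << (i % 64);
        }
    }
    mask
}
```

With this layout, a 65-row column needs two words, and a match on row 64 sets the lowest bit of the second word.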
Runtime CPU detection selects the fastest path:
- AVX-512 (512-bit, 16 u32 / 8 f64|i64 per op)
- AVX2 (256-bit, 8 u32 / 4 f64|i64 per op)
- Scalar fallback (auto-vectorized by LLVM)
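The selection order above could be sketched with the standard library's feature-detection macro. The SimdLevel enum and both function names are hypothetical, and checking "avx512f" alone is an assumption; the crate may require additional AVX-512 subsets.

```rust
/// Hypothetical dispatch level; the crate's real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SimdLevel {
    Avx512,
    Avx2,
    Scalar,
}

/// Pick the widest instruction set the running CPU reports.
pub fn detect_simd_level() -> SimdLevel {
    // Feature detection is only meaningful on x86 targets.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return SimdLevel::Avx512;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return SimdLevel::Avx2;
        }
    }
    SimdLevel::Scalar
}

/// u32 lanes per vector operation: 512 / 32 and 256 / 32.
pub fn lanes_u32(level: SimdLevel) -> usize {
    match level {
        SimdLevel::Avx512 => 16,
        SimdLevel::Avx2 => 8,
        SimdLevel::Scalar => 1,
    }
}
```

A runtime would typically run this check once and cache the result, so that each kernel call pays only for a branch on the cached level.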
Companion helpers: popcount, bitmask_and, bitmask_or, bitmask_to_indices.
Structs§
- FilterSimdRuntime - SIMD runtime for filter-to-bitmask operations.
Functions§
- bitmask_all - Create an all-ones bitmask for row_count rows.
- bitmask_and - Bitwise AND of two equal-length bitmasks, SIMD-accelerated.
- bitmask_not - Bitwise NOT of a bitmask (flips all bits up to row_count).
- bitmask_or - Bitwise OR of two bitmasks, SIMD-accelerated. If the slices differ in length, the longer tail is copied as-is.
- bitmask_to_indices - Expand bitmask to a selection vector of row indices.
- filter_runtime - Get the global filter SIMD runtime.
- popcount - Count total set bits across a bitmask.
- words_for - Number of u64 words needed for row_count bits.
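
The helper semantics described above can be sketched in plain scalar Rust. Every function below carries a _sketch suffix to mark it as an illustration: the signatures, the usize index type, and the LSB-first bit order are assumptions, and the crate's real versions are SIMD-accelerated.

```rust
/// Number of u64 words needed for `row_count` bits.
pub fn words_for_sketch(row_count: usize) -> usize {
    (row_count + 63) / 64
}

/// All-ones mask for `row_count` rows; unused high bits of the last word stay 0.
pub fn bitmask_all_sketch(row_count: usize) -> Vec<u64> {
    let mut mask = vec![u64::MAX; words_for_sketch(row_count)];
    let tail = row_count % 64;
    if tail != 0 {
        if let Some(last) = mask.last_mut() {
            *last = (1u64 << tail) - 1;
        }
    }
    mask
}

/// Bitwise AND of two masks (assumed equal length; extra words are dropped).
pub fn bitmask_and_sketch(a: &[u64], b: &[u64]) -> Vec<u64> {
    a.iter().zip(b).map(|(x, y)| x & y).collect()
}

/// Bitwise OR; the tail of the longer slice is copied as-is.
pub fn bitmask_or_sketch(a: &[u64], b: &[u64]) -> Vec<u64> {
    let (long, short) = if a.len() >= b.len() { (a, b) } else { (b, a) };
    let mut out = long.to_vec();
    for (o, s) in out.iter_mut().zip(short) {
        *o |= *s;
    }
    out
}

/// Bitwise NOT that only flips bits below `row_count`.
pub fn bitmask_not_sketch(mask: &[u64], row_count: usize) -> Vec<u64> {
    bitmask_all_sketch(row_count)
        .iter()
        .zip(mask)
        .map(|(all, m)| !m & all)
        .collect()
}

/// Total number of set bits across the mask.
pub fn popcount_sketch(mask: &[u64]) -> usize {
    mask.iter().map(|w| w.count_ones() as usize).sum()
}

/// Expand a mask into ascending row indices (a selection vector).
pub fn bitmask_to_indices_sketch(mask: &[u64]) -> Vec<usize> {
    let mut out = Vec::with_capacity(popcount_sketch(mask));
    for (w, &word) in mask.iter().enumerate() {
        let mut bits = word;
        while bits != 0 {
            out.push(w * 64 + bits.trailing_zeros() as usize);
            bits &= bits - 1; // clear the lowest set bit
        }
    }
    out
}
```

Masking the last word in the NOT and all-ones helpers keeps padding bits at zero, so a later popcount or index expansion never reports rows beyond row_count.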