Module bitmask


§Bitmask Kernels Module - High-Performance Null-Aware Bitmask Operations

SIMD-optimised bitmask operations for Arrow-compatible nullable array processing with efficient null handling.

§Overview

This module provides the foundational bitmask operations that enable null-aware and bit-packed boolean computing throughout the minarrow ecosystem, and they apply equally to any other bitmasking context. The kernels cover bitwise logical operations, set membership tests, equality comparisons, and population counts on Arrow-format bitmasks, using word-level and SIMD implementations.

§Architecture

The bitmask module follows a three-tier architecture:

  • Dispatch layer: Selects between SIMD and scalar implementations based on feature flags
  • SIMD kernels: Vectorised implementations using std::simd with portable lane counts
  • Scalar kernels: High-performance but non-SIMD fallback implementations for compatibility

§Modules

  • dispatch: Dispatch layer selecting SIMD vs scalar implementations based on feature flags, resolved at compile time
  • simd: SIMD-accelerated implementations using vectorised bitwise operations with configurable lane counts
  • std: Scalar fallback implementations for word-level operations on 64-bit boundaries

§Core Operations

§Logical Operations

  • and_masks: Bitwise AND across two bitmasks for intersection operations
  • or_masks: Bitwise OR across two bitmasks for union operations
  • xor_masks: Bitwise XOR across two bitmasks for symmetric difference
  • not_mask: Bitwise NOT for complement operations
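
The logical operations listed above reduce to word-by-word bitwise operators. The following is a minimal scalar sketch over raw `u64` words, not the crate's implementation: the function names (`zip_words`, `and_words`, `or_words`, `xor_words`, `not_words`) and their signatures are hypothetical, and it assumes each mask is stored as exactly `ceil(len / 64)` words.

```rust
/// Hypothetical helper: combine two equally sized word slices with a bitwise op.
fn zip_words(lhs: &[u64], rhs: &[u64], op: impl Fn(u64, u64) -> u64) -> Vec<u64> {
    assert_eq!(lhs.len(), rhs.len(), "masks must have the same word count");
    lhs.iter().zip(rhs).map(|(&a, &b)| op(a, b)).collect()
}

/// Intersection: a bit is set only if it is set in both inputs.
fn and_words(lhs: &[u64], rhs: &[u64]) -> Vec<u64> {
    zip_words(lhs, rhs, |a, b| a & b)
}

/// Union: a bit is set if it is set in either input.
fn or_words(lhs: &[u64], rhs: &[u64]) -> Vec<u64> {
    zip_words(lhs, rhs, |a, b| a | b)
}

/// Symmetric difference: a bit is set if the inputs differ at that position.
fn xor_words(lhs: &[u64], rhs: &[u64]) -> Vec<u64> {
    zip_words(lhs, rhs, |a, b| a ^ b)
}

/// Complement. NOT also flips the slack bits past `len` in the last word,
/// so they are cleared again afterwards.
fn not_words(words: &[u64], len: usize) -> Vec<u64> {
    // Assumption: the mask occupies exactly ceil(len / 64) words.
    assert_eq!(words.len(), (len + 63) / 64);
    let mut out: Vec<u64> = words.iter().map(|&w| !w).collect();
    let used = len % 64;
    if used != 0 {
        if let Some(last) = out.last_mut() {
            *last &= (1u64 << used) - 1;
        }
    }
    out
}
```

For example, `and_words(&[0b1100], &[0b1010])` yields `[0b1000]`, and `not_words(&[0b1100], 4)` yields `[0b0011]` because only the low four bits belong to the mask.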

§Set Membership

  • in_mask: Set inclusion tests - output bits indicate membership of LHS values in RHS set
  • not_in_mask: Set exclusion tests - complement of inclusion operations

§Equality Testing

  • eq_mask: Element-wise equality comparison producing result bitmask
  • ne_mask: Element-wise inequality comparison producing result bitmask
  • all_eq: Bulk equality test across entire bitmask windows
  • all_ne: Bulk inequality test across entire bitmask windows
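
On bit-packed data, element-wise equality is an XNOR of the two inputs, and a bulk equality test only has to check that no in-range bit differs. The sketch below illustrates that idea with hypothetical helpers (`word_mask`, `eq_words`, `all_eq_words`); it assumes both masks hold exactly `ceil(len / 64)` words and is not the crate's actual code.

```rust
/// Hypothetical helper: mask of the bits in word `i` that lie below `len`.
fn word_mask(i: usize, n_words: usize, len: usize) -> u64 {
    let used = len % 64;
    if i + 1 == n_words && used != 0 {
        (1u64 << used) - 1
    } else {
        u64::MAX
    }
}

/// Element-wise equality: XNOR of the inputs, with slack bits cleared.
fn eq_words(lhs: &[u64], rhs: &[u64], len: usize) -> Vec<u64> {
    assert_eq!(lhs.len(), rhs.len(), "masks must have the same word count");
    let n = lhs.len();
    lhs.iter()
        .zip(rhs)
        .enumerate()
        .map(|(i, (&a, &b))| !(a ^ b) & word_mask(i, n, len))
        .collect()
}

/// Bulk equality: true when no bit below `len` differs between the inputs.
fn all_eq_words(lhs: &[u64], rhs: &[u64], len: usize) -> bool {
    assert_eq!(lhs.len(), rhs.len(), "masks must have the same word count");
    let n = lhs.len();
    lhs.iter()
        .zip(rhs)
        .enumerate()
        .all(|(i, (&a, &b))| ((a ^ b) & word_mask(i, n, len)) == 0)
}
```

Masking the final word means two masks that differ only in their slack bits still compare as equal.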

§Population Analysis

  • popcount_mask: Fast population count (number of set bits) using SIMD reduction
  • all_true_mask: Test if all bits in bitmask are set to 1
  • all_false_mask: Test if all bits in bitmask are set to 0
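
As a rough scalar illustration of the population checks above, the sketch below counts set bits with `u64::count_ones` and derives the all-true and all-false tests from that count. The names `popcount_words`, `all_true` and `all_false` are hypothetical, and the SIMD reduction mentioned in the list is not shown.

```rust
/// Count the set bits among the first `len` bits of `words`.
/// Slack bits in a partial final word are masked out before counting.
fn popcount_words(words: &[u64], len: usize) -> usize {
    let full = len / 64;
    let mut count: usize = words[..full]
        .iter()
        .map(|w| w.count_ones() as usize)
        .sum();
    let rem = len % 64;
    if rem != 0 {
        let tail = words[full] & ((1u64 << rem) - 1);
        count += tail.count_ones() as usize;
    }
    count
}

/// True when every one of the `len` bits is 1.
fn all_true(words: &[u64], len: usize) -> bool {
    popcount_words(words, len) == len
}

/// True when every one of the `len` bits is 0.
fn all_false(words: &[u64], len: usize) -> bool {
    popcount_words(words, len) == 0
}
```

A production version would likely return early on the first word that fails the check instead of counting every bit; the count-based form is used here only because it is short.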

§Arrow Compatibility

All operations maintain full compatibility with Apache Arrow’s bitmask format:

  • LSB bit ordering: Bit 0 is the least significant bit in each byte
  • Byte-packed storage: 8 bits per byte with proper alignment handling
  • Trailing bit management: Automatic masking of unused bits in final bytes
  • 64-bit word alignment: Optimised for modern CPU architectures
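
To make the layout concrete, here is a small sketch of LSB-first bit addressing and trailing-bit clearing on a plain byte buffer. The helpers `get_bit`, `set_bit` and `clear_trailing` are illustrative names operating on `&[u8]`, not functions exported by this module.

```rust
/// Read bit `i`: byte `i / 8`, bit position `i % 8` counted from the LSB.
fn get_bit(bytes: &[u8], i: usize) -> bool {
    ((bytes[i / 8] >> (i % 8)) & 1) == 1
}

/// Write bit `i` using the same LSB-first addressing.
fn set_bit(bytes: &mut [u8], i: usize, value: bool) {
    let mask = 1u8 << (i % 8);
    if value {
        bytes[i / 8] |= mask;
    } else {
        bytes[i / 8] &= !mask;
    }
}

/// Zero every bit at position >= `len`, including whole unused bytes.
fn clear_trailing(bytes: &mut [u8], len: usize) {
    let rem = len % 8;
    if rem != 0 {
        bytes[len / 8] &= (1u8 << rem) - 1;
    }
    let first_unused = (len + 7) / 8;
    for b in &mut bytes[first_unused..] {
        *b = 0;
    }
}
```

With this addressing, setting bits 0 and 9 of a two-byte buffer gives `[0b0000_0001, 0b0000_0010]`, and clearing the trailing bits of `[0xFF, 0xFF]` for a 10-bit mask leaves `[0xFF, 0b0000_0011]`.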

Modules§

dispatch
Bitmask Dispatch Module - Compile-Time SIMD/Scalar Selection for Bitmask Operations
simd
Bitmask SIMD Kernels - Vectorised High-Performance Bitmask Operations

Constants§

WORD_BITS
Number of bits in a Word for bit-level bitmask calculations.

Functions§

bitmask_window_bytes
Return window into bitmask’s bits slice covering offset..offset+len bits.
bitmask_window_bytes_mut
Return mutable window into bitmask’s bits slice covering offset..offset+len bits. Enables efficient in-place modification of bitmask regions.
clear_trailing_bits
Zero all slack bits ≥ bm.len.
mask_bits_as_words
Cast &[u8] to &[u64] for word-wise access.
mask_bits_as_words_mut
Cast &mut [u8] to &mut [u64] for word-wise mutation.
new_mask
Create zeroed bitmask of length len bits.
popcount_bits
Quick population count of true bits.
words_for
Helper to compute number of u64 words required for bitmask of len bits.

Type Aliases§

Word
Fundamental word type for bitmask operations on 64-bit architectures.