Module dispatch

§Arithmetic Dispatch Module - SIMD/Scalar Dispatch Layer for Arithmetic Operations

High-performance arithmetic kernel dispatcher that automatically selects between SIMD and scalar implementations based on data alignment and feature flags.

§Overview

  • Dual-path execution: SIMD-accelerated path with scalar fallback for unaligned data
  • Type-specific dispatch: Optimised kernels for integers (i32/i64/u32/u64), floats (f32/f64), and datetime types
  • Null-aware operations: Arrow-compatible null mask propagation and handling
  • Build-time SIMD lanes: Lane counts determined at build time based on target architecture

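The sketch below is a minimal, self-contained illustration of the dual-path strategy described above; it is not the module's actual kernel code (the real SIMD branches are generated per element type and lane width), and the function name is made up for the example.

```rust
/// Dual-path dispatch sketch: take the SIMD route only when both inputs meet
/// the 64-byte alignment requirement, otherwise run the always-correct scalar
/// loop. Illustrative only; the scalar body stands in for the vector kernel.
fn add_i32_dual_path(lhs: &[i32], rhs: &[i32]) -> Vec<i32> {
    assert_eq!(lhs.len(), rhs.len());
    let aligned = |s: &[i32]| (s.as_ptr() as usize) % 64 == 0;

    if aligned(lhs) && aligned(rhs) {
        // Real kernels would issue W32-lane vector adds here (behind the
        // simd feature flag); a scalar body keeps the sketch dependency-free.
        lhs.iter().zip(rhs).map(|(a, b)| a.wrapping_add(*b)).collect()
    } else {
        // Scalar fallback: correct for any alignment.
        lhs.iter().zip(rhs).map(|(a, b)| a.wrapping_add(*b)).collect()
    }
}
```
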
§Supported Operations

  • Basic arithmetic: Add, subtract, multiply, divide, remainder, power
  • Fused multiply-add (FMA): Hardware-accelerated a * b + c operations for floats
  • Datetime arithmetic: Temporal operations with integer kernel delegation

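For the FMA entry points listed under Functions, the per-element operation is a * b + acc. A scalar reference version is shown below; the slice-based shape is an assumption for illustration only, and `mul_add` is used because it contracts to a single hardware FMA instruction where the target supports one.

```rust
/// Scalar reference for fused multiply-add over three equal-length slices:
/// out[i] = a[i] * b[i] + acc[i].
fn fma_f32_scalar(a: &[f32], b: &[f32], acc: &[f32]) -> Vec<f32> {
    assert!(a.len() == b.len() && b.len() == acc.len());
    a.iter()
        .zip(b)
        .zip(acc)
        .map(|((&x, &y), &z)| x.mul_add(y, z))
        .collect()
}
```
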
§Performance Strategy

  • The SIMD path requires 64-byte-aligned input data; minarrow’s Vec64 provides this alignment automatically
  • Scalar fallback ensures correctness regardless of input alignment

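To make the alignment requirement concrete: the fast path can only be taken when a buffer's backing storage starts on a 64-byte boundary, which a check like the one below detects (illustrative only; the crate's actual gating logic is not shown on this page). Buffers allocated through minarrow's Vec64 satisfy it by construction, while a plain Vec gives no such guarantee.

```rust
/// Returns true when the slice's backing storage starts on a 64-byte boundary,
/// i.e. it is suitable for the aligned loads the SIMD path assumes.
fn is_64_byte_aligned<T>(data: &[T]) -> bool {
    (data.as_ptr() as usize) % 64 == 0
}

fn main() {
    // A plain Vec<i32> only guarantees align_of::<i32>() = 4 bytes, so this
    // may print either true or false depending on what the allocator returns.
    let v: Vec<i32> = (0..1024).collect();
    println!("64-byte aligned: {}", is_64_byte_aligned(&v));
}
```
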
Constants§

W8
SIMD lane count for 8-bit elements (u8, i8), auto-generated by build.rs. Determined at build time based on target architecture capabilities, or overridden via SIMD_LANES_OVERRIDE.
W16
SIMD lane count for 16-bit elements (u16, i16). Determined at build time based on target architecture capabilities, or overridden via SIMD_LANES_OVERRIDE.
W32
SIMD lane count for 32-bit elements (u32, i32, f32). Determined at build time based on target architecture capabilities, or overridden via SIMD_LANES_OVERRIDE.
W64
SIMD lane count for 64-bit elements (u64, i64, f64). Determined at build time based on target architecture capabilities, or overridden via SIMD_LANES_OVERRIDE.

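Lane-count constants like these typically shape a kernel as a chunked main loop plus a scalar tail. The sketch below shows only that structure, with a hypothetical stand-in value in place of the generated W32 and a scalar body in place of the real vector operation:

```rust
/// How a compile-time lane count drives a kernel: process full lanes in the
/// main loop, then finish the remainder element-by-element. In the real SIMD
/// path each chunk would become one W32-wide vector add.
const LANES_32: usize = 8; // stand-in; the real W32 is emitted by build.rs

fn add_f32_chunked(lhs: &[f32], rhs: &[f32], out: &mut [f32]) {
    assert!(lhs.len() == rhs.len() && rhs.len() == out.len());
    let mut chunks = out.chunks_exact_mut(LANES_32);
    let mut i = 0;
    for chunk in &mut chunks {
        for (j, o) in chunk.iter_mut().enumerate() {
            *o = lhs[i + j] + rhs[i + j];
        }
        i += LANES_32;
    }
    // Tail: fewer than LANES_32 elements remain, handled scalar-style.
    for (j, o) in chunks.into_remainder().iter_mut().enumerate() {
        *o = lhs[i + j] + rhs[i + j];
    }
}
```
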
Functions§

apply_datetime_i32
apply_datetime_i64
apply_datetime_u32
apply_datetime_u64
apply_float_f32
Performs element-wise float ArithmeticOperator on &[f32] using SIMD (W32 lanes) for dense and masked cases, falling back to standard scalar ops when the simd feature is not enabled. Returns FloatArray<f32> and handles an optional null-mask.
apply_float_f64
Performs element-wise float ArithmeticOperator on &[f64] using SIMD (W64 lanes) for dense and masked cases, falling back to standard scalar ops when the simd feature is not enabled. Returns FloatArray<f64> and handles an optional null-mask.
apply_fma_f32
Performs element-wise fused multiply-add (a * b + acc) on &[f32] using SIMD (W32 lanes) for dense and masked cases, falling back to standard scalar ops when the simd feature is not enabled. Returns FloatArray<f32>.
apply_fma_f64
Performs element-wise fused multiply-add (a * b + acc) on &[f64] using SIMD (W64 lanes) for dense and masked cases, falling back to standard scalar ops when the simd feature is not enabled. Returns FloatArray<f64>.
apply_int_i8
Performs element-wise integer ArithmeticOperator over two &[i8], SIMD-accelerated using W8 lanes if available, otherwise falls back to scalar. Returns IntegerArray<i8> with appropriate null-mask handling.
apply_int_i16
Performs element-wise integer ArithmeticOperator over two &[i16], SIMD-accelerated using W16 lanes if available, otherwise falls back to scalar. Returns IntegerArray<i16> with appropriate null-mask handling.
apply_int_i32
Performs element-wise integer ArithmeticOperator over two &[i32], SIMD-accelerated using W32 lanes if available, otherwise falls back to scalar. Returns IntegerArray<i32> with appropriate null-mask handling.
apply_int_i64
Performs element-wise integer ArithmeticOperator over two &[i64], SIMD-accelerated using W64 lanes if available, otherwise falls back to scalar. Returns IntegerArray<i64> with appropriate null-mask handling.
apply_int_u8
Performs element-wise integer ArithmeticOperator over two &[u8], SIMD-accelerated using W8 lanes if available, otherwise falls back to scalar. Returns IntegerArray<u8> with appropriate null-mask handling.
apply_int_u16
Performs element-wise integer ArithmeticOperator over two &[u16], SIMD-accelerated using W16 lanes if available, otherwise falls back to scalar. Returns IntegerArray<u16> with appropriate null-mask handling.
apply_int_u32
Performs element-wise integer ArithmeticOperator over two &[u32], SIMD-accelerated using W32 lanes if available, otherwise falls back to scalar. Returns IntegerArray<u32> with appropriate null-mask handling.
apply_int_u64
Performs element-wise integer ArithmeticOperator over two &[u64], SIMD-accelerated using W64 lanes if available, otherwise falls back to scalar. Returns IntegerArray<u64> with appropriate null-mask handling.
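All of the entry points above mention null-mask handling; in Arrow-style kernels the usual rule for a binary operation is that an output element is valid only where both inputs are valid, i.e. the validity masks are AND-ed. The sketch below models that propagation with plain bool slices rather than the crate's packed bitmask type, purely for illustration.

```rust
/// Arrow-style null propagation for a binary kernel: the result is valid at
/// position i only if both inputs are valid there. Validity is modelled as
/// bool slices for clarity; real kernels operate on packed bitmaps.
fn and_validity(lhs: Option<&[bool]>, rhs: Option<&[bool]>, len: usize) -> Option<Vec<bool>> {
    match (lhs, rhs) {
        (None, None) => None, // neither side has nulls: no output mask needed
        (Some(l), None) => Some(l.to_vec()),
        (None, Some(r)) => Some(r.to_vec()),
        (Some(l), Some(r)) => Some((0..len).map(|i| l[i] && r[i]).collect()),
    }
}
```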