Module runtime_dispatch


Centralized ISA runtime dispatch codegen for OxiFFT SIMD codelets.

This module generates cached runtime ISA dispatchers that extend the inline dispatchers in super with an AtomicU8-based ISA level cache.

§Motivation

The basic dispatchers emitted by super::gen_dispatcher call is_x86_feature_detected! / is_aarch64_feature_detected! on every invocation. Each probe is cheap (typically a read of already-cached CPU feature bits rather than a fresh CPUID query), but a hot codelet invoked millions of times per second may still benefit from the cached path, which replaces the repeated feature probes with a single AtomicU8 load.

§Priority order (high → low)

x86_64: AVX-512F > AVX2+FMA > AVX > SSE2 > scalar
aarch64: NEON > scalar
other: scalar
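
The priority order above can be sketched as a chain of feature probes that returns at the first match. This is a minimal illustration and not the crate's actual detect_host_isa: the function name detect_best_isa_sketch and the numeric values of the ISA constants are assumptions, since this page names the constants but does not state their values.

```rust
// Hypothetical level values; the documentation above does not give them.
#[allow(dead_code)]
mod isa {
    pub const ISA_SCALAR: u8 = 0;
    pub const ISA_SSE2: u8 = 1;
    pub const ISA_AVX: u8 = 2;
    pub const ISA_AVX2_FMA: u8 = 3;
    pub const ISA_AVX512: u8 = 4;
    pub const ISA_NEON: u8 = 5;
}
#[allow(unused_imports)]
use isa::*;

/// Probe the host CPU from the highest-priority ISA down to scalar.
/// Hypothetical helper written for this example only.
pub fn detect_best_isa_sketch() -> u8 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return ISA_AVX512;
        }
        // AVX2 and FMA are separate feature bits, so both are checked.
        if std::arch::is_x86_feature_detected!("avx2")
            && std::arch::is_x86_feature_detected!("fma")
        {
            return ISA_AVX2_FMA;
        }
        if std::arch::is_x86_feature_detected!("avx") {
            return ISA_AVX;
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            return ISA_SSE2;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return ISA_NEON;
        }
    }
    // Any other architecture, or no SIMD feature found.
    ISA_SCALAR
}
```

The `#[cfg]` blocks are needed because the feature-detection macros only compile on their own target architecture.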

§Generated code shape

For each (size, precision) pair, the proc-macro emits:

  • ISA level constants (ISA_SCALAR, ISA_SSE2, … ISA_UNDETECTED)
  • A static DETECTED_ISA_{size}_{TY}: AtomicU8 initialized to ISA_UNDETECTED
  • A private detect_isa_{size}_{ty}() -> u8 function that probes the CPU once
  • A public {fn_name}_cached(data, sign) dispatcher that reads the cache first
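
For size 4 and f32, the four items listed above might look roughly like the following hand-written sketch. It is not the macro's actual output: the constant values, the dispatcher name codelet_4_f32_cached, the stand-in kernels, the argument types of data and sign, and the choice of Relaxed ordering are all assumptions made for illustration, and only two ISA levels are shown to keep it short.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical level values; the real ones are not shown on this page.
const ISA_SCALAR: u8 = 0;
const ISA_SSE2: u8 = 1;
const ISA_UNDETECTED: u8 = u8::MAX;

// Per-(size, precision) cache, starting out as "not yet detected".
static DETECTED_ISA_4_F32: AtomicU8 = AtomicU8::new(ISA_UNDETECTED);

// Simplified probe covering only SSE2 and scalar.
fn detect_isa_4_f32() -> u8 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("sse2") {
            return ISA_SSE2;
        }
    }
    ISA_SCALAR
}

// Stand-in kernels: they only scale by `sign` so the example has an
// observable effect. Real codelets would compute a size-4 FFT.
fn kernel_scalar(data: &mut [f32], sign: i32) {
    for x in data.iter_mut() {
        *x *= sign as f32;
    }
}

fn kernel_sse2(data: &mut [f32], sign: i32) {
    kernel_scalar(data, sign)
}

/// Cached dispatcher: one atomic load on the hot path, one probe on first use.
pub fn codelet_4_f32_cached(data: &mut [f32], sign: i32) {
    let mut level = DETECTED_ISA_4_F32.load(Ordering::Relaxed);
    if level == ISA_UNDETECTED {
        // Threads racing here may each run the probe, but they all
        // store the same value, so the cache stays consistent.
        level = detect_isa_4_f32();
        DETECTED_ISA_4_F32.store(level, Ordering::Relaxed);
    }
    match level {
        ISA_SSE2 => kernel_sse2(data, sign),
        _ => kernel_scalar(data, sign),
    }
}
```

After the first call, every later call skips detect_isa_4_f32 entirely and goes straight from the atomic load to the match.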

§Proc-macro entry

// Generates a cached dispatcher for size-4 f32.
gen_dispatcher_codelet!(size = 4, ty = f32);

Re-exports§

pub use super::multi_transform::Precision;

Structs§

DispatcherConfig
Configuration for a cached runtime ISA dispatcher codelet.

Constants§

ISA_AVX
ISA level for pure AVX (no FMA, no AVX2).
ISA_AVX2_FMA
ISA level for AVX2 + FMA.
ISA_AVX512
ISA level for AVX-512F.
ISA_NEON
ISA level for NEON (aarch64).
ISA_SCALAR
ISA level for scalar fallback.
ISA_SSE2
ISA level for SSE2.
ISA_UNDETECTED
Sentinel: ISA not yet detected (stored in the AtomicU8 before first call).

Functions§

detect_host_isa
Detect the best ISA available on the current host at runtime.
generate_dispatcher
Generate a cached runtime ISA dispatcher TokenStream.
generate_from_macro
Entry point for the gen_dispatcher_codelet! proc-macro.