#[multiwidth]
Available on crate feature `macros` only.
Generate width-specialized SIMD code.
This macro takes a module containing width-agnostic SIMD code and generates specialized versions for each target width (SSE, AVX2, AVX-512).
§Usage
use archmage::multiwidth;
#[multiwidth]
mod kernels {
    // Inside this module, these types are available:
    // - f32xN, i32xN, etc. (width-appropriate SIMD types)
    // - Token (the token type: X64V3Token for SSE/AVX2, or X64V4Token for AVX-512)
    // - LANES_F32, LANES_32, etc. (lane count constants)
    use archmage::simd::*;

    pub fn normalize(token: Token, data: &mut [f32]) {
        for chunk in data.chunks_exact_mut(LANES_F32) {
            let v = f32xN::load(token, chunk.try_into().unwrap());
            let result = v * f32xN::splat(token, 1.0 / 255.0);
            result.store(chunk.try_into().unwrap());
        }
    }
}
// Generated modules:
// - kernels::sse::normalize(token: X64V3Token, data: &mut [f32])
// - kernels::avx2::normalize(token: X64V3Token, data: &mut [f32])
// - kernels::avx512::normalize(token: X64V4Token, data: &mut [f32]) // if avx512 feature
// - kernels::normalize(data: &mut [f32]) // runtime dispatcher
§Selective Targets
You can specify which targets to generate:
#[multiwidth(avx2, avx512)] // Only AVX2 and AVX-512, no SSE
mod fast_kernels { ... }
§How It Works
- The macro duplicates the module content for each width target
- Each copy imports from the appropriate namespace (`archmage::simd::sse`, etc.)
- The `use archmage::simd::*` statement is rewritten to the width-specific import
- A dispatcher function is generated that picks the best available target at runtime
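The steps above can be approximated in plain Rust without the macro. The sketch below is only an illustration, not the macro's actual expansion: the `per_width!` helper, the scalar loop standing in for the SIMD body, the return values, and the feature-detection order in the dispatcher are all assumptions made for this example. It uses only the standard library.

```rust
// Hypothetical helper: stamps out one copy of the same body per width target,
// mimicking how the attribute macro duplicates the module content.
macro_rules! per_width {
    ($($name:ident => $lanes:expr),* $(,)?) => {
        $(
            pub mod $name {
                // Stand-in for the lane-count constant each copy would import.
                pub const LANES_F32: usize = $lanes;

                /// Scalar stand-in for the SIMD body. Processes full chunks
                /// only and returns how many chunks were handled.
                pub fn normalize(data: &mut [f32]) -> usize {
                    let mut chunks = 0;
                    for chunk in data.chunks_exact_mut(LANES_F32) {
                        for x in chunk.iter_mut() {
                            *x *= 1.0 / 255.0;
                        }
                        chunks += 1;
                    }
                    chunks
                }
            }
        )*
    };
}

per_width!(sse => 4, avx2 => 8, avx512 => 16);

/// Hypothetical dispatcher: runs the widest copy the CPU reports support for
/// and returns the name of the copy it chose. The detection strings and their
/// order are assumptions, not the crate's documented behavior.
pub fn normalize(data: &mut [f32]) -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx512f") {
            avx512::normalize(data);
            return "avx512";
        }
        if std::is_x86_feature_detected!("avx2") {
            avx2::normalize(data);
            return "avx2";
        }
    }
    // Fallback: the narrowest copy (a scalar loop in this sketch).
    sse::normalize(data);
    "sse"
}
```

Calling `normalize(&mut data)` on a slice whose length is a multiple of 16 scales every element regardless of which copy is chosen. For other lengths the copies differ in how much of the tail they leave untouched, because each one only processes full chunks.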
§Requirements
- Functions should use `Token` as their token parameter type
- Use `f32xN`, `i32xN`, etc. for SIMD types (not concrete types like `f32x8`)
- Use `LANES_F32`, `LANES_32`, etc. for lane counts
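One reason to derive everything from the lane-count constant is that the same body stays correct at every width, including the handling of leftover elements. The sketch below is an assumption-laden illustration in plain Rust: the const generic `LANES` stands in for `LANES_F32`, the inner scalar loop stands in for an `f32xN` load, multiply, and store, and `normalize_with_tail` is a hypothetical name, not part of the crate.

```rust
/// Scales every element by 1/255, processing `LANES` elements at a time and
/// then finishing the leftover tail one element at a time.
/// `LANES` must be non-zero, because `chunks_exact_mut` panics on a chunk size of 0.
pub fn normalize_with_tail<const LANES: usize>(data: &mut [f32]) {
    let scale = 1.0 / 255.0;
    let mut chunks = data.chunks_exact_mut(LANES);

    // Full chunks: in width-agnostic SIMD code this is where the vector
    // load, multiply, and store would go.
    for chunk in &mut chunks {
        for x in chunk.iter_mut() {
            *x *= scale;
        }
    }

    // Tail: the elements that did not fill a whole chunk.
    for x in chunks.into_remainder() {
        *x *= scale;
    }
}
```

Because the chunk size and the tail both follow from `LANES`, `normalize_with_tail::<4>` and `normalize_with_tail::<16>` produce the same result on the same input, which is the property a width-agnostic kernel needs.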