Attribute Macro multiwidth 

#[multiwidth]
Available on crate feature macros only.

Generate width-specialized SIMD code.

This macro takes a module of width-agnostic SIMD code and generates one specialized copy per target width (SSE, AVX2, AVX-512), plus a dispatcher that selects among them at runtime.

§Usage

use archmage::multiwidth;

#[multiwidth]
mod kernels {
    // Inside this module, these types are available:
    // - f32xN, i32xN, etc. (width-appropriate SIMD types)
    // - Token (the token type: X64V3Token for SSE/AVX2, or X64V4Token for AVX-512)
    // - LANES_F32, LANES_32, etc. (lane count constants)

    use archmage::simd::*;

    pub fn normalize(token: Token, data: &mut [f32]) {
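        // chunks_exact_mut yields only full LANES_F32-sized chunks;
        // any trailing elements are left untouched by this loop.
        for chunk in data.chunks_exact_mut(LANES_F32) {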
            let v = f32xN::load(token, chunk.try_into().unwrap());
            let result = v * f32xN::splat(token, 1.0 / 255.0);
            result.store(chunk.try_into().unwrap());
        }
    }
}

// Generated modules:
// - kernels::sse::normalize(token: X64V3Token, data: &mut [f32])
// - kernels::avx2::normalize(token: X64V3Token, data: &mut [f32])
// - kernels::avx512::normalize(token: X64V4Token, data: &mut [f32])  // only if the avx512 feature is enabled
// - kernels::normalize(data: &mut [f32])  // runtime dispatcher
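The usage example iterates with chunks_exact_mut, which visits only whole chunks, so a slice whose length is not a multiple of the lane count keeps its last few elements unchanged. The following is a minimal scalar sketch of one way to cover that tail. It uses only the standard library; the constant value 8 and the function name normalize_with_tail are assumptions made for illustration and are not part of archmage.

```rust
/// Hypothetical stand-in for the LANES_F32 constant that the macro provides.
/// 8 corresponds to a 256-bit vector of f32; the real value depends on the target.
const LANES_F32: usize = 8;

/// Scales every element by 1/255, including elements past the last full chunk.
/// The inner loops are scalar placeholders for the SIMD load/multiply/store.
pub fn normalize_with_tail(data: &mut [f32]) {
    let scale: f32 = 1.0 / 255.0;

    let mut chunks = data.chunks_exact_mut(LANES_F32);

    // Full chunks: this is where the vector path would run.
    for chunk in chunks.by_ref() {
        for x in chunk.iter_mut() {
            *x *= scale;
        }
    }

    // Remainder: fewer than LANES_F32 elements, handled one at a time.
    for x in chunks.into_remainder() {
        *x *= scale;
    }
}
```

Whether the tail should be processed with scalar code, a narrower vector, or padding is a design choice for the caller; the sketch shows only the scalar option.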

§Selective Targets

You can specify which targets to generate:

#[multiwidth(avx2, avx512)]  // Only AVX2 and AVX-512, no SSE
mod fast_kernels { ... }

§How It Works

  1. The macro duplicates the module content for each width target
  2. Each copy imports from the appropriate namespace (archmage::simd::sse, etc.)
  3. The use archmage::simd::* statement is rewritten to the width-specific import
  4. A dispatcher function is generated that selects the best target the CPU supports at runtime
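The steps above can be pictured with a small declarative macro. This is a sketch under stated assumptions, not archmage's actual expansion: per_width, has_avx2, and the scalar function bodies are hypothetical, and the only real API used is the standard library's is_x86_feature_detected! macro.

```rust
// Hypothetical illustration of "duplicate the module once per width".
// Each generated module gets the same body and a different lane count.
macro_rules! per_width {
    ($($name:ident => $lanes:expr),+ $(,)?) => {
        $(
            pub mod $name {
                /// Lane count for this copy (stand-in for LANES_F32).
                pub const LANES_F32: usize = $lanes;

                /// Identical source in every copy; only LANES_F32 differs.
                /// The inner loop is a scalar placeholder for SIMD code.
                pub fn normalize(data: &mut [f32]) {
                    for chunk in data.chunks_exact_mut(LANES_F32) {
                        for x in chunk.iter_mut() {
                            *x *= 1.0 / 255.0;
                        }
                    }
                }
            }
        )+
    };
}

per_width!(sse => 4, avx2 => 8);

/// Reports AVX2 support on x86 targets using the standard library's detection macro.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx2() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
}

/// On other architectures the wider copy is never selected.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx2() -> bool {
    false
}

/// Hypothetical dispatcher: checks CPU features and forwards to the widest usable copy.
pub fn normalize(data: &mut [f32]) {
    if has_avx2() {
        avx2::normalize(data)
    } else {
        sse::normalize(data)
    }
}
```

Because the placeholder bodies are scalar, either copy is safe to call on any CPU here. In real SIMD code the feature check is what makes the wider path sound to call.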

§Requirements

  • Functions should use Token as their token parameter type
  • Use f32xN, i32xN, etc. for SIMD types (not concrete types like f32x8)
  • Use LANES_F32, LANES_32, etc. for lane counts
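The reason for these requirements is that the same source has to compile at every width. As a rough illustration using only the standard library, the sketch below writes a reduction against a lane-count parameter rather than a fixed number. The function sum_chunks and its const generic LANES are hypothetical stand-ins for code written with LANES_F32 and f32xN; they are not archmage APIs.

```rust
/// Sums a slice using LANES independent accumulators, mimicking how a
/// vector register of LANES floats would accumulate partial sums.
/// LANES must be non-zero, because chunks_exact panics on a chunk size of 0.
pub fn sum_chunks<const LANES: usize>(data: &[f32]) -> f32 {
    let mut acc = [0.0f32; LANES];

    let chunks = data.chunks_exact(LANES);
    // Elements that do not fill a whole chunk.
    let tail = chunks.remainder();

    for chunk in chunks {
        // One "vector add": each lane accumulates its own element.
        for (a, x) in acc.iter_mut().zip(chunk) {
            *a += *x;
        }
    }

    // Horizontal reduction of the lanes, then the scalar tail.
    acc.iter().sum::<f32>() + tail.iter().sum::<f32>()
}
```

Nothing in the body mentions 4, 8, or 16, so the same function can be instantiated as sum_chunks::<4>, sum_chunks::<8>, or sum_chunks::<16>. Hard-coding a concrete type such as f32x8 or a literal lane count would tie the code to one width.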