Attribute Macro arcane

#[arcane]

Available on crate feature macros only.

Mark a function as an arcane SIMD function.

This macro enables safe use of SIMD intrinsics by generating an inner function whose #[target_feature(enable = "...")] attributes are derived from the type of the token parameter. The outer function calls the inner function inside an unsafe block, which is justified because the token parameter proves the features are available.

The token is passed through to the inner function, so you can call other token-taking functions from inside #[arcane].
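The pattern described above can be sketched by hand with only the standard library. This is a minimal illustration, not the crate's implementation: Avx2Proof, summon, double, quadruple and quadruple_any are hypothetical names invented for the example, and the sketch assumes a token is only handed out after runtime feature detection succeeds.

```rust
/// Hypothetical stand-in for a token such as `Avx2Token` (not the crate's type).
/// In a real crate the private field would sit in its own module, so callers
/// could only obtain a value through `summon`, i.e. after the runtime check.
#[derive(Clone, Copy)]
pub struct Avx2Proof(());

impl Avx2Proof {
    /// Returns a token only when the running CPU reports AVX2.
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if std::arch::is_x86_feature_detected!("avx2") {
                return Some(Avx2Proof(()));
            }
        }
        None
    }
}

/// Safe outer function wrapping a `#[target_feature]` inner function.
#[cfg(target_arch = "x86_64")]
pub fn double(token: Avx2Proof, data: &[f32; 8]) -> [f32; 8] {
    use std::arch::x86_64::*;

    #[target_feature(enable = "avx2")]
    unsafe fn inner(_token: Avx2Proof, data: &[f32; 8]) -> [f32; 8] {
        let mut out = [0.0f32; 8];
        // SAFETY: both pointers cover exactly eight f32 values.
        unsafe {
            let v = _mm256_loadu_ps(data.as_ptr());
            _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(v, v));
        }
        out
    }

    // SAFETY: an `Avx2Proof` exists only if AVX2 was detected at runtime.
    unsafe { inner(token, data) }
}

/// On other architectures `summon` never returns a token, so this scalar
/// version is never reached in practice.
#[cfg(not(target_arch = "x86_64"))]
pub fn double(_token: Avx2Proof, data: &[f32; 8]) -> [f32; 8] {
    data.map(|x| x + x)
}

/// The token is `Copy` and passed along, so one token-taking function can
/// call another without repeating the detection.
pub fn quadruple(token: Avx2Proof, data: &[f32; 8]) -> [f32; 8] {
    double(token, &double(token, data))
}

/// Detect once at the boundary and fall back to scalar code otherwise.
pub fn quadruple_any(data: &[f32; 8]) -> [f32; 8] {
    match Avx2Proof::summon() {
        Some(token) => quadruple(token, data),
        None => data.map(|x| x * 4.0),
    }
}
```

In this sketch the only unsafe call sits behind a function that cannot be invoked without a token, which is the property the macro relies on.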

§Token Parameter Forms

The macro supports four forms of token parameters:

§Concrete Token Types

#[arcane]
fn process(token: Avx2Token, data: &[f32; 8]) -> [f32; 8] {
    // AVX2 intrinsics safe here
}

§impl Trait Bounds

#[arcane]
fn process(token: impl HasX64V2, data: &[f32; 8]) -> [f32; 8] {
    // Accepts any token with x86-64-v2 features (SSE4.2+)
}

§Generic Type Parameters

#[arcane]
fn process<T: HasX64V2>(token: T, data: &[f32; 8]) -> [f32; 8] {
    // Generic over any v2-capable token
}

// Also works with where clauses:
#[arcane]
fn process<T>(token: T, data: &[f32; 8]) -> [f32; 8]
where
    T: HasX64V2
{
    // ...
}

§Methods with Self Receivers

Methods with self, &self, or &mut self receivers are supported through the _self = Type argument, where Type is the type the method is implemented for. Inside the function body, write _self instead of self:

use archmage::{X64V3Token, arcane};
use wide::f32x8;

trait SimdOps {
    fn double(&self, token: X64V3Token) -> Self;
    fn square(self, token: X64V3Token) -> Self;
    fn scale(&mut self, token: X64V3Token, factor: f32);
}

impl SimdOps for f32x8 {
    #[arcane(_self = f32x8)]
    fn double(&self, _token: X64V3Token) -> Self {
        // Use _self instead of self in the body
        *_self + *_self
    }

    #[arcane(_self = f32x8)]
    fn square(self, _token: X64V3Token) -> Self {
        _self * _self
    }

    #[arcane(_self = f32x8)]
    fn scale(&mut self, _token: X64V3Token, factor: f32) {
        *_self = *_self * f32x8::splat(factor);
    }
}

Why _self? The macro generates an inner function where self becomes a regular parameter named _self. Using _self in your code reminds you that you’re not using the normal self keyword.

All receiver types are supported:

self (by value/move) → _self: Type
&self (shared reference) → _self: &Type
&mut self (mutable reference) → _self: &mut Type
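The receiver mapping above can be sketched by hand. The following is an assumption about the general shape of the generated code, not the macro's actual output: Lanes and Avx2Proof are hypothetical types for this example. In Rust, a function nested inside a method cannot refer to self or Self from the enclosing impl, which is presumably why the type has to be spelled out and the receiver renamed.

```rust
/// Hypothetical token; a value exists only after AVX2 was detected.
#[derive(Clone, Copy)]
pub struct Avx2Proof(());

impl Avx2Proof {
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if std::arch::is_x86_feature_detected!("avx2") {
                return Some(Avx2Proof(()));
            }
        }
        None
    }
}

/// Hypothetical eight-lane vector type standing in for something like `f32x8`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes(pub [f32; 8]);

impl Lanes {
    /// `&self` becomes `_self: &Lanes` in the inner function.
    pub fn double(&self, token: Avx2Proof) -> Self {
        #[cfg_attr(target_arch = "x86_64", target_feature(enable = "avx2"))]
        unsafe fn inner(_self: &Lanes, _token: Avx2Proof) -> Lanes {
            Lanes(_self.0.map(|x| x + x))
        }
        // SAFETY: the token proves AVX2 is available.
        unsafe { inner(self, token) }
    }

    /// `self` (by value) becomes `_self: Lanes`.
    pub fn square(self, token: Avx2Proof) -> Self {
        #[cfg_attr(target_arch = "x86_64", target_feature(enable = "avx2"))]
        unsafe fn inner(_self: Lanes, _token: Avx2Proof) -> Lanes {
            Lanes(_self.0.map(|x| x * x))
        }
        // SAFETY: the token proves AVX2 is available.
        unsafe { inner(self, token) }
    }

    /// `&mut self` becomes `_self: &mut Lanes`.
    pub fn scale(&mut self, token: Avx2Proof, factor: f32) {
        #[cfg_attr(target_arch = "x86_64", target_feature(enable = "avx2"))]
        unsafe fn inner(_self: &mut Lanes, _token: Avx2Proof, factor: f32) {
            for x in _self.0.iter_mut() {
                *x *= factor;
            }
        }
        // SAFETY: the token proves AVX2 is available.
        unsafe { inner(self, token, factor) }
    }
}
```

The bodies here use plain arithmetic rather than intrinsics; the point of the sketch is only how each receiver turns into an ordinary parameter named _self.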

§Multiple Trait Bounds

When using impl Trait or generic bounds with multiple traits, all required features are enabled:

#[arcane]
fn fma_kernel(token: impl HasX64V2 + HasNeon, data: &[f32; 8]) -> [f32; 8] {
    // Cross-platform: SSE4.2 on x86, NEON on ARM
}
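One plausible way to picture a function with bounds for more than one architecture is an inner function carrying one cfg_attr-gated #[target_feature] attribute per platform. The sketch below uses only the standard library and is an assumption about the general shape, not the macro's documented expansion. HasBaselineSimd, BaselineProof and sum8 are hypothetical names, and the feature strings "sse4.2" and "neon" are chosen to mirror the comment in the example above.

```rust
/// Hypothetical marker trait standing in for a bound such as
/// `HasX64V2 + HasNeon`. A real crate would seal it so that only genuine
/// token types can implement it.
pub trait HasBaselineSimd: Copy {}

/// Hypothetical token; handed out only after runtime detection succeeds.
#[derive(Clone, Copy)]
pub struct BaselineProof(());

impl HasBaselineSimd for BaselineProof {}

impl BaselineProof {
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if std::arch::is_x86_feature_detected!("sse4.2") {
                return Some(Self(()));
            }
        }
        #[cfg(target_arch = "aarch64")]
        {
            if std::arch::is_aarch64_feature_detected!("neon") {
                return Some(Self(()));
            }
        }
        None
    }
}

/// Safe outer function; the inner function enables whichever feature
/// applies to the architecture being compiled for.
pub fn sum8(token: impl HasBaselineSimd, data: &[f32; 8]) -> f32 {
    #[cfg_attr(target_arch = "x86_64", target_feature(enable = "sse4.2"))]
    #[cfg_attr(target_arch = "aarch64", target_feature(enable = "neon"))]
    unsafe fn inner<T: HasBaselineSimd>(_token: T, data: &[f32; 8]) -> f32 {
        data.iter().sum()
    }
    // SAFETY: a token implementing the bound exists only if the feature for
    // the current architecture was detected.
    unsafe { inner(token, data) }
}
```

On architectures other than x86_64 and aarch64, summon in this sketch returns None, so sum8 cannot be called there.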

§Expansion

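The macro expands to approximately the following, shown here for the concrete-token process function with a body that doubles each element: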

fn process(token: Avx2Token, data: &[f32; 8]) -> [f32; 8] {
    #[target_feature(enable = "avx2")]
    #[inline]
    fn __simd_inner_process(token: Avx2Token, data: &[f32; 8]) -> [f32; 8] {
        let v = unsafe { _mm256_loadu_ps(data.as_ptr()) };
        let doubled = _mm256_add_ps(v, v);
        let mut out = [0.0f32; 8];
        unsafe { _mm256_storeu_ps(out.as_mut_ptr(), doubled) };
        out
    }
    // SAFETY: Calling #[target_feature] fn from non-matching context.
    // Token proves the required features are available.
    unsafe { __simd_inner_process(token, data) }
}

§Profile Tokens

Profile tokens automatically enable all required features:

#[arcane]
fn kernel(token: X64V3Token, data: &mut [f32]) {
    // AVX2 + FMA + BMI1 + BMI2 intrinsics all safe here!
}

§Supported Tokens

x86_64 tiers: X64V2Token, X64V3Token / Desktop64 / Avx2FmaToken, X64V4Token / Avx512Token / Server64, X64V4xToken, Avx512Fp16Token
ARM: NeonToken / Arm64, Arm64V2Token, Arm64V3Token, NeonAesToken, NeonSha3Token, NeonCrcToken
WASM: Wasm128Token

§Supported Trait Bounds

x86_64 tiers: HasX64V2, HasX64V4
ARM: HasNeon, HasNeonAes, HasNeonSha3, HasArm64V2, HasArm64V3

Preferred: Use concrete tokens (X64V3Token, Desktop64, NeonToken) directly. Concrete token types also work as trait bounds (e.g., impl X64V3Token).

Not supported: SimdToken and IntoConcreteToken cannot be used as token bounds because they don’t map to any CPU features. The macro needs concrete features to generate #[target_feature] attributes.

§Options

§inline_always

Emit #[inline(always)] instead of #[inline] on the inner function. This can improve performance by forcing the inner function to be inlined into its caller, but it requires nightly Rust with #![feature(target_feature_inline_always)] enabled in the crate that uses the macro.

#![feature(target_feature_inline_always)]

#[arcane(inline_always)]
fn fast_kernel(token: Avx2Token, data: &mut [f32]) {
    // Inner function will use #[inline(always)]
}
