#[arcane]
Available on crate feature macros only.
Mark a function as an arcane SIMD function.
This macro enables safe use of SIMD intrinsics by generating an inner function
with the appropriate #[target_feature(enable = "...")] attributes based on
the token parameter type. The outer function calls the inner function unsafely,
which is justified because the token parameter proves the features are available.
The token is passed through to the inner function, so you can call other
token-taking functions from inside #[arcane].
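The pattern described above can be written out by hand. The following is a minimal sketch, not the archmage implementation: `DemoAvx2Token`, `try_new`, `double`, and `double_any` are hypothetical names introduced for illustration, and the token's runtime check uses the standard library's `is_x86_feature_detected!` macro. It shows why the `unsafe` call in the outer function can be justified: the token can only exist if the feature was detected.

```rust
// Hypothetical stand-in for a token type; NOT the archmage implementation.
// The private unit field prevents construction outside this module.
#[derive(Clone, Copy, Debug)]
pub struct DemoAvx2Token(());

impl DemoAvx2Token {
    /// Returns a token only if AVX2 is detected at runtime.
    pub fn try_new() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if std::arch::is_x86_feature_detected!("avx2") {
                return Some(DemoAvx2Token(()));
            }
        }
        None
    }
}

/// Safe outer function: holding a token is the proof that AVX2 is available.
#[cfg(target_arch = "x86_64")]
pub fn double(token: DemoAvx2Token, data: &[f32; 8]) -> [f32; 8] {
    // Inner function compiled with AVX2 enabled, as the macro would generate.
    #[target_feature(enable = "avx2")]
    #[inline]
    unsafe fn inner(_token: DemoAvx2Token, data: &[f32; 8]) -> [f32; 8] {
        use std::arch::x86_64::{_mm256_add_ps, _mm256_loadu_ps, _mm256_storeu_ps};
        let mut out = [0.0f32; 8];
        // SAFETY: both pointers refer to 8 valid f32 values; unaligned
        // load/store intrinsics are used, so no alignment is required.
        unsafe {
            let v = _mm256_loadu_ps(data.as_ptr());
            _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(v, v));
        }
        out
    }
    // SAFETY: a DemoAvx2Token can only be created after AVX2 was detected.
    unsafe { inner(token, data) }
}

/// Dispatch helper: AVX2 path when a token exists, scalar fallback otherwise.
pub fn double_any(data: &[f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if let Some(token) = DemoAvx2Token::try_new() {
            return double(token, data);
        }
    }
    (*data).map(|x| x + x)
}
```

Calling `double_any(&[1.0; 8])` takes the AVX2 path on machines that support it and the scalar path elsewhere. With #[arcane], the inner function and the `unsafe` call are generated for you, so only the token-taking signature has to be written.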
§Token Parameter Forms
The macro supports four forms of token parameters:
§Concrete Token Types
#[arcane]
fn process(token: Avx2Token, data: &[f32; 8]) -> [f32; 8] {
    // AVX2 intrinsics safe here
}
§impl Trait Bounds
#[arcane]
fn process(token: impl HasAvx2, data: &[f32; 8]) -> [f32; 8] {
    // Accepts any token that provides AVX2
}
§Generic Type Parameters
#[arcane]
fn process<T: HasAvx2>(token: T, data: &[f32; 8]) -> [f32; 8] {
    // Generic over any AVX2-capable token
}
// Also works with where clauses:
#[arcane]
fn process<T>(token: T, data: &[f32; 8]) -> [f32; 8]
where
    T: HasAvx2
{
    // ...
}
§Methods with Self Receivers
Methods with self, &self, or &mut self receivers are supported via the
_self = Type argument. Use _self in the function body instead of self:
use archmage::{HasAvx2, arcane};
use wide::f32x8;
trait Avx2Ops {
    fn double(&self, token: impl HasAvx2) -> Self;
    fn square(self, token: impl HasAvx2) -> Self;
    fn scale(&mut self, token: impl HasAvx2, factor: f32);
}
impl Avx2Ops for f32x8 {
    #[arcane(_self = f32x8)]
    fn double(&self, _token: impl HasAvx2) -> Self {
        // Use _self instead of self in the body
        *_self + *_self
    }
    #[arcane(_self = f32x8)]
    fn square(self, _token: impl HasAvx2) -> Self {
        _self * _self
    }
    #[arcane(_self = f32x8)]
    fn scale(&mut self, _token: impl HasAvx2, factor: f32) {
        *_self = *_self * f32x8::splat(factor);
    }
}
Why _self? The macro generates an inner function where self becomes
a regular parameter named _self. Using _self in your code reminds you
that you’re not using the normal self keyword.
All receiver types are supported:
- self (by value/move) → _self: Type
- &self (shared reference) → _self: &Type
- &mut self (mutable reference) → _self: &mut Type
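The receiver mapping above can be illustrated with a hand-written equivalent. This is a sketch under stated assumptions, not the macro's actual output: `Lanes` and the `inner` functions are hypothetical, and the `#[target_feature]` attribute and token parameter are left out so the example runs on any target. It relies on one fact about Rust: a function nested inside a method cannot use `self` or `Self`, so the receiver has to be passed as an ordinary parameter with a concrete type. That is presumably why the macro needs the `_self = Type` argument.

```rust
// Hypothetical 4-lane value type standing in for a SIMD vector.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes(pub [f32; 4]);

impl Lanes {
    // `&self` receiver: the inner function takes `_self: &Lanes`.
    pub fn doubled(&self) -> Lanes {
        fn inner(_self: &Lanes) -> Lanes {
            Lanes(_self.0.map(|x| x + x))
        }
        inner(self)
    }

    // `self` receiver (by value): the inner function takes `_self: Lanes`.
    pub fn squared(self) -> Lanes {
        fn inner(_self: Lanes) -> Lanes {
            Lanes(_self.0.map(|x| x * x))
        }
        inner(self)
    }

    // `&mut self` receiver: the inner function takes `_self: &mut Lanes`.
    pub fn scale(&mut self, factor: f32) {
        fn inner(_self: &mut Lanes, factor: f32) {
            for x in _self.0.iter_mut() {
                *x *= factor;
            }
        }
        inner(self, factor)
    }
}
```

In each method the outer body only forwards `self`; all real work happens in `inner`, which is where the body you write under #[arcane] would end up.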
§Multiple Trait Bounds
When using impl Trait or generic bounds with multiple traits,
all required features are enabled:
#[arcane]
fn fma_kernel(token: impl HasAvx2 + HasFma, data: &[f32; 8]) -> [f32; 8] {
    // Both AVX2 and FMA intrinsics are safe here
}
§Expansion
The macro expands to approximately:
fn process(token: Avx2Token, data: &[f32; 8]) -> [f32; 8] {
    #[target_feature(enable = "avx2")]
    #[inline]
    unsafe fn __simd_inner_process(token: Avx2Token, data: &[f32; 8]) -> [f32; 8] {
        let v = unsafe { _mm256_loadu_ps(data.as_ptr()) };
        let doubled = _mm256_add_ps(v, v);
        let mut out = [0.0f32; 8];
        unsafe { _mm256_storeu_ps(out.as_mut_ptr(), doubled) };
        out
    }
    // SAFETY: Token proves the required features are available
    unsafe { __simd_inner_process(token, data) }
}
§Profile Tokens
Profile tokens automatically enable all required features:
#[arcane]
fn kernel(token: X64V3Token, data: &mut [f32]) {
    // AVX2 + FMA + BMI1 + BMI2 intrinsics all safe here!
}
§Supported Tokens
- x86_64: Sse2Token, Sse41Token, Sse42Token, AvxToken, Avx2Token, FmaToken, Avx2FmaToken, Avx512fToken, Avx512bwToken
- x86_64 profiles: X64V2Token, X64V3Token, X64V4Token
- ARM: NeonToken, SveToken, Sve2Token
- WASM: Simd128Token
§Supported Trait Bounds
- x86_64: HasSse, HasSse2, HasSse41, HasSse42, HasAvx, HasAvx2, HasAvx512f, HasAvx512vl, HasAvx512bw, HasAvx512vbmi2, HasFma
- ARM: HasNeon, HasSve, HasSve2
- Generic: Has128BitSimd, Has256BitSimd, Has512BitSimd
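To see how trait bounds let one function accept several token types, here is a small type-level sketch. Everything in it is hypothetical (`SketchHasAvx2`, `SketchHasFma`, the three `Sketch*Token` types, and both functions); it does not reproduce the crate's trait hierarchy and only models the idea that a profile-style token implements every capability trait it covers. The tokens are freely constructible unit structs purely for demonstration, whereas a real token would be handed out only after feature detection.

```rust
// Hypothetical capability marker traits, for illustration only.
pub trait SketchHasAvx2: Copy {}
pub trait SketchHasFma: Copy {}

// Hypothetical single-feature tokens.
#[derive(Clone, Copy)]
pub struct SketchAvx2Token;
#[derive(Clone, Copy)]
pub struct SketchFmaToken;

// Hypothetical profile-style token that covers several features at once.
#[derive(Clone, Copy)]
pub struct SketchV3Token;

impl SketchHasAvx2 for SketchAvx2Token {}
impl SketchHasFma for SketchFmaToken {}
impl SketchHasAvx2 for SketchV3Token {}
impl SketchHasFma for SketchV3Token {}

/// Accepts any token that provides the AVX2 capability.
pub fn needs_avx2(_token: impl SketchHasAvx2) -> &'static str {
    "avx2"
}

/// Accepts only tokens that provide both capabilities.
pub fn needs_avx2_and_fma(_token: impl SketchHasAvx2 + SketchHasFma) -> &'static str {
    "avx2+fma"
}

// `needs_avx2_and_fma(SketchAvx2Token)` would be rejected at compile time,
// because SketchAvx2Token does not implement SketchHasFma.
```

Both `needs_avx2(SketchAvx2Token)` and `needs_avx2(SketchV3Token)` compile, while only the profile-style token satisfies the combined bound. The compiler therefore enforces the capability requirement before any feature-specific code is reached.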
§Options
§inline_always
Use #[inline(always)] instead of #[inline] for the inner function.
This can improve performance by ensuring aggressive inlining, but requires
nightly Rust with #![feature(target_feature_inline_always)] enabled in
the crate using the macro.
#![feature(target_feature_inline_always)]
#[arcane(inline_always)]
fn fast_kernel(token: Avx2Token, data: &mut [f32]) {
    // Inner function will use #[inline(always)]
}