archmage
Safely invoke your intrinsic power, using the tokens granted to you by the CPU.
archmage provides zero-cost capability tokens that prove CPU features are available at runtime, making raw SIMD intrinsics safe to call via the #[arcane] macro.
Quick Start
[dependencies]
archmage = "0.2"
safe_unaligned_simd = "0.2" # For safe memory operations
use ;
use *;
How It Works
SIMD intrinsics are unsafe for two reasons:
1. Feature availability: Calling AVX2 instructions on a CPU without AVX2 is undefined behavior
2. Memory operations: Load/store intrinsics use raw pointers
archmage solves #1 with capability tokens - zero-sized types that can only be created after runtime CPU detection succeeds:
// summon() checks CPUID and returns Some only if features are available
if let Some = summon
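The token idea can be illustrated with a standalone sketch that uses only the standard library. `Avx2Proof`, `report`, and the private-field layout are hypothetical names chosen for this example; archmage's real types and detection logic may differ.

```rust
/// Hypothetical capability token (not archmage's real type).
/// It carries no data, so passing it around costs nothing at runtime.
#[derive(Clone, Copy, Debug)]
pub struct Avx2Proof {
    // Private field: code in other modules cannot build the token directly,
    // so the only way to obtain one is through `summon()`.
    _private: (),
}

impl Avx2Proof {
    /// Returns `Some` only when the running CPU reports AVX2.
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx2") {
                return Some(Self { _private: () });
            }
        }
        // Other architectures, or x86-64 CPUs without AVX2.
        None
    }
}

/// Picks a code path based on whether a token could be created.
pub fn report() -> &'static str {
    match Avx2Proof::summon() {
        Some(_token) => "AVX2 path",
        None => "fallback path",
    }
}
```

Because the token is zero-sized, a function that takes it as an argument has the same machine-level signature as one that does not; the check happens once, at `summon()`.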
The #[arcane] macro transforms your function to enable #[target_feature], which makes value-based intrinsics safe (Rust 1.85+):
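As a rough picture of the pattern such a macro automates, the following hand-written sketch pairs a `#[target_feature]` function with a safe wrapper that demands a token. All names here (`Avx2Proof`, `add_f32x8`, `add_f32x8_avx2`) are hypothetical, and the actual expansion of `#[arcane]` may look different. The explicit `unsafe` blocks keep the sketch compiling on older toolchains as well.

```rust
/// Hypothetical token standing in for a type like `Avx2Token`.
#[derive(Clone, Copy)]
pub struct Avx2Proof(());

impl Avx2Proof {
    /// Returns `Some` only if the running CPU reports AVX2.
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx2") {
                return Some(Self(()));
            }
        }
        None
    }
}

/// Feature-enabled body. Calling it is `unsafe` because the compiler cannot
/// know whether the CPU executing it really supports AVX2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn add_f32x8_avx2(a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    use std::arch::x86_64::*;
    let mut out = [0.0f32; 8];
    // SAFETY: each array holds exactly 8 f32 values, and the unaligned
    // load/store intrinsics have no alignment requirement.
    unsafe {
        let sum = _mm256_add_ps(_mm256_loadu_ps(a.as_ptr()), _mm256_loadu_ps(b.as_ptr()));
        _mm256_storeu_ps(out.as_mut_ptr(), sum);
    }
    out
}

/// Safe entry point: holding a token is the proof that AVX2 was detected.
#[cfg(target_arch = "x86_64")]
pub fn add_f32x8(_token: Avx2Proof, a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    // SAFETY: an `Avx2Proof` can only come from a successful `summon()`.
    unsafe { add_f32x8_avx2(a, b) }
}
```

The single `unsafe` call in the wrapper is justified by the token argument, which is the part a macro can generate mechanically.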
For memory operations (#2), use the safe_unaligned_simd crate which provides reference-based alternatives.
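To show why references remove the second source of unsafety, here is a minimal sketch of reference-based load and store wrappers. `load_f32x4`, `store_f32x4`, and `double_all` are hypothetical helpers written for this example and are not the API of `safe_unaligned_simd`. The sketch uses SSE, which is part of the x86-64 baseline, so no token is needed here.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128, _mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};

/// Loads four f32 values through a reference instead of a raw pointer.
#[cfg(target_arch = "x86_64")]
pub fn load_f32x4(src: &[f32; 4]) -> __m128 {
    // SAFETY: `src` is a valid reference to exactly 16 readable bytes, and
    // `_mm_loadu_ps` has no alignment requirement.
    unsafe { _mm_loadu_ps(src.as_ptr()) }
}

/// Stores four f32 values through a mutable reference.
#[cfg(target_arch = "x86_64")]
pub fn store_f32x4(dst: &mut [f32; 4], value: __m128) {
    // SAFETY: `dst` is a valid, exclusive reference to 16 writable bytes.
    unsafe { _mm_storeu_ps(dst.as_mut_ptr(), value) }
}

/// Doubles every element, touching memory only through the wrappers above.
#[cfg(target_arch = "x86_64")]
pub fn double_all(input: &[f32; 4]) -> [f32; 4] {
    let v = load_f32x4(input);
    let mut out = [0.0f32; 4];
    // The `unsafe` block is only needed on toolchains where value-based
    // intrinsics are still declared `unsafe`.
    let doubled = unsafe { _mm_add_ps(v, v) };
    store_f32x4(&mut out, doubled);
    out
}

/// Scalar version so the sketch also builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn double_all(input: &[f32; 4]) -> [f32; 4] {
    [input[0] * 2.0, input[1] * 2.0, input[2] * 2.0, input[3] * 2.0]
}
```

The array reference encodes both validity and length in the type, which is exactly the information a raw pointer lacks.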
Token Reference
x86-64 Tokens
Start with Desktop64 for most applications:
| Token | Features | CPU Support |
|---|---|---|
| Desktop64 | AVX2 + FMA + BMI2 | Intel Haswell 2013+, AMD Zen 1 2017+ |
| X64V2Token | SSE4.2 + POPCNT | Intel Nehalem 2008+, AMD Bulldozer 2011+ |
| X64V3Token | AVX2 + FMA + BMI2 | Same as Desktop64 (alias) |
Individual feature tokens for fine-grained control:
| Token | Features |
|---|---|
| Avx2FmaToken | AVX2 + FMA |
| Avx2Token | AVX2 only |
| FmaToken | FMA only |
| AvxToken | AVX |
| Sse42Token | SSE4.2 |
| Sse41Token | SSE4.1 |
x86-64 AVX-512 Tokens (requires avx512 feature)
[dependencies]
archmage = { version = "0.2", features = ["avx512"] }
| Token | Features | CPU Support |
|---|---|---|
| X64V4Token | AVX-512 F/BW/CD/DQ/VL | Intel Skylake-X 2017+, AMD Zen 4 2022+ |
| Avx512ModernToken | + VBMI2, VNNI, BF16, etc. | Intel Ice Lake 2019+, AMD Zen 4+ |
| Avx512Fp16Token | + FP16 | Intel Sapphire Rapids 2023+ |
Note: Intel 12th-14th gen consumer CPUs do NOT have AVX-512.
ARM Tokens
| Token | Features | CPU Support |
|---|---|---|
| Arm64 | NEON | All AArch64 (baseline) |
| NeonToken | NEON | Same as Arm64 (alias) |
| NeonAesToken | NEON + AES | ARM with crypto extensions |
| NeonSha3Token | NEON + SHA3 | ARMv8.2+ |
| ArmCryptoToken | AES + SHA2 + CRC | Most ARMv8 CPUs |
| ArmCrypto3Token | + SHA3 | ARMv8.4+ (M1/M2/M3, Graviton 2+) |
WASM Tokens
| Token | Features |
|---|---|
| Simd128Token | WASM SIMD |
Token Hierarchy
Tokens form a hierarchy. Higher-level tokens can extract lower-level ones:
if let Some = summon
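A hierarchy like this can be modeled as an infallible conversion from a stronger token to a weaker one. The sketch below is a standalone illustration with hypothetical types (`V2Proof`, `V3Proof`) and a hypothetical method name (`v2`); the extraction methods archmage actually exposes may be named differently.

```rust
/// Hypothetical token for the SSE4.2 + POPCNT level.
#[derive(Clone, Copy, Debug)]
pub struct V2Proof(());

/// Hypothetical token for the AVX2 + FMA + BMI2 level.
#[derive(Clone, Copy, Debug)]
pub struct V3Proof(());

impl V2Proof {
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("sse4.2") && is_x86_feature_detected!("popcnt") {
                return Some(Self(()));
            }
        }
        None
    }
}

impl V3Proof {
    /// Checks the v3 features and, explicitly, the v2 features they build on.
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx2")
                && is_x86_feature_detected!("fma")
                && is_x86_feature_detected!("bmi2")
                && is_x86_feature_detected!("sse4.2")
                && is_x86_feature_detected!("popcnt")
            {
                return Some(Self(()));
            }
        }
        None
    }

    /// Infallible downgrade: no new CPU check is needed, because every
    /// feature the lower token promises was already verified above.
    pub fn v2(self) -> V2Proof {
        V2Proof(())
    }
}
```

The downgrade returns a plain value rather than an `Option`, which is what makes extraction cheaper than summoning the lower token again.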
Trait Bounds
Use trait bounds for generic SIMD code:
use ;
// Accept any token with at least v2 features
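The following standalone sketch shows one way such a bound could work: a marker trait implemented by every token at or above a given level, and a generic function that accepts any of them. `HasV2Level`, `V2Proof`, `V3Proof`, and `popcount` are hypothetical names for this illustration and are not archmage's definitions.

```rust
/// Marker trait: "this token proves at least SSE4.2 + POPCNT".
/// It is `unsafe` to implement because a wrong impl would let safe code
/// reach instructions the CPU may not support.
pub unsafe trait HasV2Level: Copy {}

#[derive(Clone, Copy)]
pub struct V2Proof(());
#[derive(Clone, Copy)]
pub struct V3Proof(());

unsafe impl HasV2Level for V2Proof {}
unsafe impl HasV2Level for V3Proof {}

impl V2Proof {
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("sse4.2") && is_x86_feature_detected!("popcnt") {
                return Some(Self(()));
            }
        }
        None
    }
}

impl V3Proof {
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("popcnt") {
                return Some(Self(()));
            }
        }
        None
    }
}

/// Body compiled with POPCNT enabled, so `count_ones` can use the instruction.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "popcnt")]
unsafe fn popcount_impl(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}

/// Portable body so the sketch builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
unsafe fn popcount_impl(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}

/// Accepts any token that proves at least the v2 level.
pub fn popcount<T: HasV2Level>(_token: T, words: &[u64]) -> u32 {
    // SAFETY: `T: HasV2Level` guarantees POPCNT was detected at runtime.
    unsafe { popcount_impl(words) }
}
```

Callers holding a stronger token can pass it directly, with no conversion step, because the bound is satisfied by every token that implements the marker trait.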
Available traits:
| Trait | Meaning |
|---|---|
| SimdToken | Base trait for all tokens |
| HasX64V2 | Has SSE4.2 + POPCNT |
| HasX64V4 | Has AVX-512 (requires avx512 feature) |
| Has128BitSimd | Has 128-bit vectors |
| Has256BitSimd | Has 256-bit vectors |
| Has512BitSimd | Has 512-bit vectors |
| HasNeon | Has ARM NEON |
| HasNeonAes | Has NEON + AES |
| HasNeonSha3 | Has NEON + SHA3 |
Cross-Platform Code
All tokens compile on all platforms. summon() returns None on unsupported architectures:
use ;
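A common shape for cross-platform code is a single public function that tries to summon a token and falls back to scalar code otherwise. The sketch below is self-contained and uses a hypothetical `Avx2Proof` token rather than archmage's own types; `sum`, `sum_scalar`, and `sum_avx2` are names invented for this example.

```rust
#[derive(Clone, Copy)]
pub struct Avx2Proof(());

impl Avx2Proof {
    /// `None` on non-x86-64 targets and on x86-64 CPUs without AVX2.
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("avx2") {
                return Some(Self(()));
            }
        }
        None
    }
}

/// Portable fallback.
fn sum_scalar(data: &[f32]) -> f32 {
    data.iter().sum()
}

/// AVX2 body: accumulates 8 lanes at a time, then adds the leftover tail.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[f32]) -> f32 {
    use std::arch::x86_64::*;
    let chunks = data.chunks_exact(8);
    let tail: f32 = chunks.remainder().iter().sum();
    let mut lanes = [0.0f32; 8];
    // SAFETY: every chunk has exactly 8 elements and `lanes` has room for 8.
    unsafe {
        let mut acc = _mm256_setzero_ps();
        for chunk in chunks {
            acc = _mm256_add_ps(acc, _mm256_loadu_ps(chunk.as_ptr()));
        }
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    }
    lanes.iter().sum::<f32>() + tail
}

#[cfg(target_arch = "x86_64")]
fn sum_with_token(_token: Avx2Proof, data: &[f32]) -> f32 {
    // SAFETY: the token proves AVX2 was detected.
    unsafe { sum_avx2(data) }
}

/// On other targets the token can never exist, so this is never reached.
#[cfg(not(target_arch = "x86_64"))]
fn sum_with_token(_token: Avx2Proof, data: &[f32]) -> f32 {
    sum_scalar(data)
}

/// Public entry point that compiles and runs on every platform.
pub fn sum(data: &[f32]) -> f32 {
    match Avx2Proof::summon() {
        Some(token) => sum_with_token(token, data),
        None => sum_scalar(data),
    }
}
```

Note that the vectorized path adds elements in a different order than the scalar path, so results can differ in the last bits for general floating-point input.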
SIMD Types
archmage provides token-gated SIMD types with ergonomic operators:
use ;
if let Some = summon
Available Types
| Width | Float | Signed Int | Unsigned Int | Token Required |
|---|---|---|---|---|
| 128-bit | f32x4, f64x2 | i8x16, i16x8, i32x4, i64x2 | u8x16, u16x8, u32x4, u64x2 | Sse41Token |
| 256-bit | f32x8, f64x4 | i8x32, i16x16, i32x8, i64x4 | u8x32, u16x16, u32x8, u64x4 | Avx2FmaToken |
| 512-bit | f32x16, f64x8 | i8x64, i16x32, i32x16, i64x8 | u8x64, u16x32, u32x16, u64x8 | Avx512Token |
Operations
- Construction (requires token): splat, from_array, load, zero
- Extraction: to_array, as_array, store, raw
- Arithmetic: +, -, *, / and assignment variants
- Bitwise: &, |, ^ and assignment variants
- Math (float): sqrt, abs, floor, ceil, round, min, max, clamp, mul_add, mul_sub, recip, rsqrt
- Transcendentals (float): log2_lowp, log2_midp, exp2_lowp, exp2_midp, ln_lowp, ln_midp, exp_lowp, exp_midp, pow_lowp, pow_midp, cbrt_midp
- Comparison: simd_eq, simd_ne, simd_lt, simd_le, simd_gt, simd_ge
- Reduction: reduce_add, reduce_min, reduce_max
- Integer: shl::<N>, shr::<N>, shr_arithmetic::<N>
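The split between token-gated construction and token-free operators can be sketched with a small standalone type. This is an illustration of the API shape only: `F32x4` here is backed by a plain array rather than a vector register so that it builds everywhere, and `Sse41Proof` is a hypothetical token, so none of this should be read as archmage's implementation.

```rust
use std::ops::{Add, Mul};

/// Hypothetical token for the SSE4.1 level.
#[derive(Clone, Copy)]
pub struct Sse41Proof(());

impl Sse41Proof {
    pub fn summon() -> Option<Self> {
        #[cfg(target_arch = "x86_64")]
        {
            if is_x86_feature_detected!("sse4.1") {
                return Some(Self(()));
            }
        }
        None
    }
}

/// Illustrative 4-lane float vector. Array-backed in this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F32x4([f32; 4]);

impl F32x4 {
    /// Construction requires a token, so a value of this type can only
    /// exist after feature detection has succeeded.
    pub fn splat(_token: Sse41Proof, value: f32) -> Self {
        Self([value; 4])
    }

    pub fn from_array(_token: Sse41Proof, lanes: [f32; 4]) -> Self {
        Self(lanes)
    }

    /// Extraction and reductions need no token: the value itself is the proof.
    pub fn to_array(self) -> [f32; 4] {
        self.0
    }

    pub fn reduce_add(self) -> f32 {
        self.0.iter().sum()
    }
}

impl Add for F32x4 {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Self(std::array::from_fn(|i| self.0[i] + rhs.0[i]))
    }
}

impl Mul for F32x4 {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        Self(std::array::from_fn(|i| self.0[i] * rhs.0[i]))
    }
}
```

With this shape, an expression such as `a * b + a` needs no token at the call site, because both operands could only have been built with one.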
Feature Flags
| Feature | Description |
|---|---|
| std (default) | Standard library support |
| macros (default) | #[arcane] macro |
| avx512 | AVX-512 tokens |
| __composite | Transpose, dot product (unstable) |
| __wide | wide crate integration (unstable) |
Testing Fallback Paths
Set ARCHMAGE_DISABLE=1 to force summon() to return None:
ARCHMAGE_DISABLE=1
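For intuition, an environment override of this kind could be wired into detection roughly as follows. This is a guess at the general mechanism, not archmage's code: `Avx2Proof`, `detection_disabled`, and `summon_with_override` are hypothetical, and the rule for which values count as "disabled" is an assumption of this sketch.

```rust
/// Hypothetical token, standing in for any token type.
#[derive(Clone, Copy, Debug)]
pub struct Avx2Proof(());

/// Pure decision function, kept separate so it is easy to unit test.
/// Assumption: any non-empty value other than "0" disables detection.
pub fn detection_disabled(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty() && v != "0")
}

/// Detection with the override value passed in explicitly.
pub fn summon_with_override(value: Option<&str>) -> Option<Avx2Proof> {
    if detection_disabled(value) {
        // Forced fallback: behave as if the CPU lacked the feature.
        return None;
    }
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return Some(Avx2Proof(()));
        }
    }
    None
}

/// Detection that consults the process environment.
pub fn summon() -> Option<Avx2Proof> {
    let value = std::env::var("ARCHMAGE_DISABLE").ok();
    summon_with_override(value.as_deref())
}
```

Passing the override as a parameter lets tests exercise the fallback path without mutating the process environment, which is awkward to do safely in multi-threaded test runners.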
License
MIT OR Apache-2.0
AI-Generated Code Notice
Developed with Claude (Anthropic). Review critical paths before production use.