//! # archmage
//!
//! > Safely invoke your intrinsic power, using the tokens granted to you by the CPU.
//! > Cast primitive magics faster than any mage alive.
//!
//! archmage provides capability tokens that prove CPU feature availability at runtime,
//! making raw SIMD intrinsics safe to call via the `#[arcane]` macro.
//!
//! ## Quick Example
//!
//! ```rust,ignore
//! use archmage::{X64V3Token, SimdToken, arcane};
//!
//! #[arcane(import_intrinsics)]
//! fn multiply_add(_token: X64V3Token, a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
//!     // import_intrinsics brings all intrinsics + safe memory ops into scope
//!     let va = _mm256_loadu_ps(a);  // Takes &[f32; 8], not *const f32
//!     let vb = _mm256_loadu_ps(b);
//!
//!     // Value-based intrinsics are SAFE inside #[arcane]! (Rust 1.85+)
//!     let result = _mm256_fmadd_ps(va, vb, va);
//!
//!     let mut out = [0.0f32; 8];
//!     _mm256_storeu_ps(&mut out, result);
//!     out
//! }
//!
//! fn main() {
//!     // X64V3Token: AVX2 + FMA + BMI2 (Haswell 2013+, Zen 1+)
//!     // CPUID check elided if compiled with -C target-cpu=native
//!     if let Some(token) = X64V3Token::summon() {
//!         let result = multiply_add(token, &[1.0; 8], &[2.0; 8]);
//!     }
//! }
//! ```
//!
//! ## Auto-Imports
//!
//! `import_intrinsics` is the recommended default — it injects
//! `archmage::intrinsics::{arch}::*` into the function body, giving you all
//! platform types, value intrinsics, and safe memory ops in one import:
//!
//! ```rust,ignore
//! use archmage::{X64V3Token, SimdToken, arcane};
//!
//! #[arcane(import_intrinsics)]
//! fn load(_token: X64V3Token, data: &[f32; 8]) -> __m256 {
//!     _mm256_loadu_ps(data)  // Safe! Takes &[f32; 8], not *const f32.
//! }
//! ```
//!
//! The prelude (`use archmage::prelude::*`) is still available for module-level imports.
//! See the [`prelude`] module for full documentation.
//!
//! ## How It Works
//!
//! **Capability Tokens** are zero-sized proof types created via `summon()`, which
//! checks CPUID at runtime (elided if compiled with target features enabled).
//! See [`token-registry.toml`](https://github.com/imazen/archmage/blob/main/token-registry.toml)
//! for the complete mapping of tokens to CPU features.
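//!
//! As a rough illustration only (this is not archmage's implementation), the
//! token idea can be sketched with the standard library alone. `Avx2FmaProof`
//! is a hypothetical stand-in for a real token such as `X64V3Token`; the sketch
//! assumes a token is a zero-sized type whose only constructor performs the
//! feature check:
//!
//! ```rust
//! /// Hypothetical stand-in for a capability token (not an archmage type).
//! /// The private field keeps code outside this module from constructing it.
//! #[derive(Clone, Copy, Debug)]
//! pub struct Avx2FmaProof {
//!     _private: (),
//! }
//!
//! impl Avx2FmaProof {
//!     /// Returns `Some` only when the running CPU reports AVX2 and FMA.
//!     pub fn summon() -> Option<Self> {
//!         #[cfg(target_arch = "x86_64")]
//!         {
//!             if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
//!                 return Some(Self { _private: () });
//!             }
//!         }
//!         // Other architectures, or x86-64 CPUs without the features.
//!         None
//!     }
//! }
//! ```
//!
//! Because the type has no data, holding or passing a value of it should cost
//! nothing at runtime; its existence is the whole message.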
//!
//! **The `#[arcane]` and `#[rite]` macros** determine which `#[target_feature]`
//! attributes to emit. `#[arcane]` reads the token type from the function
//! signature. `#[rite]` takes the features from the token parameter, from a
//! single tier name (`#[rite(v3)]`, no token needed), or from several tier
//! names (`#[rite(v3, v4, neon)]`, which generates the suffixed variants
//! `fn_v3`, `fn_v4`, and `fn_neon`).
//!
//! Descriptive aliases are available for AI-assisted coding:
//! `#[token_target_features_boundary]` = `#[arcane]`,
//! `#[token_target_features]` = `#[rite]`,
//! `dispatch_variant!` = `incant!`.
//!
//! `#[arcane]` generates a sibling `#[target_feature]` function at the same
//! scope, plus a safe wrapper that calls it. Since both live in the same scope,
//! `self` and `Self` work naturally in methods. For trait impls, use
//! `#[arcane(_self = Type)]` (nested mode). On architectures that a token does
//! not target, the function is compiled out via `cfg` by default. Use `incant!`
//! for cross-arch dispatch.
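//!
//! The sibling-plus-wrapper shape can be approximated by hand. The following is
//! a simplified sketch under stated assumptions, not the macro's actual
//! expansion: `Proof`, `multiply_add_simd`, `multiply_add`, and
//! `multiply_add_or_scalar` are hypothetical names, and raw `std::arch` pointer
//! intrinsics stand in for archmage's safe memory ops.
//!
//! ```rust
//! /// Hypothetical zero-sized token; only `summon` constructs it.
//! #[derive(Clone, Copy)]
//! pub struct Proof(());
//!
//! impl Proof {
//!     pub fn summon() -> Option<Self> {
//!         #[cfg(target_arch = "x86_64")]
//!         {
//!             if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
//!                 return Some(Proof(()));
//!             }
//!         }
//!         None
//!     }
//! }
//!
//! /// Sibling function: compiled with AVX2 + FMA enabled, unsafe to call directly.
//! #[cfg(target_arch = "x86_64")]
//! #[target_feature(enable = "avx2,fma")]
//! unsafe fn multiply_add_simd(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
//!     use std::arch::x86_64::*;
//!     let mut out = [0.0f32; 8];
//!     // SAFETY: each array holds exactly eight f32 values, so the unaligned
//!     // 256-bit loads and the store stay in bounds.
//!     unsafe {
//!         let va = _mm256_loadu_ps(a.as_ptr());
//!         let vb = _mm256_loadu_ps(b.as_ptr());
//!         let r = _mm256_fmadd_ps(va, vb, va); // a * b + a, per lane
//!         _mm256_storeu_ps(out.as_mut_ptr(), r);
//!     }
//!     out
//! }
//!
//! /// Safe wrapper: the token parameter is the evidence that makes the call sound.
//! #[cfg(target_arch = "x86_64")]
//! pub fn multiply_add(_token: Proof, a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
//!     // SAFETY: a `Proof` exists only if `summon` saw AVX2 and FMA on this CPU.
//!     unsafe { multiply_add_simd(a, b) }
//! }
//!
//! /// Entry point from non-SIMD code, with a scalar path for other CPUs.
//! pub fn multiply_add_or_scalar(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
//!     #[cfg(target_arch = "x86_64")]
//!     {
//!         if let Some(token) = Proof::summon() {
//!             return multiply_add(token, a, b);
//!         }
//!     }
//!     // Scalar fallback computing the same a * b + a per element.
//!     let mut out = [0.0f32; 8];
//!     for i in 0..8 {
//!         out[i] = a[i].mul_add(b[i], a[i]);
//!     }
//!     out
//! }
//! ```
//!
//! With `#[arcane]`, a pair of this shape is generated from one annotated
//! function, so the hand-written `unsafe` in the sketch does not appear in
//! user code.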
//!
//! `#[rite]` applies `#[target_feature]` + `#[inline]` directly to the
//! function, with no wrapper and no boundary. It works in three modes:
//! - **Token-based** (`#[rite]`): reads the token from the function signature
//! - **Tier-based** (`#[rite(v3)]`): specifies features via tier name, no token needed
//! - **Multi-tier** (`#[rite(v3, v4, neon)]`): generates a suffixed copy for each tier
//!
//! **`#[rite]` should be your default.** Use `#[arcane]` only at entry points
//! (the first call from non-SIMD code). The token-based and tier-based modes
//! produce identical output; the token form can be easier to remember if you
//! already have the token in scope. The multi-tier mode generates one function
//! per tier, each compiled with different `#[target_feature]` attributes.
//!
//! Use concrete tokens like `X64V3Token` (AVX2+FMA) or `X64V4Token` (AVX-512).
//! For generic code, use tier traits like `HasX64V2` or `HasX64V4`.
//!
//! ## Safety
//!
//! Since Rust 1.85, value-based SIMD intrinsics (arithmetic, shuffle, compare,
//! bitwise) are safe inside `#[target_feature]` functions. Only pointer-based
//! memory operations remain unsafe — `import_intrinsics` handles this by
//! providing safe reference-based memory ops that shadow the pointer-based ones.
//!
//! Downstream crates can use `#![forbid(unsafe_code)]` when they combine
//! archmage tokens, the `#[arcane]`/`#[rite]` macros, and `import_intrinsics`.
//!
//! ## Feature Flags
//!
//! - `std` (default): enables standard library support
//! - `avx512`: enables AVX-512 token support
//!
//! Macros (`#[arcane]`, `#[rite]`, `incant!`, etc.) are always available.
extern crate std;
extern crate alloc;
// Re-export macros from archmage-macros
pub use ;
// Optimized feature detection
// Core token types and traits
// Prelude: one import for tokens, traits, macros, and all intrinsics
// Combined intrinsics namespace (core::arch + safe memory ops, safe wins)
// Test utilities for exhaustive token permutation testing
// SIMD types moved to magetypes crate
// Use `magetypes::simd` for f32x8, i32x4, etc.
// ============================================================================
// Re-exports at crate root for convenience
// ============================================================================
// Core traits
pub use CompileTimeGuaranteedError;
pub use DisableAllSimdError;
pub use IntoConcreteToken;
pub use SimdToken;
// Global SIMD kill switch
pub use dangerously_disable_tokens_except_wasm;
// Width marker traits (deprecated — use concrete tokens or tier traits)
pub use ;
// x86 tier marker traits (based on LLVM x86-64 microarchitecture levels)
pub use HasX64V2;
pub use HasX64V4;
// AArch64 tier marker traits
pub use ;
// All tokens available on all architectures (summon() returns None on wrong arch)
pub use ;
// AVX-512 tokens (always available; summon() returns None on unsupported CPUs)
pub use ;