//! One import for everything you need to write SIMD with archmage.
//!
//! ```rust,ignore
//! use archmage::prelude::*;
//! ```
//!
//! # What's in the prelude
//!
//! The prelude re-exports five categories of items. Together they let you write
//! complete SIMD code with a single `use` statement and zero `unsafe` blocks.
//!
//! ## 1. Traits
//!
//! - [`SimdToken`] — The core trait. Provides `summon()` for runtime detection
//! and `compiled_with()` for compile-time queries.
//! - [`IntoConcreteToken`] — Enables generic dispatch. Given an opaque token,
//! try to downcast it to a concrete platform token.
//! - Tier traits: [`HasX64V2`], [`HasX64V4`] (with `avx512`), [`HasNeon`],
//! [`HasNeonAes`], [`HasNeonSha3`] — For generic bounds when you want to
//! accept any token at a given capability level.
//! - Width traits: [`Has128BitSimd`], [`Has256BitSimd`], [`Has512BitSimd`] —
//! Deprecated; prefer concrete tokens instead.
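//!
//! For example, a tier trait lets one kernel accept any token at or above a
//! capability level (a sketch — the trait bound is real, the kernel body is
//! illustrative):
//!
//! ```rust,ignore
//! use archmage::prelude::*;
//!
//! // Accepts X64V2Token, X64V3Token, and anything else implementing HasX64V2.
//! fn kernel<T: HasX64V2>(_token: T, data: &[f32]) -> f32 {
//!     // ... SIMD work guarded by the token ...
//!     data.iter().sum()
//! }
//! ```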
//!
//! ## 2. Tokens
//!
//! All tokens compile on **all platforms**. On the wrong architecture,
//! `summon()` returns `None` and the token type is a zero-sized stub. This
//! means you rarely need `#[cfg(target_arch)]` in user code.
//!
//! **Friendly aliases:**
//! - [`Desktop64`] = [`X64V3Token`] — AVX2 + FMA (Haswell 2013+, Zen 1+)
//! - [`Server64`] = [`X64V4Token`] — + AVX-512 (with `avx512` feature)
//! - [`Arm64`] = [`NeonToken`] — NEON (all 64-bit ARM)
//!
//! Also includes: [`ScalarToken`] (always available), [`X64V2Token`],
//! [`X64CryptoToken`] (V2 + PCLMULQDQ + AES-NI),
//! [`X64V3CryptoToken`] (V3 + VPCLMULQDQ + VAES, Zen 3+/Alder Lake+),
//! [`Wasm128Token`], [`Wasm128RelaxedToken`],
//! [`NeonAesToken`], [`NeonSha3Token`], [`NeonCrcToken`],
//! and the AVX-512 tokens ([`Avx512Token`], [`X64V4xToken`],
//! [`Avx512Fp16Token`]) when the `avx512` feature is enabled.
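//!
//! Runtime detection is a plain `Option` check (a sketch; `summon()` comes
//! from [`SimdToken`]):
//!
//! ```rust,ignore
//! if let Some(token) = X64V3Token::summon() {
//!     // AVX2 + FMA confirmed at runtime; pass `token` to #[arcane] functions.
//! } else {
//!     // Missing features or wrong architecture: fall back to scalar code.
//! }
//! ```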
//!
//! ## 3. Macros
//!
//! Requires the `macros` feature (enabled by default).
//!
//! - [`arcane`] — The SIMD macro. Generates `#[target_feature]` wrappers
//! with automatic `#[cfg(target_arch)]` gating. Use for all SIMD functions.
//! - [`rite`] — Advanced alternative to `#[arcane]` for internal helpers.
//! Adds `#[target_feature]` + `#[inline]` directly (no wrapper). Three modes:
//! token-based (`#[rite]`), tier-based (`#[rite(v3)]` — no token needed),
//! or multi-tier (`#[rite(v3, v4, neon)]` — generates suffixed variants).
//! Optional — `#[arcane]` works for helpers too.
//! - [`incant!`] — Dispatch macro. Routes to suffixed functions (`_v3`, `_neon`,
//! `_scalar`, etc.) based on platform. See [How `incant!` works](#how-incant-works).
//! - [`autoversion`] — Auto-vectorization macro. Write plain scalar code with a
//! `SimdToken` placeholder; generates per-platform variants + runtime dispatcher.
//! No intrinsics, no SIMD types — the compiler auto-vectorizes each variant.
//! - [`magetypes`] — Type generation macro. Expands a single function into
//! per-platform variants with matching `#[cfg]` guards.
//!
//! ## 4. Platform intrinsics
//!
//! Re-exports `core::arch::{x86_64, aarch64, wasm32}::*` for the current
//! platform. Since Rust 1.87, **value-based intrinsics are safe inside
//! `#[target_feature]` functions**. This means arithmetic, shuffle, compare,
//! and bitwise intrinsics need no `unsafe` inside `#[arcane]`:
//!
//! ```rust,ignore
//! #[arcane]
//! fn add(_: X64V3Token, a: __m256, b: __m256) -> __m256 {
//!     _mm256_add_ps(a, b) // Safe! No unsafe needed.
//! }
//! ```
//!
//! ## 5. Safe memory operations
//!
//! The prelude re-exports `safe_unaligned_simd` memory ops that shadow
//! `core::arch`'s pointer-based versions. These take references instead of
//! raw pointers — e.g., `_mm256_loadu_ps` takes `&[f32; 8]` instead of
//! `*const f32`.
//!
//! **Everything works unqualified:**
//!
//! ```rust,ignore
//! use archmage::prelude::*;
//!
//! #[arcane(import_intrinsics)]
//! fn load(_token: X64V3Token, data: &[f32; 8]) -> __m256 {
//!     _mm256_loadu_ps(data) // Safe! Takes a reference, not a pointer.
//! }
//! ```
//!
//! The combined intrinsics module uses Rust's name resolution rules: explicit
//! `safe_unaligned_simd` re-exports shadow the glob `core::arch` imports, so
//! memory ops always resolve to the safe versions.
//!
//! # How `incant!` works
//!
//! `incant!` generates calls to suffixed variants, wrapping each in `#[cfg]`
//! gates that eliminate wrong-platform branches at compile time.
//!
//! By default, it dispatches to `_v4`, `_v3`, `_neon`, `_wasm128`, and
//! `_scalar`. You can specify explicit tiers:
//!
//! ```rust,ignore
//! incant!(func(data), [v1, v3, neon, scalar]) // explicit tiers (scalar required)
//! ```
//!
//! Known tiers: `v1`, `v2`, `x64_crypto`, `v3`, `v4`, `v4x`, `neon`, `neon_aes`,
//! `neon_sha3`, `neon_crc`, `arm_v2`, `arm_v3`, `wasm128`, `wasm128_relaxed`, `scalar`.
//!
//! **Required variants per platform (default tiers):**
//!
//! | Platform | Required | Optional |
//! |----------|----------|----------|
//! | x86_64 | `_v3`, `_scalar` | `_v4` (with `avx512` feature) |
//! | aarch64 | `_neon`, `_scalar` | — |
//! | wasm32 | `_wasm128`, `_scalar` | — |
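//!
//! Putting it together (a sketch — assumes, as in the `#[arcane]` example
//! above, that each SIMD variant takes its token as the first parameter):
//!
//! ```rust,ignore
//! #[arcane]
//! fn sum_v3(_: X64V3Token, data: &[f32]) -> f32 {
//!     // AVX2 implementation (elided; plain sum shown for brevity).
//!     data.iter().sum()
//! }
//!
//! fn sum_scalar(data: &[f32]) -> f32 {
//!     data.iter().sum()
//! }
//!
//! // Dispatches to sum_v3 where available, else sum_scalar. On aarch64 or
//! // wasm32 builds you would also supply sum_neon / sum_wasm128.
//! pub fn sum(data: &[f32]) -> f32 {
//!     incant!(sum(data), [v3, scalar])
//! }
//! ```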
//!
//! # Import styles
//!
//! ```rust,ignore
//! // Recommended: prelude gives you everything
//! use archmage::prelude::*;
//!
//! // Or use the combined intrinsics module directly:
//! use archmage::intrinsics::x86_64::*;
//!
//! // If you need the raw unsafe pointer version explicitly:
//! let v = unsafe { core::arch::x86_64::_mm256_loadu_ps(ptr) };
//! ```
// -- Traits --
pub use crate::HasX64V2;
pub use crate::HasX64V4;
pub use crate::IntoConcreteToken;
pub use crate::SimdToken;
pub use crate::{Has128BitSimd, Has256BitSimd, Has512BitSimd};
pub use crate::{HasNeon, HasNeonAes, HasNeonSha3};
// -- Tokens (all compile on all platforms; summon() returns None on wrong arch) --
pub use crate::ScalarToken;
pub use crate::{Arm64, Desktop64, Server64};
pub use crate::{NeonAesToken, NeonCrcToken, NeonSha3Token, NeonToken};
pub use crate::{Wasm128RelaxedToken, Wasm128Token};
pub use crate::{X64CryptoToken, X64V2Token, X64V3CryptoToken, X64V3Token};
#[cfg(feature = "avx512")]
pub use crate::{Avx512Fp16Token, Avx512Token, X64V4Token, X64V4xToken};
// -- Macros --
#[cfg(feature = "macros")]
pub use crate::{arcane, autoversion, incant, magetypes, rite};
// -- Platform intrinsics: core::arch types + value ops + safe memory ops --
// Uses the combined intrinsics module where safe_unaligned_simd's reference-based
// memory ops shadow core::arch's pointer-based versions automatically.
#[cfg(target_arch = "x86_64")]
pub use crate::intrinsics::x86_64::*;
#[cfg(target_arch = "aarch64")]
pub use crate::intrinsics::aarch64::*;
#[cfg(target_arch = "wasm32")]
pub use crate::intrinsics::wasm32::*;