//! SIMD on Intel.
//!
//! This module provides a safe and idiomatic API for writing vectorized code
//! for Intel processors, using the SSE, AVX, and/or AVX-512 instruction sets.
//! It ensures that the running CPU supports the instructions being executed,
//! making almost every SIMD instruction safe to use. SIMD vectors are typed
//! appropriately, and custom element types can be defined if necessary.
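//!
//! How this module performs its CPU checks is not shown here. As a rough
//! sketch of the general pattern, assuming only the standard library's
//! `is_x86_feature_detected!` macro and `std::arch` intrinsics, a safe wrapper
//! can gate an AVX2 code path behind a runtime check. The names `add_i32x8`
//! and `add_i32x8_avx2` are hypothetical and are not part of this module:
//!
//! ```rust
//! /// Hypothetical helper: adds two arrays of eight `i32`s lane-wise,
//! /// wrapping on overflow.
//! pub fn add_i32x8(a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
//!     #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
//!     {
//!         if is_x86_feature_detected!("avx2") {
//!             // SAFETY: AVX2 support was confirmed at runtime just above.
//!             return unsafe { add_i32x8_avx2(a, b) };
//!         }
//!     }
//!     // Scalar fallback for CPUs (or architectures) without AVX2.
//!     let mut out = [0i32; 8];
//!     for i in 0..8 {
//!         out[i] = a[i].wrapping_add(b[i]);
//!     }
//!     out
//! }
//!
//! /// AVX2 path. Callers must ensure the running CPU supports AVX2.
//! #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
//! #[target_feature(enable = "avx2")]
//! unsafe fn add_i32x8_avx2(a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
//!     #[cfg(target_arch = "x86")]
//!     use std::arch::x86::*;
//!     #[cfg(target_arch = "x86_64")]
//!     use std::arch::x86_64::*;
//!
//!     let mut out = [0i32; 8];
//!     // SAFETY: each pointer refers to 32 readable (or writable) bytes, and
//!     // the unaligned load/store intrinsics have no alignment requirement.
//!     unsafe {
//!         let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
//!         let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
//!         // VPADDD: lane-wise 32-bit addition with wrapping overflow.
//!         let sum = _mm256_add_epi32(va, vb);
//!         _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
//!     }
//!     out
//! }
//! ```
//!
//! The check and the `unsafe` call sit together inside one safe function, so
//! callers never have to reason about CPU support themselves.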
//!
//! This interface is aimed at programmers who are already experienced with
//! Intel's SIMD instructions. Every provided operation is expected to compile
//! to a specific instruction or sequence of instructions. Programmers are
//! expected to know which instructions are available and to design their
//! vectorized algorithms accordingly. For a higher-level vectorization API, see
//! a portable SIMD interface such as [`core::simd`].
//!
//! The following resources are crucial to this kind of SIMD programming:
//!
//! - [The Intel 64 and IA-32 Architectures Software Developer's Manual, Volume
//! 2][sdm-2] provides the complete reference documentation for every Intel
//! x86 instruction. It is incredibly useful for exploring the various SIMD
//! instruction sets and fully understanding complex SIMD instructions.
//!
//! [sdm-2]: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
//!
//! - [Félix Cloutier's x86 and amd64 instruction reference][fcl] is an online
//! copy of the Intel Software Developer's Manual, automatically constructed
//! by a "dump script". It is useful to quickly look up an instruction, but
//! should not be used as an authoritative source.
//!
//! [fcl]: https://www.felixcloutier.com/x86/
//!
//! - [The Intel Intrinsics Guide][guide] maps Intel C intrinsics to the
//! underlying instructions, and provides partial information about how each
//! instruction works and how it performs on some platforms. It is organized
//! by SIMD instruction set and is useful for exploring the instructions in
//! each set.
//!
//! [guide]: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html
//!
//! - [uops.info](https://uops.info) provides measured latency, throughput, and
//! port usage for x86 instructions across many Intel and AMD
//! microarchitectures. It is a valuable tool for understanding how a sequence
//! of instructions will execute, and for designing an efficient vectorized
//! algorithm.
//!
//! # Architecture
//!
//!
use Freeze;
// Utility Modules:
use *;
use *;
use *;
// Basic Types:
pub use *;
pub use *;
use *;
// Extensions:
// TODO: Remove
// Utility Imports:
// Intel x86 intrinsic imports for documentation.
use *;
// Intel x86_64 intrinsic imports for documentation.
use *;
/// A SIMD-compatible element.
///
/// # Safety
///
/// A type `T` can soundly implement `Element` if and only if all of the
/// following conditions hold:
///
/// - `T` contains no niches (any bit-pattern forms a valid instance of `T`).
/// - `[T; LEN]` has the same size as `Primitive`.
/// - The alignment of `T` is less than or equal to that of `Primitive`.
/// - `[T; LEN]` can be soundly transmuted to and from `Primitive`.
/// - `T` does not contain interior mutability.
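///
/// As a rough illustration of the size, alignment, and transmute conditions,
/// the sketch below checks a hypothetical element type at compile time. The
/// names `Word`, `Primitive128`, `to_primitive`, and `from_primitive` are
/// assumptions made for this example; `Primitive128` merely stands in for a
/// 128-bit `Primitive` and is not a type from this module:
///
/// ```rust
/// use core::mem::{align_of, size_of, transmute};
///
/// /// Hypothetical element type: a transparent wrapper around `u16`.
/// /// It has no niches, no padding, and no interior mutability.
/// #[derive(Clone, Copy, Debug, PartialEq)]
/// #[repr(transparent)]
/// pub struct Word(pub u16);
///
/// /// Stand-in for a 128-bit `Primitive`: 16 bytes, 16-byte aligned.
/// #[derive(Clone, Copy)]
/// #[repr(C, align(16))]
/// pub struct Primitive128([u8; 16]);
///
/// /// Number of `Word` lanes that fit in one `Primitive128`.
/// const LEN: usize = 8;
///
/// // Compile-time checks of the size and alignment conditions.
/// const _: () = assert!(size_of::<[Word; LEN]>() == size_of::<Primitive128>());
/// const _: () = assert!(align_of::<Word>() <= align_of::<Primitive128>());
///
/// pub fn to_primitive(words: [Word; LEN]) -> Primitive128 {
///     // SAFETY: both types are 16 bytes of plain data, and every bit
///     // pattern is valid for each of them.
///     unsafe { transmute(words) }
/// }
///
/// pub fn from_primitive(primitive: Primitive128) -> [Word; LEN] {
///     // SAFETY: as above; `Word` accepts any 16-bit pattern.
///     unsafe { transmute(primitive) }
/// }
/// ```
///
/// The niche and interior-mutability conditions cannot be checked this way
/// and still have to be argued by the implementor.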
pub unsafe
pub unsafe
unsafe
/// The ability to load/store a vector from/into memory.
pub unsafe