//! Well-typed SIMD intrinsics.
//!
//! This module provides basic redefinitions of the SIMD intrinsics from
//! [`core::arch`]. It has several benefits:
//!
//! 1. Missing instructions are implemented using inline assembly. The compiler
//! cannot optimize inline assembly the way it optimizes other intrinsics, but
//! the trade-off may be worth it if the instruction is useful enough. This is
//! a temporary measure -- [`core::arch`] should eventually provide all such
//! instructions as proper intrinsics.
//!
//! 2. It provides a comprehensive model of Intel's feature-gating conventions,
//! ensuring at compile time that an instruction can only be used once the CPU
//! has been confirmed to support it. Intrinsics for each generation are
//! implemented on a type, e.g. [`sse::Use`], which can only be created using
//! [`RuntimeSupport`] information.
//!
//! 3. Instructions use more appropriate typing. SIMD vectors are represented
//! using [`u8x16`], [`u32x4`], [`u64x8`], etc. Instructions that logically
//! return a boolean return [`bool`] instead of [`i32`]. Where a magic
//! immediate byte is required, it is well-typed.
//!
//! 4. Intrinsics are deduplicated appropriately. Some intrinsics can be
//! expressed in terms of others, and the optimizer is capable of noticing
//! such patterns. In these cases, the additional intrinsics are left out.
//! The exposed API is smaller and more representative of the underlying
//! instruction set.
//!
//! # Usage
//!
//! 1. Construct a [`RuntimeSupport`] using [`RuntimeSupport::detect()`], which
//! queries the running CPU for the SIMD features it supports.
//!
//! 2. Pick a SIMD generation to use: SSE, AVX, or AVX-512. This selects which
//! instructions will be available and how they are encoded in machine code.
//! Each generation has a sub-module here, e.g. [`sse`].
//!
//! 3. Construct a feature set (using [`feature_set`]) of features that you need
//! from this generation. This can be aliased to a named type for
//! convenience.
//!
//! 4. Instantiate the `Use` type for the selected generation with the feature
//! set type from step 3. Call `new()` (e.g. [`sse::Use::new()`]) with the
//! [`RuntimeSupport`]; this checks whether the current CPU actually supports
//! the required features.
//!
//! 5. Use the low-level intrinsic methods on the `Use` type.
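/*!
The steps above can be sketched with a small self-contained model. Every type
in it (`RuntimeSupport`, `FeatureSet`, `Sse2`, `Sse2AndSse41`, `Use`) is a
hypothetical stand-in written for illustration only; it is not this module's
actual definition, and the real constructors, feature sets, and method
signatures may differ. The method bodies are scalar placeholders rather than
real SIMD instructions.

```rust
use std::marker::PhantomData;

// Step 1: which features the running CPU reported (hypothetical stand-in).
#[derive(Clone, Copy, Debug)]
pub struct RuntimeSupport {
    pub sse2: bool,
    pub sse4_1: bool,
}

impl RuntimeSupport {
    // On x86 targets, ask the standard library's runtime feature detection.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    pub fn detect() -> Self {
        Self {
            sse2: std::is_x86_feature_detected!("sse2"),
            sse4_1: std::is_x86_feature_detected!("sse4.1"),
        }
    }

    // On every other target, report that nothing is supported.
    #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
    pub fn detect() -> Self {
        Self { sse2: false, sse4_1: false }
    }
}

// Step 3: a feature set is a type that can check itself against the CPU.
pub trait FeatureSet {
    fn supported_by(cpu: RuntimeSupport) -> bool;
}

pub struct Sse2;
impl FeatureSet for Sse2 {
    fn supported_by(cpu: RuntimeSupport) -> bool {
        cpu.sse2
    }
}

pub struct Sse2AndSse41;
impl FeatureSet for Sse2AndSse41 {
    fn supported_by(cpu: RuntimeSupport) -> bool {
        cpu.sse2 && cpu.sse4_1
    }
}

// Step 4: a token proving that the feature set `F` was checked at runtime.
// The field is private, so code outside this module can only obtain a `Use`
// through `new`.
pub struct Use<F: FeatureSet>(PhantomData<F>);

impl<F: FeatureSet> Use<F> {
    pub fn new(cpu: RuntimeSupport) -> Option<Self> {
        if F::supported_by(cpu) {
            Some(Use(PhantomData))
        } else {
            None
        }
    }
}

// Step 5: operations are methods on the token, so they are unreachable
// without it. The bodies are scalar placeholders for the real instructions.
impl Use<Sse2> {
    // Lane-wise wrapping addition of four `u32` values.
    pub fn add_u32x4(&self, a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
        std::array::from_fn(|i| a[i].wrapping_add(b[i]))
    }
}

impl Use<Sse2AndSse41> {
    // Lane-wise unsigned minimum of four `u32` values.
    pub fn min_u32x4(&self, a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
        std::array::from_fn(|i| a[i].min(b[i]))
    }
}

fn main() {
    let cpu = RuntimeSupport::detect();
    match Use::<Sse2>::new(cpu) {
        Some(sse) => println!("sum: {:?}", sse.add_u32x4([1, 2, 3, 4], [10, 20, 30, 40])),
        None => println!("SSE2 unavailable; take a scalar fallback path"),
    }
    match Use::<Sse2AndSse41>::new(cpu) {
        Some(sse) => println!("min: {:?}", sse.min_u32x4([5, 1, 9, 0], [3, 2, 9, 7])),
        None => println!("SSE4.1 unavailable; take a fallback path"),
    }
}
```

In this model, calling `min_u32x4` on a `Use<Sse2>` is a compile error, and a
`Use<Sse2AndSse41>` cannot be obtained on a CPU that lacks SSE4.1. This is the
kind of guarantee described in benefit 2 above.
*/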
// Utility Modules:
pub use Vector;
pub use crateintel_vector as vector;
pub use Imm;
pub use crateintel_imm as imm;
pub use RuntimeSupport;
pub use crateintel_features as feature_set;
// Utility Imports:
// Intel x86 intrinsic imports for documentation.
use *;
// Intel x86_64 intrinsic imports for documentation.
use *;
use *;
use *;
use *;
// Extension Modules:
pub use Use as SSE;
pub use Use as AVX;