Crate safe_arch
A crate that safely exposes arch intrinsics via cfg.
This crate lets you safely use CPU intrinsics. Those things in core::arch.
- Most of them are 100% safe to use as long as the CPU feature is available, like addition and multiplication and stuff.
- Some of them require that you uphold extra alignment requirements or whatever, which we do via the type system when necessary.
- Some of them are absolutely not safe at all because they cause UB at the LLVM level, so those things are not exposed here.
- Some of them are pointless to expose here because the core crate already provides the same functionality in a cross-platform way, so we skip those.
- This crate works purely via cfg and compile time feature selection; there are no runtime checks added. This means that if you do want to do runtime feature detection and then dynamically call an intrinsic if it happens to be available, then this crate sadly isn't for you.
- This crate aims to be as minimal as possible. It just exposes each intrinsic as a safe function with an easier to understand name and some minimal docs. Building higher level abstractions on top of the intrinsics is the domain of other crates.
- That said, each raw SIMD type is newtype'd as a wrapper (with a pub field) so that better trait impls can be provided.
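The approach described in the list above can be sketched roughly as follows. This is an illustration only, not this crate's actual source: the names M128Sketch, sse_sketch, and add_lanes are hypothetical, and the scalar fallback exists only so that the sketch builds on targets without sse.

```rust
// Hypothetical sketch: a newtype over the raw register type plus safe,
// cfg-gated functions. None of these names come from this crate.
#[cfg(all(target_arch = "x86_64", target_feature = "sse"))]
mod sse_sketch {
    use core::arch::x86_64::{__m128, _mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};

    /// Newtype wrapper with a `pub` field, so trait impls can be attached.
    #[derive(Clone, Copy, Debug)]
    #[repr(transparent)]
    pub struct M128Sketch(pub __m128);

    /// Loads four `f32` lanes; the array reference guarantees valid memory.
    pub fn load_unaligned(a: &[f32; 4]) -> M128Sketch {
        // The cfg on this module means `sse` is enabled for the whole build.
        M128Sketch(unsafe { _mm_loadu_ps(a.as_ptr()) })
    }

    /// Lanewise addition: safe because the feature is known at compile time.
    pub fn add(a: M128Sketch, b: M128Sketch) -> M128Sketch {
        M128Sketch(unsafe { _mm_add_ps(a.0, b.0) })
    }

    /// Stores the four lanes back into a plain array.
    pub fn store_unaligned(v: M128Sketch) -> [f32; 4] {
        let mut out = [0.0_f32; 4];
        unsafe { _mm_storeu_ps(out.as_mut_ptr(), v.0) };
        out
    }
}

/// Uses the wrapper when `sse` is compiled in.
#[cfg(all(target_arch = "x86_64", target_feature = "sse"))]
pub fn add_lanes(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let sum = sse_sketch::add(sse_sketch::load_unaligned(&a), sse_sketch::load_unaligned(&b));
    sse_sketch::store_unaligned(sum)
}

/// Scalar fallback so the sketch still builds on other targets.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse")))]
pub fn add_lanes(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}
```

Note that there is no runtime branch anywhere: which add_lanes exists is decided entirely when the code is compiled.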
Current Support
This grows slowly because there are just so many intrinsics.
- Intel (x86 / x86_64)
  - 128-bit: sse, sse2, sse3, ssse3
Compile Time CPU Target Features
At the time of me writing this, Rust enables the sse and sse2 CPU features
by default for all i686 (x86) and x86_64 builds. Those CPU features are
built into the design of x86_64, and you'd need a super old x86 CPU for it
to not support at least sse and sse2, so they're a safe bet for the language
to enable all the time. In fact, because the standard library is compiled
with them enabled, simply trying to disable those features would actually
cause ABI issues and fill your program with UB (link).
If you want additional CPU features available at compile time you'll have to
enable them with an additional arg to rustc. For a feature named name you
pass -C target-feature=+name, such as -C target-feature=+sse3 for sse3.
You can alternately enable all target features of the current CPU with
-C target-cpu=native. This is primarily of use if you're building a program
you'll only run on your own system.
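To see which features a given set of rustc args actually turned on, a small helper along these lines can report what was compiled in. It is only a sketch: the function name compiled_in_features is made up for illustration, and it checks just the 128-bit features discussed on this page.

```rust
/// Lists which of a few x86 SIMD target features this build was compiled with.
/// Hypothetical helper for illustration; it is not part of this crate.
pub fn compiled_in_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    // Each `cfg!` resolves to a constant `true` or `false` at compile time.
    if cfg!(target_feature = "sse") {
        found.push("sse");
    }
    if cfg!(target_feature = "sse2") {
        found.push("sse2");
    }
    if cfg!(target_feature = "sse3") {
        found.push("sse3");
    }
    if cfg!(target_feature = "ssse3") {
        found.push("ssse3");
    }
    if cfg!(target_feature = "sse4.1") {
        found.push("sse4.1");
    }
    if cfg!(target_feature = "sse4.2") {
        found.push("sse4.2");
    }
    found
}
```

Printing the result from a default x86_64 build should show sse and sse2, and rebuilding with -C target-feature=+sse3 should add sse3 to the list. When building through cargo, the same args can be passed via the RUSTFLAGS environment variable.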
It's sometimes hard to know if your target platform will support a given
feature set, but the Steam Hardware Survey is generally taken as a guide to
what you can expect people to have available. If you click "Other Settings"
it'll expand into a list of CPU target features and how common they are.
These days, it seems that sse3 can be safely assumed, and ssse3, sse4.1, and
sse4.2 are pretty safe bets as well. The stuff above 128-bit isn't as common
yet; give it another few years.
Please note that executing a program on a CPU that doesn't support the target features it was compiled for is Undefined Behavior.
Currently, Rust doesn't actually support an easy way for you to check that a
feature enabled at compile time is actually available at runtime. There is
the "feature_detected" family of macros, but if you enable a feature at
compile time they will evaluate to a constant true instead of actually
deferring the check for the feature to runtime. This means that, if you did
want a check at the start of your program, to confirm that all the assumed
features are present and error out when the assumptions don't hold, you
can't use that macro. You gotta use CPUID and check manually. rip.
Hopefully we can make that process easier in a future version of this crate.
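A manual CPUID check might look something like the sketch below. Treat it as an assumption-laden illustration rather than anything this crate provides: RuntimeSseSupport, query_runtime_sse_support, and assumed_features_present are hypothetical names, only x86_64 is handled (other targets, including 32-bit x86, report everything as absent for brevity), and only CPUID leaf 1 is consulted.

```rust
/// Which SSE-family features the running CPU reports. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RuntimeSseSupport {
    pub sse: bool,
    pub sse2: bool,
    pub sse3: bool,
    pub ssse3: bool,
    pub sse4_1: bool,
    pub sse4_2: bool,
}

/// Reads CPUID leaf 1 and decodes the SSE-family feature bits.
#[cfg(target_arch = "x86_64")]
pub fn query_runtime_sse_support() -> RuntimeSseSupport {
    use core::arch::x86_64::__cpuid;
    // CPUID is always available on x86_64. The `allow` covers toolchains
    // where `__cpuid` no longer needs an `unsafe` block.
    #[allow(unused_unsafe)]
    let leaf1 = unsafe { __cpuid(1) };
    let bit = |reg: u32, n: u32| ((reg >> n) & 1) == 1;
    RuntimeSseSupport {
        sse: bit(leaf1.edx, 25),
        sse2: bit(leaf1.edx, 26),
        sse3: bit(leaf1.ecx, 0),
        ssse3: bit(leaf1.ecx, 9),
        sse4_1: bit(leaf1.ecx, 19),
        sse4_2: bit(leaf1.ecx, 20),
    }
}

/// Fallback for every other target: report nothing as present.
#[cfg(not(target_arch = "x86_64"))]
pub fn query_runtime_sse_support() -> RuntimeSseSupport {
    RuntimeSseSupport {
        sse: false,
        sse2: false,
        sse3: false,
        ssse3: false,
        sse4_1: false,
        sse4_2: false,
    }
}

/// True when every feature beyond the baseline that was assumed at compile
/// time is also reported by the CPU at runtime.
pub fn assumed_features_present() -> bool {
    let rt = query_runtime_sse_support();
    (!cfg!(target_feature = "sse3") || rt.sse3)
        && (!cfg!(target_feature = "ssse3") || rt.ssse3)
        && (!cfg!(target_feature = "sse4.1") || rt.sse4_1)
        && (!cfg!(target_feature = "sse4.2") || rt.sse4_2)
}
```

A program could call assumed_features_present() first thing in main and exit with an error message when it returns false.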
A Note On Working With Cfg
There are two main ways to use cfg:
- Via an attribute placed on an item, block, or expression:
#[cfg(debug_assertions)] println!("hello");
- Via a macro used within an expression position:
if cfg!(debug_assertions) { println!("hello"); }
The difference might seem small but it's actually very important:
- The attribute form decides whether to include the code before the compiler checks that the items it names really exist. This means that code that is configured via attribute can safely name things that don't always exist, as long as the things it names do exist whenever that code is configured into the build.
- The macro form will include the configured code no matter what, and then the macro resolves to a constant true or false and the compiler uses dead code elimination to cut out the path not taken. Both paths therefore have to compile in every configuration.
This crate uses cfg via the attribute, so the functions it exposes don't
exist at all when the appropriate CPU target features aren't enabled.
Accordingly, if you plan to call this crate or not depending on what
features are enabled in the build you'll also need to control your use of
this crate via cfg attribute, not cfg macro.
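The difference between the two forms can be seen in a small sketch. The function names here are hypothetical, and sse3_only_path stands in for any function that only exists when a target feature is enabled (as this crate's functions do).

```rust
/// Stand-in for a function that only exists when `sse3` is compiled in.
#[cfg(target_feature = "sse3")]
fn sse3_only_path() -> &'static str {
    "sse3"
}

/// Attribute form: the call to `sse3_only_path` is removed from the build
/// entirely when `sse3` is off, so naming a missing function is fine.
pub fn pick_path() -> &'static str {
    #[cfg(target_feature = "sse3")]
    let chosen = sse3_only_path();
    #[cfg(not(target_feature = "sse3"))]
    let chosen = "fallback";
    chosen
}

/// Macro form: both branches are always compiled, so neither branch may
/// name `sse3_only_path`. Writing
/// `if cfg!(target_feature = "sse3") { sse3_only_path() } else { "fallback" }`
/// would fail to build whenever `sse3` is disabled.
pub fn pick_path_via_macro() -> &'static str {
    if cfg!(target_feature = "sse3") {
        "sse3"
    } else {
        "fallback"
    }
}
```

Both functions return the same answer for any given build; only the attribute form is allowed to mention the feature-gated function.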
Re-exports
pub use intel::*; |
Modules
intel | Types and functions for safe |
Macros
byte_shift_left_u128_immediate_m128i | Shifts all bits in the entire register left by a number of bytes. |
byte_shift_right_u128_immediate_m128i | Shifts all bits in the entire register right by a number of bytes. |
extract_i16_as_i32_m128i | Gets an |
insert_i16_from_i32_m128i | Inserts the low 16 bits of an |
shift_left_i16_immediate_m128i | Shifts all |
shift_left_i32_immediate_m128i | Shifts all |
shift_left_i64_immediate_m128i | Shifts both |
shift_right_i16_immediate_m128i | Shifts all |
shift_right_i32_immediate_m128i | Shifts all |
shift_right_u16_immediate_m128i | Shifts all |
shift_right_u32_immediate_m128i | Shifts all |
shift_right_u64_immediate_m128i | Shifts both |
shuffle_i16_high_lanes_m128i | Shuffles the higher |
shuffle_i16_low_lanes_m128i | Shuffles the lower |
shuffle_i32_m128i | Shuffles the |
shuffle_m128 | Shuffles the lanes around. |
shuffle_m128d | Shuffles the lanes around. |