A crate that safely exposes arch intrinsics via #[cfg()].
safe_arch lets you safely use CPU intrinsics, the things in the
core::arch modules. It works purely via #[cfg()] and
compile time CPU feature declaration. If you want to check for a feature at
runtime and then call an intrinsic or use a fallback path based on that then
this crate is sadly not for you.
SIMD register types are “newtype’d” so that better trait impls can be given
to them, but the inner value is a pub field so feel free to just grab it
out if you need to. Trait impls of the newtypes include: Default (zeroed),
From/Into of appropriate data types, and appropriate operator
overloading.
- Most intrinsics (like addition and multiplication) are totally safe to use as long as the CPU feature is available. In this case, what you get is 1:1 with the actual intrinsic.
- Some intrinsics take a pointer of an assumed minimum alignment and validity span. For these, the safe_arch function takes a reference of an appropriate type to uphold safety.
  - Try the bytemuck crate (and turn on the bytemuck feature of this crate) if you want help safely casting between reference types.
- Some intrinsics are not safe unless you’re very careful about how you use them, such as the streaming operations requiring you to use them in combination with an appropriate memory fence. Those operations aren’t exposed here.
- Some intrinsics mess with the processor state, such as changing the floating point flags, saving and loading special register state, and so on. LLVM doesn’t really support you messing with that within a high level language, so those operations aren’t exposed here. Use assembly or something if you want to do that.
§Naming Conventions
The safe_arch crate does not simply use the “official” names for each
intrinsic, because the official names are generally poor. Instead, the
operations have been given better names that hopefully make things easier
to understand when you’re reading the code.
For a full explanation of the naming used, see the Naming Conventions page.
§Current Support
- x86 / x86_64 (Intel, AMD, etc)
  - 128-bit: sse, sse2, sse3, ssse3, sse4.1, sse4.2
  - 256-bit: avx, avx2
  - Other: adx, aes, bmi1, bmi2, fma, lzcnt, pclmulqdq, popcnt, rdrand, rdseed
§Compile Time CPU Target Features
At the time of me writing this, Rust enables the sse and sse2 CPU
features by default for all i686 (x86) and x86_64 builds. Those CPU
features are built into the design of x86_64, and you’d need a super old
x86 CPU for it to not support at least sse and sse2, so they’re a safe
bet for the language to enable all the time. In fact, because the standard
library is compiled with them enabled, simply trying to disable those
features would actually cause ABI issues and fill your program with UB
(link).
If you want additional CPU features available at compile time you’ll have to
enable them with an additional arg to rustc. For a feature named name
you pass -C target-feature=+name, such as -C target-feature=+sse3 for
sse3.
You can alternately enable all target features of the current CPU with -C target-cpu=native. This is primarily of use if you’re building a program
you’ll only run on your own system.
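As a rough illustration of what compile time feature declaration looks like from inside a program, the sketch below reports which of a few x86 features the current build was compiled with. The helper name compiled_in_features and the candidate list are made up for this example; only cfg! and target_feature come from Rust itself.

```rust
/// Returns the names of the candidate x86 target features that were
/// enabled at compile time for this build (hypothetical helper).
fn compiled_in_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    // `cfg!` needs a literal feature name, so each candidate is spelled out.
    if cfg!(target_feature = "sse2") {
        found.push("sse2");
    }
    if cfg!(target_feature = "sse3") {
        found.push("sse3");
    }
    if cfg!(target_feature = "ssse3") {
        found.push("ssse3");
    }
    if cfg!(target_feature = "sse4.1") {
        found.push("sse4.1");
    }
    if cfg!(target_feature = "sse4.2") {
        found.push("sse4.2");
    }
    if cfg!(target_feature = "avx") {
        found.push("avx");
    }
    if cfg!(target_feature = "avx2") {
        found.push("avx2");
    }
    found
}
```

Printing the result of compiled_in_features() from a default x86_64 build and again from a build with -C target-cpu=native is a quick way to see what the flags changed.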
It’s sometimes hard to know if your target platform will support a given
feature set, but the Steam Hardware Survey is generally
taken as a guide to what you can expect people to have available. If you
click “Other Settings” it’ll expand into a list of CPU target features and
how common they are. These days, it seems that sse3 can be safely assumed,
and ssse3, sse4.1, and sse4.2 are pretty safe bets as well. The stuff
above 128-bit isn’t as common yet, give it another few years.
Please note that executing a program on a CPU that doesn’t support the target features it was compiled for is Undefined Behavior.
Currently, Rust doesn’t actually support an easy way for you to check that a
feature enabled at compile time is actually available at runtime. There is
the “feature_detected” family of macros, but if you
enable a feature they will evaluate to a constant true instead of actually
deferring the check for the feature to runtime. This means that, if you
did want a check at the start of your program, to confirm that all the
assumed features are present and error out when the assumptions don’t hold,
you can’t use that macro. You gotta use CPUID and check manually. rip.
Hopefully we can make that process easier in a future version of this crate.
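Until then, a manual check could look roughly like the sketch below. It is not part of this crate: the function names are hypothetical, only x86_64 is handled, and only the 128-bit features reported in the ECX output of CPUID leaf 1 are covered. Checking avx and above would additionally need an OS support check, which is left out here.

```rust
// CPUID leaf 1, ECX register: bit positions of the 128-bit feature flags.
const ECX_SSE3: u32 = 1 << 0;
const ECX_SSSE3: u32 = 1 << 9;
const ECX_SSE4_1: u32 = 1 << 19;
const ECX_SSE4_2: u32 = 1 << 20;

/// Reads the ECX output of CPUID leaf 1 (x86_64 builds only).
#[cfg(target_arch = "x86_64")]
fn cpuid_leaf1_ecx() -> Option<u32> {
    // Older toolchains declare `__cpuid` as `unsafe`; newer ones may not.
    #[allow(unused_unsafe)]
    let leaf = unsafe { core::arch::x86_64::__cpuid(1) };
    Some(leaf.ecx)
}

/// Fallback for every other architecture: nothing to read.
#[cfg(not(target_arch = "x86_64"))]
fn cpuid_leaf1_ecx() -> Option<u32> {
    None
}

/// Lists features that were enabled at compile time but are absent from `ecx`.
fn missing_features(ecx: u32) -> Vec<&'static str> {
    let mut missing = Vec::new();
    if cfg!(target_feature = "sse3") && (ecx & ECX_SSE3) == 0 {
        missing.push("sse3");
    }
    if cfg!(target_feature = "ssse3") && (ecx & ECX_SSSE3) == 0 {
        missing.push("ssse3");
    }
    if cfg!(target_feature = "sse4.1") && (ecx & ECX_SSE4_1) == 0 {
        missing.push("sse4.1");
    }
    if cfg!(target_feature = "sse4.2") && (ecx & ECX_SSE4_2) == 0 {
        missing.push("sse4.2");
    }
    missing
}

/// Startup check: `Err` names the assumed features the CPU does not report.
fn check_cpu() -> Result<(), Vec<&'static str>> {
    match cpuid_leaf1_ecx() {
        Some(ecx) => {
            let missing = missing_features(ecx);
            if missing.is_empty() {
                Ok(())
            } else {
                Err(missing)
            }
        }
        // Not x86_64: nothing this sketch knows how to check.
        None => Ok(()),
    }
}
```

Calling check_cpu() first thing in main and exiting on Err keeps the check ahead of any code that relies on the assumed features.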
§A Note On Working With Cfg
There are two main ways to use cfg:
- Via an attribute placed on an item, block, or expression:
#[cfg(debug_assertions)] println!("hello");
- Via a macro used within an expression position:
if cfg!(debug_assertions) { println!("hello"); }
The difference might seem small but it’s actually very important:
- The attribute form will include code or not before deciding if all the items named and so forth really exist or not. This means that code that is configured via attribute can safely name things that don’t always exist as long as the things they name do exist whenever that code is configured into the build.
- The macro form will include the configured code no matter what, and then the macro resolves to a constant true or false and the compiler uses dead code elimination to cut out the path not taken.
This crate uses cfg via the attribute, so the functions it exposes don’t
exist at all when the appropriate CPU target features aren’t enabled.
Accordingly, if you plan to call this crate or not depending on what
features are enabled in the build you’ll also need to control your use of
this crate via cfg attribute, not cfg macro.
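A small sketch of that difference, using only the standard library: the wide module below is a hypothetical stand-in for functions that exist only when avx2 is enabled, the way this crate's functions do.

```rust
/// Stand-in for an API that only exists when `avx2` is enabled at compile time.
#[cfg(target_feature = "avx2")]
mod wide {
    pub fn sum(xs: &[i32]) -> i32 {
        xs.iter().sum()
    }
}

/// Attribute form: this definition is removed from the build before names are
/// resolved, so it may refer to `wide::sum` even though that is not always there.
#[cfg(target_feature = "avx2")]
fn sum(xs: &[i32]) -> i32 {
    wide::sum(xs)
}

/// Fallback definition used whenever `avx2` is not enabled.
#[cfg(not(target_feature = "avx2"))]
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Macro form: the function below would fail to compile without `avx2`,
// because both branches are checked before the dead one is removed and
// `wide::sum` does not exist in that build.
//
// fn sum_broken(xs: &[i32]) -> i32 {
//     if cfg!(target_feature = "avx2") { wide::sum(xs) } else { xs.iter().sum() }
// }
```

Either way the caller just writes sum(&data); which body it gets is decided entirely at compile time.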
Modules§
- naming_conventions - An explanation of the crate’s naming conventions.
Macros§
- cmp_op avx - Turns a comparison operator token to the correct constant value.
- round_op avx - Turns a round operator token to the correct constant value.
Structs§
- m128 - The data for a 128-bit SSE register of four f32 lanes.
- m256 - The data for a 256-bit AVX register of eight f32 lanes.
- m512 - The data for a 512-bit AVX-512 register of sixteen f32 lanes.
- m128d - The data for a 128-bit SSE register of two f64 values.
- m128i - The data for a 128-bit SSE register of integer data.
- m256d - The data for a 256-bit AVX register of four f64 values.
- m256i - The data for a 256-bit AVX register of integer data.
- m512d - The data for a 512-bit AVX-512 register of eight f64 values.
- m512i - The data for a 512-bit AVX-512 register of integer data.
Constants§
- STR_CMP_BIT_MASK - Return the bitwise mask of matches.
- STR_CMP_EQ_ANY - Matches when any haystack character equals any needle character, regardless of position.
- STR_CMP_EQ_EACH - Matches when a character position in the needle is equal to the character at the same position in the haystack.
- STR_CMP_EQ_ORDERED - Matches when the complete needle string is a substring somewhere in the haystack.
- STR_CMP_FIRST_MATCH - Return the index of the first match found.
- STR_CMP_I8 - string segment elements are i8 values
- STR_CMP_I16 - string segment elements are i16 values
- STR_CMP_LAST_MATCH - Return the index of the last match found.
- STR_CMP_RANGES - Interprets consecutive pairs of characters in the needle as (low..=high) ranges to compare each haystack character to.
- STR_CMP_U8 - string segment elements are u8 values
- STR_CMP_U16 - string segment elements are u16 values
- STR_CMP_UNIT_MASK - Return the lanewise mask of matches.
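To make the matching modes above a little more concrete, here is a plain scalar model of the "equal any" and "ranges" comparisons with a bitwise mask result, over byte elements. It is only a sketch of the semantics as described in this list: the function names are invented for the example, it does not call any intrinsic, and it uses Option for the first/last index instead of whatever encoding the hardware uses for "no match".

```rust
/// Scalar model of "equal any": bit `i` of the result is set when
/// `haystack[i]` equals any byte of `needle`. At most 16 bytes are examined,
/// matching the number of u8 elements in a 128-bit register.
fn eq_any_bit_mask(needle: &[u8], haystack: &[u8]) -> u32 {
    let mut mask = 0u32;
    for (i, h) in haystack.iter().take(16).enumerate() {
        if needle.contains(h) {
            mask |= 1 << i;
        }
    }
    mask
}

/// Scalar model of "ranges": consecutive needle bytes are read as
/// `(low..=high)` pairs, and bit `i` is set when `haystack[i]` falls
/// inside any of those ranges.
fn ranges_bit_mask(needle: &[u8], haystack: &[u8]) -> u32 {
    let mut mask = 0u32;
    for (i, h) in haystack.iter().take(16).enumerate() {
        if needle.chunks_exact(2).any(|pair| (pair[0]..=pair[1]).contains(h)) {
            mask |= 1 << i;
        }
    }
    mask
}

/// Index of the first (lowest) match in a bit mask, if any.
fn first_match(mask: u32) -> Option<u32> {
    if mask == 0 {
        None
    } else {
        Some(mask.trailing_zeros())
    }
}

/// Index of the last (highest) match in a bit mask, if any.
fn last_match(mask: u32) -> Option<u32> {
    if mask == 0 {
        None
    } else {
        Some(31 - mask.leading_zeros())
    }
}
```

For instance, with the needle "ae" and the haystack "banana", the equal-any mask has bits 1, 3, and 5 set, so the first match is index 1 and the last match is index 5.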
Functions§
- abs_i8_m128i ssse3 - Lanewise absolute value with lanes as i8.
- abs_i8_m256i avx2 - Absolute value of i8 lanes.
- abs_i16_m128i ssse3 - Lanewise absolute value with lanes as i16.
- abs_i16_m256i avx2 - Absolute value of i16 lanes.
- abs_i32_m128i ssse3 - Lanewise absolute value with lanes as i32.
- abs_i32_m256i avx2 - Absolute value of i32 lanes.
- add_carry_u32 adx - Add two u32 with a carry value.
- add_carry_u64 adx - Add two u64 with a carry value.
- add_horizontal_i16_m128i ssse3 - Add horizontal pairs of i16 values, pack the outputs as a then b.
- add_horizontal_i16_m256i avx2 - Horizontal a + b with lanes as i16.
- add_horizontal_i32_m128i ssse3 - Add horizontal pairs of i32 values, pack the outputs as a then b.
- add_horizontal_i32_m256i avx2 - Horizontal a + b with lanes as i32.
- add_horizontal_m128 sse3 - Add each lane horizontally, pack the outputs as a then b.
- add_horizontal_m256 avx - Add adjacent f32 lanes.
- add_horizontal_m128d sse3 - Add each lane horizontally, pack the outputs as a then b.
- add_horizontal_m256d avx - Add adjacent f64 lanes.
- add_horizontal_saturating_i16_m128i ssse3 - Add horizontal pairs of i16 values, saturating, pack the outputs as a then b.
- add_horizontal_saturating_i16_m256i avx2 - Horizontal saturating a + b with lanes as i16.
- add_i8_m128i sse2 - Lanewise a + b with lanes as i8.
- add_i8_m256i avx2 - Lanewise a + b with lanes as i8.
- add_i16_m128i sse2 - Lanewise a + b with lanes as i16.
- add_i16_m256i avx2 - Lanewise a + b with lanes as i16.
- add_i32_m128i sse2 - Lanewise a + b with lanes as i32.
- add_i32_m256i avx2 - Lanewise a + b with lanes as i32.
- add_i64_m128i sse2 - Lanewise a + b with lanes as i64.
- add_i64_m256i avx2 - Lanewise a + b with lanes as i64.
- add_m128 sse - Lanewise a + b.
- add_m256 avx - Lanewise a + b with f32 lanes.
- add_m128_s sse - Low lane a + b, other lanes unchanged.
- add_m128d sse2 - Lanewise a + b.
- add_m128d_s sse2 - Lowest lane a + b, high lane unchanged.
- add_m256d avx - Lanewise a + b with f64 lanes.
- add_saturating_i8_m128i sse2 - Lanewise saturating a + b with lanes as i8.
- add_saturating_i8_m256i avx2 - Lanewise saturating a + b with lanes as i8.
- add_saturating_i16_m128i sse2 - Lanewise saturating a + b with lanes as i16.
- add_saturating_i16_m256i avx2 - Lanewise saturating a + b with lanes as i16.
- add_saturating_u8_m128i sse2 - Lanewise saturating a + b with lanes as u8.
- add_saturating_u8_m256i avx2 - Lanewise saturating a + b with lanes as u8.
- add_saturating_u16_m128i sse2 - Lanewise saturating a + b with lanes as u16.
- add_saturating_u16_m256i avx2 - Lanewise saturating a + b with lanes as u16.
- addsub_m128 sse3 - Alternately, from the top, add a lane and then subtract a lane.
- addsub_m256 avx - Alternately, from the top, add f32 then sub f32.
- addsub_m128d sse3 - Add the high lane and subtract the low lane.
- addsub_m256d avx - Alternately, from the top, add f64 then sub f64.
- aes_decrypt_last_m128i aes - Perform the last round of an AES decryption flow on a using the round_key.
- aes_decrypt_m128i aes - Perform one round of an AES decryption flow on a using the round_key.
- aes_encrypt_last_m128i aes - Perform the last round of an AES encryption flow on a using the round_key.
- aes_encrypt_m128i aes - Perform one round of an AES encryption flow on a using the round_key.
- aes_inv_mix_columns_m128i aes - Perform the InvMixColumns transform on a.
- aes_key_gen_assist_m128i aes - Assist in expanding an AES cipher key.
- average_u8_m128i sse2 - Lanewise average of the u8 values.
- average_u8_m256i avx2 - Average u8 lanes.
- average_u16_m128i sse2 - Lanewise average of the u16 values.
- average_u16_m256i avx2 - Average u16 lanes.
- bit_extract2_u32 bmi1 - Extract a span of bits from the u32, control value style.
- bit_extract2_u64 bmi1 - Extract a span of bits from the u64, control value style.
- bit_extract_u32 bmi1 - Extract a span of bits from the u32, start and len style.
- bit_extract_u64 bmi1 - Extract a span of bits from the u64, start and len style.
- bit_lowest_set_mask_u32 bmi1 - Gets the mask of all bits up to and including the lowest set bit in a u32.
- bit_lowest_set_mask_u64 bmi1 - Gets the mask of all bits up to and including the lowest set bit in a u64.
- bit_lowest_set_reset_u32 bmi1 - Resets (clears) the lowest set bit.
- bit_lowest_set_reset_u64 bmi1 - Resets (clears) the lowest set bit.
- bit_lowest_set_value_u32 bmi1 - Gets the value of the lowest set bit in a u32.
- bit_lowest_set_value_u64 bmi1 - Gets the value of the lowest set bit in a u64.
- bit_zero_high_index_u32 bmi2 - Zero out all high bits in a u32 starting at the index given.
- bit_zero_high_index_u64 bmi2 - Zero out all high bits in a u64 starting at the index given.
- bitand_m128 sse - Bitwise a & b.
- bitand_m256 avx - Bitwise a & b.
- bitand_m128d sse2 - Bitwise a & b.
- bitand_m128i sse2 - Bitwise a & b.
- bitand_m256d avx - Bitwise a & b.
- bitand_m256i avx2 - Bitwise a & b.
- bitandnot_m128 sse - Bitwise (!a) & b.
- bitandnot_m256 avx - Bitwise (!a) & b.
- bitandnot_m128d sse2 - Bitwise (!a) & b.
- bitandnot_m128i sse2 - Bitwise (!a) & b.
- bitandnot_m256d avx - Bitwise (!a) & b.
- bitandnot_m256i avx2 - Bitwise (!a) & b.
- bitandnot_u32 bmi1 - Bitwise (!a) & b for u32
- bitandnot_u64 bmi1 - Bitwise (!a) & b for u64
- bitor_m128 sse - Bitwise a | b.
- bitor_m256 avx - Bitwise a | b.
- bitor_m128d sse2 - Bitwise a | b.
- bitor_m128i sse2 - Bitwise a | b.
- bitor_m256d avx - Bitwise a | b.
- bitor_m256i avx2 - Bitwise a | b
- bitxor_m128 sse - Bitwise a ^ b.
- bitxor_m256 avx - Bitwise a ^ b.
- bitxor_m128d sse2 - Bitwise a ^ b.
- bitxor_m128i sse2 - Bitwise a ^ b.
- bitxor_m256d avx - Bitwise a ^ b.
- bitxor_m256i avx2 - Bitwise a ^ b.
- blend_imm_i16_m128i sse4.1 - Blends the i16 lanes according to the immediate mask.
- blend_imm_i16_m256i avx2 - Blends the i16 lanes according to the immediate value.
- blend_imm_i32_m128i avx2 - Blends the i32 lanes in a and b into a single value.
- blend_imm_i32_m256i avx2 - Blends the i32 lanes according to the immediate value.
- blend_imm_m128 sse4.1 - Blends the lanes according to the immediate mask.
- blend_imm_m128d sse4.1 - Blends the f64 lanes according to the immediate mask.
- blend_m256 avx - Blends the f32 lanes according to the immediate mask.
- blend_m256d avx - Blends the f64 lanes according to the immediate mask.
- blend_varying_i8_m128i sse4.1 - Blend the i8 lanes according to a runtime varying mask.
- blend_varying_i8_m256i avx2 - Blend i8 lanes according to a runtime varying mask.
- blend_varying_m128 sse4.1 - Blend the lanes according to a runtime varying mask.
- blend_varying_m256 avx - Blend the lanes according to a runtime varying mask.
- blend_varying_m128d sse4.1 - Blend the lanes according to a runtime varying mask.
- blend_varying_m256d avx - Blend the lanes according to a runtime varying mask.
- byte_shl_imm_u128_m128i sse2 - Shifts all bits in the entire register left by a number of bytes.
- byte_shl_imm_u128_m256i avx2 - Shifts each u128 lane left by a number of bytes.
- byte_shr_imm_u128_m128i sse2 - Shifts all bits in the entire register right by a number of bytes.
- byte_shr_imm_u128_m256i avx2 - Shifts each u128 lane right by a number of bytes.
- byte_swap_i32 - Swap the bytes of the given 32-bit value.
- byte_swap_i64 x86-64 - Swap the bytes of the given 64-bit value.
- cast_to_m128_from_m256 avx - Bit-preserving cast to m128 from m256.
- cast_to_m128_from_m128d sse2 - Bit-preserving cast to m128 from m128d
- cast_to_m128_from_m128i sse2 - Bit-preserving cast to m128 from m128i
- cast_to_m128d_from_m128 sse2 - Bit-preserving cast to m128d from m128
- cast_to_m128d_from_m128i sse2 - Bit-preserving cast to m128d from m128i
- cast_to_m128d_from_m256d avx - Bit-preserving cast to m128d from m256d.
- cast_to_m128i_from_m128 sse2 - Bit-preserving cast to m128i from m128
- cast_to_m128i_from_m128d sse2 - Bit-preserving cast to m128i from m128d
- cast_to_m128i_from_m256i avx - Bit-preserving cast to m128i from m256i.
- cast_to_m256_from_m256d avx - Bit-preserving cast to m256 from m256d.
- cast_to_m256_from_m256i avx - Bit-preserving cast to m256 from m256i.
- cast_to_m256d_from_m256 avx - Bit-preserving cast to m256d from m256.
- cast_to_m256d_from_m256i avx - Bit-preserving cast to m256d from m256i.
- cast_to_m256i_from_m256 avx - Bit-preserving cast to m256i from m256.
- cast_to_m256i_from_m256d avx - Bit-preserving cast to m256i from m256d.
- ceil_m128 sse4.1 - Round each lane to a whole number, towards positive infinity.
- ceil_m256 avx - Round f32 lanes towards positive infinity.
- ceil_m128_s sse4.1 - Round the low lane of b toward positive infinity, other lanes a.
- ceil_m128d sse4.1 - Round each lane to a whole number, towards positive infinity.
- ceil_m128d_s sse4.1 - Round the low lane of b toward positive infinity, high lane is a.
- ceil_m256d avx - Round f64 lanes towards positive infinity.
- cmp_eq_i32_m128_s sse - Low lane equality.
- cmp_eq_i32_m128d_s sse2 - Low lane f64 equal to.
- cmp_eq_mask_i8_m128i sse2 - Lanewise a == b with lanes as i8.
- cmp_eq_mask_i8_m256i avx2 - Compare i8 lanes for equality, mask output.
- cmp_eq_mask_i16_m128i sse2 - Lanewise a == b with lanes as i16.
- cmp_eq_mask_i16_m256i avx2 - Compare i16 lanes for equality, mask output.
- cmp_eq_mask_i32_m128i sse2 - Lanewise a == b with lanes as i32.
- cmp_eq_mask_i32_m256i avx2 - Compare i32 lanes for equality, mask output.
- cmp_eq_mask_i64_m128i sse4.1 - Lanewise a == b with lanes as i64.
- cmp_eq_mask_i64_m256i avx2 - Compare i64 lanes for equality, mask output.
- cmp_eq_mask_m128 sse - Lanewise a == b.
- cmp_eq_mask_m128_s sse - Low lane a == b, other lanes unchanged.
- cmp_eq_mask_m128d sse2 - Lanewise a == b, mask output.
- cmp_eq_mask_m128d_s sse2 - Low lane a == b, other lanes unchanged.
- cmp_ge_i32_m128_s sse - Low lane greater than or equal to.
- cmp_ge_i32_m128d_s sse2 - Low lane f64 greater than or equal to.
- cmp_ge_mask_m128 sse - Lanewise a >= b.
- cmp_ge_mask_m128_s sse - Low lane a >= b, other lanes unchanged.
- cmp_ge_mask_m128d sse2 - Lanewise a >= b.
- cmp_ge_mask_m128d_s sse2 - Low lane a >= b, other lanes unchanged.
- cmp_gt_i32_m128_s sse - Low lane greater than.
- cmp_gt_i32_m128d_s sse2 - Low lane f64 greater than.
- cmp_gt_mask_i8_m128i sse2 - Lanewise a > b with lanes as i8.
- cmp_gt_mask_i8_m256i avx2 - Compare i8 lanes for a > b, mask output.
- cmp_gt_mask_i16_m128i sse2 - Lanewise a > b with lanes as i16.
- cmp_gt_mask_i16_m256i avx2 - Compare i16 lanes for a > b, mask output.
- cmp_gt_mask_i32_m128i sse2 - Lanewise a > b with lanes as i32.
- cmp_gt_mask_i32_m256i avx2 - Compare i32 lanes for a > b, mask output.
- cmp_gt_mask_i64_m128i sse4.2 - Lanewise a > b with lanes as i64.
- cmp_gt_mask_i64_m256i avx2 - Compare i64 lanes for a > b, mask output.
- cmp_gt_mask_m128 sse - Lanewise a > b.
- cmp_gt_mask_m128_s sse - Low lane a > b, other lanes unchanged.
- cmp_gt_mask_m128d sse2 - Lanewise a > b.
- cmp_gt_mask_m128d_s sse2 - Low lane a > b, other lanes unchanged.
- cmp_le_i32_m128_s sse - Low lane less than or equal to.
- cmp_le_i32_m128d_s sse2 - Low lane f64 less than or equal to.
- cmp_le_mask_m128 sse - Lanewise a <= b.
- cmp_le_mask_m128_s sse - Low lane a <= b, other lanes unchanged.
- cmp_le_mask_m128d sse2 - Lanewise a <= b.
- cmp_le_mask_m128d_s sse2 - Low lane a <= b, other lanes unchanged.
- cmp_lt_i32_m128_s sse - Low lane less than.
- cmp_lt_i32_m128d_s sse2 - Low lane f64 less than.
- cmp_lt_mask_i8_m128i sse2 - Lanewise a < b with lanes as i8.
- cmp_lt_mask_i16_m128i sse2 - Lanewise a < b with lanes as i16.
- cmp_lt_mask_i32_m128i sse2 - Lanewise a < b with lanes as i32.
- cmp_lt_mask_m128 sse - Lanewise a < b.
- cmp_lt_mask_m128_s sse - Low lane a < b, other lanes unchanged.
- cmp_lt_mask_m128d sse2 - Lanewise a < b.
- cmp_lt_mask_m128d_s sse2 - Low lane a < b, other lane unchanged.
- cmp_neq_i32_m128_s sse - Low lane not equal to.
- cmp_neq_i32_m128d_s sse2 - Low lane f64 not equal to.
- cmp_neq_mask_m128 sse - Lanewise a != b.
- cmp_neq_mask_m128_s sse - Low lane a != b, other lanes unchanged.
- cmp_neq_mask_m128d sse2 - Lanewise a != b.
- cmp_neq_mask_m128d_s sse2 - Low lane a != b, other lane unchanged.
- cmp_nge_mask_m128 sse - Lanewise !(a >= b).
- cmp_nge_mask_m128_s sse - Low lane !(a >= b), other lanes unchanged.
- cmp_nge_mask_m128d sse2 - Lanewise !(a >= b).
- cmp_nge_mask_m128d_s sse2 - Low lane !(a >= b), other lane unchanged.
- cmp_ngt_mask_m128 sse - Lanewise !(a > b).
- cmp_ngt_mask_m128_s sse - Low lane !(a > b), other lanes unchanged.
- cmp_ngt_mask_m128d sse2 - Lanewise !(a > b).
- cmp_ngt_mask_m128d_s sse2 - Low lane !(a > b), other lane unchanged.
- cmp_nle_mask_m128 sse - Lanewise !(a <= b).
- cmp_nle_mask_m128_s sse - Low lane !(a <= b), other lanes unchanged.
- cmp_nle_mask_m128d sse2 - Lanewise !(a <= b).
- cmp_nle_mask_m128d_s sse2 - Low lane !(a <= b), other lane unchanged.
- cmp_nlt_mask_m128 sse - Lanewise !(a < b).
- cmp_nlt_mask_m128_s sse - Low lane !(a < b), other lanes unchanged.
- cmp_nlt_mask_m128d sse2 - Lanewise !(a < b).
- cmp_nlt_mask_m128d_s sse2 - Low lane !(a < b), other lane unchanged.
- cmp_op_mask_m128 avx - Compare f32 lanes according to the operation specified, mask output.
- cmp_op_mask_m256 avx - Compare f32 lanes according to the operation specified, mask output.
- cmp_op_mask_m128_s avx - Compare f32 lanes according to the operation specified, mask output.
- cmp_op_mask_m128d avx - Compare f64 lanes according to the operation specified, mask output.
- cmp_op_mask_m128d_s avx - Compare f64 lanes according to the operation specified, mask output.
- cmp_op_mask_m256d avx - Compare f64 lanes according to the operation specified, mask output.
- cmp_ordered_mask_m128 sse - Lanewise (!a.is_nan()) & (!b.is_nan()).
- cmp_ordered_mask_m128_s sse - Low lane (!a.is_nan()) & (!b.is_nan()), other lanes unchanged.
- cmp_ordered_mask_m128d sse2 - Lanewise (!a.is_nan()) & (!b.is_nan()).
- cmp_ordered_mask_m128d_s sse2 - Low lane (!a.is_nan()) & (!b.is_nan()), other lane unchanged.
- cmp_unord_mask_m128 sse - Lanewise a.is_nan() | b.is_nan().
- cmp_unord_mask_m128_s sse - Low lane a.is_nan() | b.is_nan(), other lanes unchanged.
- cmp_unord_mask_m128d sse2 - Lanewise a.is_nan() | b.is_nan().
- cmp_unord_mask_m128d_s sse2 - Low lane a.is_nan() | b.is_nan(), other lane unchanged.
- combined_byte_shr_imm_m128i ssse3 - Counts $a as the high bytes and $b as the low bytes then performs a byte shift to the right by the immediate value.
- combined_byte_shr_imm_m256i avx2 - Works like combined_byte_shr_imm_m128i, but twice as wide.
- convert_i32_replace_m128_s sse - Convert i32 to f32 and replace the low lane of the input.
- convert_i32_replace_m128d_s sse2 - Convert i32 to f64 and replace the low lane of the input.
- convert_i64_replace_m128_s sse - Convert i64 to f32 and replace the low lane of the input.
- convert_i64_replace_m128d_s sse2 - Convert i64 to f64 and replace the low lane of the input.
- convert_m128_s_replace_m128d_s sse2 - Converts the lower f32 to f64 and replace the low lane of the input
- convert_m128d_s_replace_m128_s sse2 - Converts the low f64 to f32 and replaces the low lane of the input.
- convert_to_f32_from_m256_s avx - Convert the lowest f32 lane to a single f32.
- convert_to_f64_from_m256d_s avx - Convert the lowest f64 lane to a single f64.
- convert_to_i16_m128i_from_lower2_i16_m128i sse4.1 - Convert the lower two i64 lanes to two i32 lanes.
- convert_to_i16_m128i_from_lower8_i8_m128i sse4.1 - Convert the lower eight i8 lanes to eight i16 lanes.
- convert_to_i16_m256i_from_i8_m128i avx2 - Convert i8 values to i16 values.
- convert_to_i16_m256i_from_lower4_u8_m128i avx2 - Convert lower 4 u8 values to i16 values.
- convert_to_i16_m256i_from_lower8_u8_m128i avx2 - Convert lower 8 u8 values to i16 values.
- convert_to_i16_m256i_from_u8_m128i avx2 - Convert u8 values to i16 values.
- convert_to_i32_from_m256i_s avx - Convert the lowest i32 lane to a single i32.
- convert_to_i32_m128i_from_lower4_i8_m128i sse4.1 - Convert the lower four i8 lanes to four i32 lanes.
- convert_to_i32_m128i_from_lower4_i16_m128i sse4.1 - Convert the lower four i16 lanes to four i32 lanes.
- convert_to_i32_m128i_from_m128 sse2 - Rounds the f32 lanes to i32 lanes.
- convert_to_i32_m128i_from_m128d sse2 - Rounds the two f64 lanes to the low two i32 lanes.
- convert_to_i32_m128i_from_m256d avx - Convert f64 lanes to be i32 lanes.
- convert_to_i32_m256i_from_i16_m128i avx2 - Convert i16 values to i32 values.
- convert_to_i32_m256i_from_lower8_i8_m128i avx2 - Convert the lower 8 i8 values to i32 values.
- convert_to_i32_m256i_from_m256 avx - Convert f32 lanes to be i32 lanes.
- convert_to_i32_m256i_from_u16_m128i avx2 - Convert u16 values to i32 values.
- convert_to_i64_m128i_from_lower2_i8_m128i sse4.1 - Convert the lower two i8 lanes to two i64 lanes.
- convert_to_i64_m128i_from_lower2_i32_m128i sse4.1 - Convert the lower two i32 lanes to two i64 lanes.
- convert_to_i64_m256i_from_i32_m128i avx2 - Convert i32 values to i64 values.
- convert_to_i64_m256i_from_lower4_i8_m128i avx2 - Convert the lower 4 i8 values to i64 values.
- convert_to_i64_m256i_from_lower4_i16_m128i avx2 - Convert i16 values to i64 values.
- convert_to_i64_m256i_from_lower4_u16_m128i avx2 - Convert u16 values to i64 values.
- convert_to_i64_m256i_from_u32_m128i avx2 - Convert u32 values to i64 values.
- convert_to_m128_from_i32_m128i sse2 - Rounds the four i32 lanes to four f32 lanes.
- convert_to_m128_from_m128d sse2 - Rounds the two f64 lanes to the low two f32 lanes.
- convert_to_m128_from_m256d avx - Convert f64 lanes to be f32 lanes.
- convert_to_m128d_from_lower2_i32_m128i sse2 - Rounds the lower two i32 lanes to two f64 lanes.
- convert_to_m128d_from_lower2_m128 sse2 - Converts the lower two f32 lanes to two f64 lanes.
- convert_to_m256_from_i32_m256i avx - Convert i32 lanes to be f32 lanes.
- convert_to_m256d_from_i32_m128i avx - Convert i32 lanes to be f64 lanes.
- convert_to_m256d_from_m128 avx - Convert f32 lanes to be f64 lanes.
- convert_to_u16_m128i_from_lower8_u8_m128i sse4.1 - Convert the lower eight u8 lanes to eight u16 lanes.
- convert_to_u32_m128i_from_lower4_u8_m128i sse4.1 - Convert the lower four u8 lanes to four u32 lanes.
- convert_to_u32_m128i_from_lower4_u16_m128i sse4.1 - Convert the lower four u16 lanes to four u32 lanes.
- convert_to_u64_m128i_from_lower2_u8_m128i sse4.1 - Convert the lower two u8 lanes to two u64 lanes.
- convert_to_u64_m128i_from_lower2_u16_m128i sse4.1 - Convert the lower two u16 lanes to two u64 lanes.
- convert_to_u64_m128i_from_lower2_u32_m128i sse4.1 - Convert the lower two u32 lanes to two u64 lanes.
- convert_truncate_to_i32_m128i_from_m256d avx - Convert f64 lanes to i32 lanes with truncation.
- convert_truncate_to_i32_m256i_from_m256 avx - Convert f32 lanes to i32 lanes with truncation.
- copy_i64_m128i_s sse2 - Copy the low i64 lane to a new register, upper bits 0.
- copy_replace_low_f64_m128d sse2 - Copies the a value and replaces the low lane with the low b value.
- crc32_u8 sse4.2 - Accumulates the u8 into a running CRC32 value.
- crc32_u16 sse4.2 - Accumulates the u16 into a running CRC32 value.
- crc32_u32 sse4.2 - Accumulates the u32 into a running CRC32 value.
- crc32_u64 sse4.2 - Accumulates the u64 into a running CRC32 value.
- div_m128 sse - Lanewise a / b.
- div_m256 avx - Lanewise a / b with f32.
- div_m128_s sse - Low lane a / b, other lanes unchanged.
- div_m128d sse2 - Lanewise a / b.
- div_m128d_s sse2 - Lowest lane a / b, high lane unchanged.
- div_m256d avx - Lanewise a / b with f64.
- dot_product_m128 sse4.1 - Performs a dot product of two m128 registers.
- dot_product_m256 avx - This works like dot_product_m128, but twice as wide.
- dot_product_m128d sse4.1 - Performs a dot product of two m128d registers.
- duplicate_even_lanes_m128 sse3 - Duplicate the even lanes to the odd lanes.
- duplicate_even_lanes_m256 avx - Duplicate the even-indexed lanes to the odd lanes.
- duplicate_low_lane_m128d_s sse3 - Copy the low lane of the input to both lanes of the output.
- duplicate_odd_lanes_m128 sse3 - Duplicate the odd lanes to the even lanes.
- duplicate_odd_lanes_m256 avx - Duplicate the odd-indexed lanes to the even lanes.
- duplicate_odd_lanes_m256d avx - Duplicate the odd-indexed lanes to the even lanes.
- extract_f32_as_i32_bits_imm_m128 sse4.1 - Gets the f32 lane requested. Returns as an i32 bit pattern.
- extract_i8_as_i32_imm_m128i sse4.1 - Gets the i8 lane requested. Only the lowest 4 bits are considered.
- extract_i8_as_i32_m256i avx2 - Gets an i8 value out of an m256i, returns as i32.
- extract_i16_as_i32_m128i sse2 - Gets an i16 value out of an m128i, returns as i32.
- extract_i16_as_i32_m256i avx2 - Gets an i16 value out of an m256i, returns as i32.
- extract_i32_from_m256i avx - Extracts an i32 lane from m256i
- extract_i32_imm_m128i sse4.1 - Gets the i32 lane requested. Only the lowest 2 bits are considered.
- extract_i64_from_m256i avx - Extracts an i64 lane from m256i
- extract_i64_imm_m128i sse4.1 - Gets the i64 lane requested. Only the lowest bit is considered.
- extract_m128_from_m256 avx - Extracts an m128 from m256
- extract_m128d_from_m256d avx - Extracts an m128d from m256d
- extract_m128i_from_m256i avx - Extracts an m128i from m256i
- extract_m128i_m256i avx2 - Gets an m128i value out of an m256i.
- floor_m128 sse4.1 - Round each lane to a whole number, towards negative infinity
- floor_m256 avx - Round f32 lanes towards negative infinity.
- floor_m128_s sse4.1 - Round the low lane of b toward negative infinity, other lanes a.
- floor_m128d sse4.1 - Round each lane to a whole number, towards negative infinity
- floor_
m128d_ s sse4.1 - Round the low lane of
btoward negative infinity, high lane isa. - floor_
m256d avx - Round
f64lanes towards negative infinity. - fused_
mul_ add_ m128 fma - Lanewise fused
(a * b) + c - fused_
mul_ add_ m256 fma - Lanewise fused
(a * b) + c - fused_
mul_ add_ m128_ s fma - Low lane fused
(a * b) + c, other lanes unchanged - fused_
mul_ add_ m128d fma - Lanewise fused
(a * b) + c - fused_
mul_ add_ m128d_ s fma - Low lane fused
(a * b) + c, other lanes unchanged - fused_
mul_ add_ m256d fma - Lanewise fused
(a * b) + c - fused_
mul_ addsub_ m128 fma - Lanewise fused
(a * b) addsub c(adds odd lanes and subtracts even lanes) - fused_
mul_ addsub_ m256 fma - Lanewise fused
(a * b) addsub c(adds odd lanes and subtracts even lanes) - fused_
mul_ addsub_ m128d fma - Lanewise fused
(a * b) addsub c(adds odd lanes and subtracts even lanes) - fused_
mul_ addsub_ m256d fma - Lanewise fused
(a * b) addsub c(adds odd lanes and subtracts even lanes) - fused_
mul_ neg_ add_ m128 fma - Lanewise fused
-(a * b) + c - fused_
mul_ neg_ add_ m256 fma - Lanewise fused
-(a * b) + c - fused_
mul_ neg_ add_ m128_ s fma - Low lane
-(a * b) + c, other lanes unchanged. - fused_
mul_ neg_ add_ m128d fma - Lanewise fused
-(a * b) + c - fused_
mul_ neg_ add_ m128d_ s fma - Low lane
-(a * b) + c, other lanes unchanged. - fused_
mul_ neg_ add_ m256d fma - Lanewise fused
-(a * b) + c - fused_
mul_ neg_ sub_ m128 fma - Lanewise fused
-(a * b) - c - fused_
mul_ neg_ sub_ m256 fma - Lanewise fused
-(a * b) - c - fused_
mul_ neg_ sub_ m128_ s fma - Low lane fused
-(a * b) - c, other lanes unchanged. - fused_
mul_ neg_ sub_ m128d fma - Lanewise fused
-(a * b) - c - fused_
mul_ neg_ sub_ m128d_ s fma - Low lane fused
-(a * b) - c, other lanes unchanged. - fused_
mul_ neg_ sub_ m256d fma - Lanewise fused
-(a * b) - c - fused_
mul_ sub_ m128 fma - Lanewise fused
(a * b) - c - fused_
mul_ sub_ m256 fma - Lanewise fused
(a * b) - c - fused_
mul_ sub_ m128_ s fma - Low lane fused
(a * b) - c, other lanes unchanged. - fused_
mul_ sub_ m128d fma - Lanewise fused
(a * b) - c - fused_
mul_ sub_ m128d_ s fma - Low lane fused
(a * b) - c, other lanes unchanged. - fused_
mul_ sub_ m256d fma - Lanewise fused
(a * b) - c - fused_
mul_ subadd_ m128 fma - Lanewise fused
(a * b) subadd c (subtracts odd lanes and adds even lanes) - fused_
mul_ subadd_ m256 fma - Lanewise fused
(a * b) subadd c (subtracts odd lanes and adds even lanes) - fused_
mul_ subadd_ m128d fma - Lanewise fused
(a * b) subadd c (subtracts odd lanes and adds even lanes) - fused_
mul_ subadd_ m256d fma - Lanewise fused
(a * b) subadd c (subtracts odd lanes and adds even lanes) - get_
f32_ from_ m128_ s sse - Gets the low lane as an individual
f32value. - get_
f64_ from_ m128d_ s sse2 - Gets the lower lane as an
f64value. - get_
i32_ from_ m128_ s sse - Converts the low lane to
i32and extracts as an individual value. - get_
i32_ from_ m128d_ s sse2 - Converts the lower lane to an
i32value. - get_
i32_ from_ m128i_ s sse2 - Converts the lower lane to an
i32value. - get_
i64_ from_ m128_ s sse - Converts the low lane to
i64and extracts as an individual value. - get_
i64_ from_ m128d_ s sse2 - Converts the lower lane to an
i64value. - get_
i64_ from_ m128i_ s sse2 - Converts the lower lane to an
i64value. - insert_
f32_ imm_ m128 sse4.1 - Inserts a lane from
$binto$a, optionally at a new position. - insert_
i8_ imm_ m128i sse4.1 - Inserts a new value for the
i8 lane specified. - insert_
i8_ to_ m256i avx - Inserts an
i8tom256i - insert_
i16_ from_ i32_ m128i sse2 - Inserts the low 16 bits of an
i32value into anm128i. - insert_
i16_ to_ m256i avx - Inserts an
i16tom256i - insert_
i32_ imm_ m128i sse4.1 - Inserts a new value for the
i32lane specified. - insert_
i32_ to_ m256i avx - Inserts an
i32tom256i - insert_
i64_ imm_ m128i sse4.1 - Inserts a new value for the
i64lane specified. - insert_
i64_ to_ m256i avx - Inserts an
i64tom256i - insert_
m128_ to_ m256 avx - Inserts an
m128tom256 - insert_
m128d_ to_ m256d avx - Inserts an
m128dtom256d - insert_
m128i_ to_ m256i avx2 - Inserts an
m128ito anm256iat the high or low position. - insert_
m128i_ to_ m256i_ slow_ avx avx - Slowly inserts an
m128itom256i. - leading_
zero_ count_ u32 lzcnt - Count the leading zeroes in a
u32. - leading_
zero_ count_ u64 lzcnt - Count the leading zeroes in a
u64. - load_
f32_ m128_ s sse - Loads the
f32reference into the low lane of the register. - load_
f32_ splat_ m128 sse - Loads the
f32reference into all lanes of a register. - load_
f32_ splat_ m256 avx - Load an
f32 and splat it to all lanes of an m256 - load_
f64_ m128d_ s sse2 - Loads the reference into the low lane of the register.
- load_
f64_ splat_ m128d sse2 - Loads the
f64reference into all lanes of a register. - load_
f64_ splat_ m256d avx - Load an
f64and splat it to all lanes of anm256d - load_
i64_ m128i_ s sse2 - Loads the low
i64into a register. - load_
m128 sse - Loads the reference into a register.
- load_
m256 avx - Load data from memory into a register.
- load_
m128_ splat_ m256 avx - Load an
m128and splat it to the lower and upper half of anm256 - load_
m128d sse2 - Loads the reference into a register.
- load_
m128d_ splat_ m256d avx - Load an
m128dand splat it to the lower and upper half of anm256d - load_
m128i sse2 - Loads the reference into a register.
- load_
m256d avx - Load data from memory into a register.
- load_
m256i avx - Load data from memory into a register.
- load_
masked_ i32_ m128i avx2 - Loads the reference given and zeroes any
i32lanes not in the mask. - load_
masked_ i32_ m256i avx2 - Loads the reference given and zeroes any
i32lanes not in the mask. - load_
masked_ i64_ m128i avx2 - Loads the reference given and zeroes any
i64lanes not in the mask. - load_
masked_ i64_ m256i avx2 - Loads the reference given and zeroes any
i64lanes not in the mask. - load_
masked_ m128 avx - Load data from memory into a register according to a mask.
- load_
masked_ m256 avx - Load data from memory into a register according to a mask.
- load_
masked_ m128d avx - Load data from memory into a register according to a mask.
- load_
masked_ m256d avx - Load data from memory into a register according to a mask.
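The masked loads listed above share one idea: lanes selected by the mask are read from memory and every other lane is zeroed. A minimal scalar sketch of that behaviour for four `i32` lanes follows. The helper name `load_masked_i32_model` is hypothetical and not part of this crate, and the rule "a lane is selected when the top bit of its mask lane is set" is an assumption taken from how the underlying masked-load instructions read their mask.

```rust
/// Scalar model of a masked `i32` load (hypothetical helper, not a `safe_arch` function).
/// Assumption: a lane is selected when the top bit of its mask lane is set.
fn load_masked_i32_model(src: &[i32; 4], mask: [i32; 4]) -> [i32; 4] {
    // Lanes that are not selected stay zero.
    let mut out = [0_i32; 4];
    for i in 0..4 {
        // A negative i32 has its top bit set.
        if mask[i] < 0 {
            out[i] = src[i];
        }
    }
    out
}
```

For instance, with source lanes `[1, 2, 3, 4]` and a mask of `[-1, 0, i32::MIN, 1]`, only lanes 0 and 2 are selected; the positive mask value `1` does not select its lane.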
- load_
replace_ high_ m128d sse2 - Loads the reference into a register, replacing the high lane.
- load_
replace_ low_ m128d sse2 - Loads the reference into a register, replacing the low lane.
- load_
reverse_ m128 sse - Loads the reference into a register with reversed order.
- load_
reverse_ m128d sse2 - Loads the reference into a register with reversed order.
- load_
unaligned_ hi_ lo_ m256 avx - Load data from memory into a register.
- load_
unaligned_ hi_ lo_ m256d avx - Load data from memory into a register.
- load_
unaligned_ hi_ lo_ m256i avx - Load data from memory into a register.
- load_
unaligned_ m128 sse - Loads the reference into a register.
- load_
unaligned_ m256 avx - Load data from memory into a register.
- load_
unaligned_ m128d sse2 - Loads the reference into a register.
- load_
unaligned_ m128i sse2 - Loads the reference into a register.
- load_
unaligned_ m256d avx - Load data from memory into a register.
- load_
unaligned_ m256i avx - Load data from memory into a register.
- max_
i8_ m128i sse4.1 - Lanewise
max(a, b)with lanes asi8. - max_
i8_ m256i avx2 - Lanewise
max(a, b)with lanes asi8. - max_
i16_ m128i sse2 - Lanewise
max(a, b)with lanes asi16. - max_
i16_ m256i avx2 - Lanewise
max(a, b)with lanes asi16. - max_
i32_ m128i sse4.1 - Lanewise
max(a, b)with lanes asi32. - max_
i32_ m256i avx2 - Lanewise
max(a, b)with lanes asi32. - max_
m128 sse - Lanewise
max(a, b). - max_
m256 avx - Lanewise
max(a, b). - max_
m128_ s sse - Low lane
max(a, b), other lanes unchanged. - max_
m128d sse2 - Lanewise
max(a, b). - max_
m128d_ s sse2 - Low lane
max(a, b), other lanes unchanged. - max_
m256d avx - Lanewise
max(a, b). - max_
u8_ m128i sse2 - Lanewise
max(a, b)with lanes asu8. - max_
u8_ m256i avx2 - Lanewise
max(a, b)with lanes asu8. - max_
u16_ m128i sse4.1 - Lanewise
max(a, b)with lanes asu16. - max_
u16_ m256i avx2 - Lanewise
max(a, b)with lanes asu16. - max_
u32_ m128i sse4.1 - Lanewise
max(a, b)with lanes asu32. - max_
u32_ m256i avx2 - Lanewise
max(a, b)with lanes asu32. - min_
i8_ m128i sse4.1 - Lanewise
min(a, b)with lanes asi8. - min_
i8_ m256i avx2 - Lanewise
min(a, b)with lanes asi8. - min_
i16_ m128i sse2 - Lanewise
min(a, b)with lanes asi16. - min_
i16_ m256i avx2 - Lanewise
min(a, b)with lanes asi16. - min_
i32_ m128i sse4.1 - Lanewise
min(a, b)with lanes asi32. - min_
i32_ m256i avx2 - Lanewise
min(a, b)with lanes asi32. - min_
m128 sse - Lanewise
min(a, b). - min_
m256 avx - Lanewise
min(a, b). - min_
m128_ s sse - Low lane
min(a, b), other lanes unchanged. - min_
m128d sse2 - Lanewise
min(a, b). - min_
m128d_ s sse2 - Low lane
min(a, b), other lanes unchanged. - min_
m256d avx - Lanewise
min(a, b). - min_
position_ u16_ m128i sse4.1 - Min
u16value, position, and other lanes zeroed. - min_
u8_ m128i sse2 - Lanewise
min(a, b)with lanes asu8. - min_
u8_ m256i avx2 - Lanewise
min(a, b)with lanes asu8. - min_
u16_ m128i sse4.1 - Lanewise
min(a, b)with lanes asu16. - min_
u16_ m256i avx2 - Lanewise
min(a, b)with lanes asu16. - min_
u32_ m128i sse4.1 - Lanewise
min(a, b)with lanes asu32. - min_
u32_ m256i avx2 - Lanewise
min(a, b)with lanes asu32. - move_
high_ low_ m128 sse - Move the high lanes of
bto the low lanes ofa, other lanes unchanged. - move_
low_ high_ m128 sse - Move the low lanes of
bto the high lanes ofa, other lanes unchanged. - move_
m128_ s sse - Move the low lane of
btoa, other lanes unchanged. - move_
mask_ i8_ m128i sse2 - Gathers the
i8sign bit of each lane. - move_
mask_ i8_ m256i avx2 - Create an
i32mask of each sign bit in thei8lanes. - move_
mask_ m128 sse - Gathers the sign bit of each lane.
- move_
mask_ m256 avx - Collects the sign bit of each lane into an 8-bit value.
- move_
mask_ m128d sse2 - Gathers the sign bit of each lane.
- move_
mask_ m256d avx - Collects the sign bit of each lane into a 4-bit value.
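The `move_mask_*` entries above all gather one sign bit per lane into the low bits of an integer. A scalar sketch of that operation for `f32` lanes is shown here. The helper `move_mask_model` is hypothetical (not part of this crate), and it assumes array index 0 is the lowest lane and that lane `i` maps to bit `i` of the result.

```rust
/// Scalar model of a sign-bit "move mask" over `f32` lanes
/// (hypothetical helper, not a `safe_arch` function).
/// Assumption: index 0 is the lowest lane and lane `i` sets bit `i`.
fn move_mask_model<const N: usize>(lanes: [f32; N]) -> i32 {
    let mut mask = 0_i32;
    for (i, lane) in lanes.iter().enumerate() {
        // `is_sign_negative` looks only at the sign bit, so -0.0 counts too.
        if lane.is_sign_negative() {
            mask |= 1 << i;
        }
    }
    mask
}
```

Four lanes can set at most four bits and eight lanes at most eight, which is why the width of the result depends on the lane count of the register type.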
- mul_
32_ m128i sse4.1 - Lanewise
a * bwith 32-bit lanes. - mul_
extended_ u32 bmi2 - Multiply two
u32, outputting the low bits and storing the high bits in the reference. - mul_
extended_ u64 bmi2 - Multiply two
u64, outputting the low bits and storing the high bits in the reference. - mul_
i16_ horizontal_ add_ m128i sse2 - Multiply
i16lanes producingi32values, horizontal add pairs ofi32values to produce the final output. - mul_
i16_ horizontal_ add_ m256i avx2 - Multiply
i16lanes producingi32values, horizontal add pairs ofi32values to produce the final output. - mul_
i16_ keep_ high_ m128i sse2 - Lanewise
a * bwith lanes asi16, keep the high bits of thei32intermediates. - mul_
i16_ keep_ high_ m256i avx2 - Multiply the
i16lanes and keep the high half of each 32-bit output. - mul_
i16_ keep_ low_ m128i sse2 - Lanewise
a * bwith lanes asi16, keep the low bits of thei32intermediates. - mul_
i16_ keep_ low_ m256i avx2 - Multiply the
i16lanes and keep the low half of each 32-bit output. - mul_
i16_ scale_ round_ m128i ssse3 - Multiply
i16lanes intoi32intermediates, keep the high 18 bits, round by adding 1, right shift by 1. - mul_
i16_ scale_ round_ m256i avx2 - Multiply
i16lanes intoi32intermediates, keep the high 18 bits, round by adding 1, right shift by 1. - mul_
i32_ keep_ low_ m256i avx2 - Multiply the
i32lanes and keep the low half of each 64-bit output. - mul_
i64_ carryless_ m128i pclmulqdq - Performs a “carryless” multiplication of two
i64values. - mul_
i64_ low_ bits_ m256i avx2 - Multiply the lower
i32within eachi64lane,i64output. - mul_
m128 sse - Lanewise
a * b. - mul_
m256 avx - Lanewise
a * bwithf32lanes. - mul_
m128_ s sse - Low lane
a * b, other lanes unchanged. - mul_
m128d sse2 - Lanewise
a * b. - mul_
m128d_ s sse2 - Lowest lane
a * b, high lane unchanged. - mul_
m256d avx - Lanewise
a * bwithf64lanes. - mul_
u8i8_ add_ horizontal_ saturating_ m128i ssse3 - Multiplies u8 lanes by i8 lanes, then adds horizontal pairs of the products with i16 saturation.
- mul_
u8i8_ add_ horizontal_ saturating_ m256i avx2 - Multiplies u8 lanes by i8 lanes, then adds horizontal pairs of the products with i16 saturation.
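The two `mul_u8i8_add_horizontal_saturating_*` entries above appear to correspond to the "multiply unsigned bytes by signed bytes, then add adjacent pairs with saturation" instruction. A scalar sketch of that computation for the 128-bit case follows. The helper `maddubs_model` is hypothetical and not part of this crate, and it assumes the first operand holds the `u8` lanes and the second the `i8` lanes, matching the `u8i8` order in the name.

```rust
/// Scalar model of "multiply u8 by i8, add horizontal pairs, saturate to i16"
/// (hypothetical helper, not a `safe_arch` function).
/// Assumption: `a` holds the unsigned lanes and `b` the signed lanes.
fn maddubs_model(a: [u8; 16], b: [i8; 16]) -> [i16; 8] {
    let mut out = [0_i16; 8];
    for i in 0..8 {
        // Each product fits in an i16; the sum of two products may not.
        let lo = i32::from(a[2 * i]) * i32::from(b[2 * i]);
        let hi = i32::from(a[2 * i + 1]) * i32::from(b[2 * i + 1]);
        // Saturate the pair sum to the i16 range instead of wrapping.
        out[i] = (lo + hi).clamp(i32::from(i16::MIN), i32::from(i16::MAX)) as i16;
    }
    out
}
```

The saturation matters at the extremes: two products of `255 * 127` sum to 64770, which does not fit in an `i16` and is clamped to `i16::MAX`.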
- mul_
u16_ keep_ high_ m128i sse2 - Lanewise
a * bwith lanes asu16, keep the high bits of theu32intermediates. - mul_
u16_ keep_ high_ m256i avx2 - Multiply the
u16lanes and keep the high half of each 32-bit output. - mul_
u64_ low_ bits_ m256i avx2 - Multiply the lower
u32within eachu64lane,u64output. - mul_
widen_ i32_ odd_ m128i sse4.1 - Multiplies the odd
i32lanes and gives the widened (i64) results. - mul_
widen_ u32_ odd_ m128i sse2 - Multiplies the odd
u32lanes and gives the widened (u64) results. - multi_
packed_ sum_ abs_ diff_ u8_ m128i sse4.1 - Computes eight
u16“sum of absolute difference” values according to the bytes selected. - multi_
packed_ sum_ abs_ diff_ u8_ m256i avx2 - Computes eight
u16“sum of absolute difference” values according to the bytes selected. - pack_
i16_ to_ i8_ m128i sse2 - Saturating convert
i16toi8, and pack the values. - pack_
i16_ to_ i8_ m256i avx2 - Saturating convert
i16toi8, and pack the values. - pack_
i16_ to_ u8_ m128i sse2 - Saturating convert
i16tou8, and pack the values. - pack_
i16_ to_ u8_ m256i avx2 - Saturating convert
i16tou8, and pack the values. - pack_
i32_ to_ i16_ m128i sse2 - Saturating convert
i32toi16, and pack the values. - pack_
i32_ to_ i16_ m256i avx2 - Saturating convert
i32toi16, and pack the values. - pack_
i32_ to_ u16_ m128i sse4.1 - Saturating convert
i32tou16, and pack the values. - pack_
i32_ to_ u16_ m256i avx2 - Saturating convert
i32tou16, and pack the values. - permute2z_
m256 avx - Shuffle 128 bits of floating point data at a time from
$aand$busing an immediate control value. - permute2z_
m256d avx - Shuffle 128 bits of floating point data at a time from
aandbusing an immediate control value. - permute2z_
m256i avx - Slowly swizzle 128 bits of integer data from
aandbusing an immediate control value. - permute_
m128 avx - Shuffle the
f32lanes fromausing an immediate control value. - permute_
m256 avx - Shuffle the
f32lanes inausing an immediate control value. - permute_
m128d avx - Shuffle the
f64lanes inausing an immediate control value. - permute_
m256d avx - Shuffle the
f64lanes fromatogether using an immediate control value. - population_
count_ i32 popcnt - Count the number of bits set within an
i32 - population_
count_ i64 popcnt - Count the number of bits set within an
i64 - population_
deposit_ u32 bmi2 - Deposit contiguous low bits from a
u32according to a mask. - population_
deposit_ u64 bmi2 - Deposit contiguous low bits from a
u64according to a mask. - population_
extract_ u32 bmi2 - Extract bits from a
u32according to a mask. - population_
extract_ u64 bmi2 - Extract bits from a
u64according to a mask. - prefetch_
et0 sse - Fetches the cache line containing
addrinto all levels of the cache hierarchy, anticipating write - prefetch_
et1 sse - Fetches into L2 and higher, anticipating write
- prefetch_
nta sse - Fetch data using the non-temporal access (NTA) hint. It may be a place closer than main memory but outside of the cache hierarchy. This is used to reduce access latency without polluting the cache.
- prefetch_
t0 sse - Fetches the cache line containing
addrinto all levels of the cache hierarchy. - prefetch_
t1 sse - Fetches into L2 and higher.
- prefetch_
t2 sse - Fetches into L3 and higher or an implementation-specific choice (e.g., L2 if there is no L3).
- rdrand_
u16 rdrand - Try to obtain a random
u16from the hardware RNG. - rdrand_
u32 rdrand - Try to obtain a random
u32from the hardware RNG. - rdrand_
u64 rdrand - Try to obtain a random
u64from the hardware RNG. - rdseed_
u16 rdseed - Try to obtain a random
u16from the hardware RNG. - rdseed_
u32 rdseed - Try to obtain a random
u32from the hardware RNG. - rdseed_
u64 rdseed - Try to obtain a random
u64from the hardware RNG. - read_
timestamp_ counter - Reads the CPU’s timestamp counter value.
- read_
timestamp_ counter_ p - Reads the CPU’s timestamp counter value and stores the processor signature.
- reciprocal_
m128 sse - Lanewise
1.0 / aapproximation. - reciprocal_
m256 avx - Reciprocal of
f32lanes. - reciprocal_
m128_ s sse - Low lane
1.0 / aapproximation, other lanes unchanged. - reciprocal_
sqrt_ m128 sse - Lanewise
1.0 / sqrt(a)approximation. - reciprocal_
sqrt_ m256 avx - Reciprocal square root of
f32lanes. - reciprocal_
sqrt_ m128_ s sse - Low lane
1.0 / sqrt(a)approximation, other lanes unchanged. - round_
m128 sse4.1 - Rounds each lane in the style specified.
- round_
m256 avx - Rounds each lane in the style specified.
- round_
m128_ s sse4.1 - Rounds
$blow as specified, other lanes use$a. - round_
m128d sse4.1 - Rounds each lane in the style specified.
- round_
m128d_ s sse4.1 - Rounds
$blow as specified, keeps$ahigh. - round_
m256d avx - Rounds each lane in the style specified.
- search_
explicit_ str_ for_ index sse4.2 - Search for
needle in haystack, with explicit string length. - search_
explicit_ str_ for_ mask sse4.2 - Search for
needle in haystack, with explicit string length. - search_
implicit_ str_ for_ index sse4.2 - Search for
needle in haystack, with implicit string length. - search_
implicit_ str_ for_ mask sse4.2 - Search for
needle in haystack, with implicit string length. - set_
i8_ m128i sse2 - Sets the args into an
m128i, first arg is the high lane. - set_
i8_ m256i avx - Set
i8args into anm256ilane. - set_
i16_ m128i sse2 - Sets the args into an
m128i, first arg is the high lane. - set_
i16_ m256i avx - Set
i16args into anm256ilane. - set_
i32_ m128i sse2 - Sets the args into an
m128i, first arg is the high lane. - set_
i32_ m128i_ s sse2 - Set an
i32as the low 32-bit lane of anm128i, other lanes blank. - set_
i32_ m256i avx - Set
i32args into anm256ilane. - set_
i64_ m128i sse2 - Sets the args into an
m128i, first arg is the high lane. - set_
i64_ m128i_ s sse2 - Set an
i64as the low 64-bit lane of anm128i, other lanes blank. - set_
i64_ m256i avx - Set
i64args into anm256ilane. - set_
m128 sse - Sets the args into an
m128, first arg is the high lane. - set_
m256 avx - Set
f32args into anm256lane. - set_
m128_ m256 avx - Set
m128args into anm256. - set_
m128_ s sse - Sets the arg into the low lane of an
m128. - set_
m128d sse2 - Sets the args into an
m128d, first arg is the high lane. - set_
m128d_ m256d avx - Set
m128dargs into anm256d. - set_
m128d_ s sse2 - Sets the args into the low lane of a
m128d. - set_
m128i_ m256i avx - Set
m128iargs into anm256i. - set_
m256d avx - Set
f64args into anm256dlane. - set_
reversed_ i8_ m128i sse2 - Sets the args into an
m128i, first arg is the low lane. - set_
reversed_ i8_ m256i avx - Set
i8args into anm256ilane. - set_
reversed_ i16_ m128i sse2 - Sets the args into an
m128i, first arg is the low lane. - set_
reversed_ i16_ m256i avx - Set
i16args into anm256ilane. - set_
reversed_ i32_ m128i sse2 - Sets the args into an
m128i, first arg is the low lane. - set_
reversed_ i32_ m256i avx - Set
i32args into anm256ilane. - set_
reversed_ i64_ m256i avx - Set
i64args into anm256ilane. - set_
reversed_ m128 sse - Sets the args into an
m128, first arg is the low lane. - set_
reversed_ m256 avx - Set
f32args into anm256lane. - set_
reversed_ m128_ m256 avx - Set
m128args into anm256. - set_
reversed_ m128d sse2 - Sets the args into an
m128d, first arg is the low lane. - set_
reversed_ m128d_ m256d avx - Set
m128dargs into anm256d. - set_
reversed_ m128i_ m256i avx - Set
m128iargs into anm256i. - set_
reversed_ m256d avx - Set
f64args into anm256dlane. - set_
splat_ i8_ m128i sse2 - Splats the
i8to all lanes of them128i. - set_
splat_ i8_ m128i_ s_ m256i avx2 - Sets the lowest
i8lane of anm128ias all lanes of anm256i. - set_
splat_ i8_ m256i avx - Splat an
i8arg into anm256ilane. - set_
splat_ i16_ m128i sse2 - Splats the
i16to all lanes of them128i. - set_
splat_ i16_ m128i_ s_ m256i avx2 - Sets the lowest
i16lane of anm128ias all lanes of anm256i. - set_
splat_ i16_ m256i avx - Splat an
i16arg into anm256ilane. - set_
splat_ i32_ m128i sse2 - Splats the
i32to all lanes of them128i. - set_
splat_ i32_ m128i_ s_ m256i avx2 - Sets the lowest
i32lane of anm128ias all lanes of anm256i. - set_
splat_ i32_ m256i avx - Splat an
i32arg into anm256ilane. - set_
splat_ i64_ m128i sse2 - Splats the
i64to both lanes of them128i. - set_
splat_ i64_ m128i_ s_ m256i avx2 - Sets the lowest
i64lane of anm128ias all lanes of anm256i. - set_
splat_ i64_ m256i avx - Splat an
i64arg into anm256ilane. - set_
splat_ m128 sse - Splats the value to all lanes.
- set_
splat_ m256 avx - Splat an
f32arg into anm256lane. - set_
splat_ m128_ s_ m256 avx2 - Sets the lowest lane of an
m128as all lanes of anm256. - set_
splat_ m128d sse2 - Splats the args into both lanes of the
m128d. - set_
splat_ m128d_ s_ m256d avx2 - Sets the lowest lane of an
m128das all lanes of anm256d. - set_
splat_ m256d avx - Splat an
f64arg into anm256dlane. - shl_
all_ u16_ m128i sse2 - Shift all
u16lanes to the left by thecountin the loweru64lane. - shl_
all_ u16_ m256i avx2 - Lanewise
u16shift left by the loweru64lane ofcount. - shl_
all_ u32_ m128i sse2 - Shift all
u32lanes to the left by thecountin the loweru64lane. - shl_
all_ u32_ m256i avx2 - Shift all
u32lanes left by the loweru64lane ofcount. - shl_
all_ u64_ m128i sse2 - Shift all
u64lanes to the left by thecountin the loweru64lane. - shl_
all_ u64_ m256i avx2 - Shift all
u64lanes left by the loweru64lane ofcount. - shl_
each_ u32_ m128i avx2 - Shift
u32values to the left bycountbits. - shl_
each_ u32_ m256i avx2 - Lanewise
u32shift left by the matchingi32lane incount. - shl_
each_ u64_ m128i avx2 - Shift
u64values to the left bycountbits. - shl_
each_ u64_ m256i avx2 - Lanewise
u64shift left by the matchingu64lane incount. - shl_
imm_ u16_ m128i sse2 - Shifts all
u16lanes left by an immediate. - shl_
imm_ u16_ m256i avx2 - Shifts all
u16lanes left by an immediate. - shl_
imm_ u32_ m128i sse2 - Shifts all
u32lanes left by an immediate. - shl_
imm_ u32_ m256i avx2 - Shifts all
u32lanes left by an immediate. - shl_
imm_ u64_ m128i sse2 - Shifts both
u64lanes left by an immediate. - shl_
imm_ u64_ m256i avx2 - Shifts all
u64lanes left by an immediate. - shr_
all_ i16_ m128i sse2 - Shift each
i16lane to the right by thecountin the loweri64lane. - shr_
all_ i16_ m256i avx2 - Lanewise
i16shift right by the loweri64lane ofcount. - shr_
all_ i32_ m128i sse2 - Shift each
i32lane to the right by thecountin the loweri64lane. - shr_
all_ i32_ m256i avx2 - Lanewise
i32shift right by the loweri64lane ofcount. - shr_
all_ u16_ m128i sse2 - Shift each
u16lane to the right by thecountin the loweru64lane. - shr_
all_ u16_ m256i avx2 - Lanewise
u16shift right by the loweru64lane ofcount. - shr_
all_ u32_ m128i sse2 - Shift each
u32lane to the right by thecountin the loweru64lane. - shr_
all_ u32_ m256i avx2 - Lanewise
u32shift right by the loweru64lane ofcount. - shr_
all_ u64_ m128i sse2 - Shift each
u64lane to the right by thecountin the loweru64lane. - shr_
all_ u64_ m256i avx2 - Lanewise
u64shift right by the loweru64lane ofcount. - shr_
each_ i32_ m128i avx2 - Shift
i32values to the right bycountbits. - shr_
each_ i32_ m256i avx2 - Lanewise
i32shift right by the matchingi32lane incount. - shr_
each_ u32_ m128i avx2 - Shift
u32 values to the right by count bits. - shr_
each_ u32_ m256i avx2 - Lanewise
u32shift right by the matchingu32lane incount. - shr_
each_ u64_ m128i avx2 - Shift
u64 values to the right by count bits. - shr_
each_ u64_ m256i avx2 - Lanewise
u64shift right by the matchingi64lane incount. - shr_
imm_ i16_ m128i sse2 - Shifts all
i16lanes right by an immediate. - shr_
imm_ i16_ m256i avx2 - Shifts all
i16 lanes right by an immediate. - shr_
imm_ i32_ m128i sse2 - Shifts all
i32lanes right by an immediate. - shr_
imm_ i32_ m256i avx2 - Shifts all
i32 lanes right by an immediate. - shr_
imm_ u16_ m128i sse2 - Shifts all
u16lanes right by an immediate. - shr_
imm_ u16_ m256i avx2 - Shifts all
u16lanes right by an immediate. - shr_
imm_ u32_ m128i sse2 - Shifts all
u32lanes right by an immediate. - shr_
imm_ u32_ m256i avx2 - Shifts all
u32lanes right by an immediate. - shr_
imm_ u64_ m128i sse2 - Shifts both
u64lanes right by an immediate. - shr_
imm_ u64_ m256i avx2 - Shifts all
u64lanes right by an immediate. - shuffle_
abi_ f32_ all_ m128 sse - Shuffle the
f32lanes from$aand$btogether using an immediate control value. - shuffle_
abi_ f64_ all_ m128d sse2 - Shuffle the
f64lanes from$aand$btogether using an immediate control value. - shuffle_
abi_ i128z_ all_ m256i avx2 - Shuffle 128 bits of integer data from
$aand$busing an immediate control value. - shuffle_
ai_ f32_ all_ m128i sse2 - Shuffle the
i32lanes in$ausing an immediate control value. - shuffle_
ai_ f64_ all_ m256d avx2 - Shuffle the
f64lanes from$ausing an immediate control value. - shuffle_
ai_ i16_ h64all_ m128i sse2 - Shuffle the high
i16lanes in$ausing an immediate control value. - shuffle_
ai_ i16_ h64half_ m256i avx2 - Shuffle the high
i16lanes in$ausing an immediate control value. - shuffle_
ai_ i16_ l64all_ m128i sse2 - Shuffle the low
i16lanes in$ausing an immediate control value. - shuffle_
ai_ i16_ l64half_ m256i avx2 - Shuffle the low
i16lanes in$ausing an immediate control value. - shuffle_
ai_ i32_ half_ m256i avx2 - Shuffle the
i32lanes inausing an immediate control value. - shuffle_
ai_ i64_ all_ m256i avx2 - Shuffle the
i64 lanes in $a using an immediate control value. - shuffle_
av_ f32_ all_ m128 avx - Shuffle
f32values inausingi32values inv. - shuffle_
av_ f32_ half_ m256 avx - Shuffle
f32values inausingi32values inv. - shuffle_
av_ f64_ all_ m128d avx - Shuffle
f64lanes inausing bit 1 of thei64lanes inv - shuffle_
av_ f64_ half_ m256d avx - Shuffle
f64lanes inausing bit 1 of thei64lanes inv. - shuffle_
av_ i8z_ all_ m128i ssse3 - Shuffle
i8lanes inausingi8values inv. - shuffle_
av_ i8z_ half_ m256i avx2 - Shuffle
i8lanes inausingi8values inv. - shuffle_
av_ i32_ all_ m256 avx2 - Shuffle
f32lanes inausingi32values inv. - shuffle_
av_ i32_ all_ m256i avx2 - Shuffle
i32lanes inausingi32values inv. - shuffle_
m256 avx - Shuffle the
f32lanes fromaandbtogether using an immediate control value. - shuffle_
m256d avx - Shuffle the
f64lanes fromaandbtogether using an immediate control value. - sign_
apply_ i8_ m128i ssse3 - Applies the sign of
i8values inbto the values ina. - sign_
apply_ i8_ m256i avx2 - Lanewise
a * signum(b)with lanes asi8 - sign_
apply_ i16_ m128i ssse3 - Applies the sign of
i16values inbto the values ina. - sign_
apply_ i16_ m256i avx2 - Lanewise
a * signum(b)with lanes asi16 - sign_
apply_ i32_ m128i ssse3 - Applies the sign of
i32values inbto the values ina. - sign_
apply_ i32_ m256i avx2 - Lanewise
a * signum(b)with lanes asi32 - splat_
i8_ m128i_ s_ m128i avx2 - Splat the lowest 8-bit lane across the entire 128 bits.
- splat_
i16_ m128i_ s_ m128i avx2 - Splat the lowest 16-bit lane across the entire 128 bits.
- splat_
i32_ m128i_ s_ m128i avx2 - Splat the lowest 32-bit lane across the entire 128 bits.
- splat_
i64_ m128i_ s_ m128i avx2 - Splat the lowest 64-bit lane across the entire 128 bits.
- splat_
m128_ s_ m128 avx2 - Splat the lowest
f32across all four lanes. - splat_
m128d_ s_ m128d avx2 - Splat the lower
f64across both lanes ofm128d. - splat_
m128i_ m256i avx2 - Splat the 128-bits across 256-bits.
- sqrt_
m128 sse - Lanewise
sqrt(a). - sqrt_
m256 avx - Lanewise
sqrtonf32lanes. - sqrt_
m128_ s sse - Low lane
sqrt(a), other lanes unchanged. - sqrt_
m128d sse2 - Lanewise
sqrt(a). - sqrt_
m128d_ s sse2 - Low lane
sqrt(b), upper lane is unchanged froma. - sqrt_
m256d avx - Lanewise
sqrtonf64lanes. - store_
high_ m128d_ s sse2 - Stores the high lane value to the reference given.
- store_
i64_ m128i_ s sse2 - Stores the value to the reference given.
- store_
m128 sse - Stores the value to the reference given.
- store_
m256 avx - Store data from a register into memory.
- store_
m128_ s sse - Stores the low lane value to the reference given.
- store_
m128d sse2 - Stores the value to the reference given.
- store_
m128d_ s sse2 - Stores the low lane value to the reference given.
- store_
m128i sse2 - Stores the value to the reference given.
- store_
m256d avx - Store data from a register into memory.
- store_
m256i avx - Store data from a register into memory.
- store_
masked_ i32_ m128i avx2 - Stores the
i32masked lanes given to the reference. - store_
masked_ i32_ m256i avx2 - Stores the
i32masked lanes given to the reference. - store_
masked_ i64_ m128i avx2 - Stores the
i64 masked lanes given to the reference. - store_
masked_ i64_ m256i avx2 - Stores the
i64 masked lanes given to the reference. - store_
masked_ m128 avx - Store data from a register into memory according to a mask.
- store_
masked_ m256 avx - Store data from a register into memory according to a mask.
- store_
masked_ m128d avx - Store data from a register into memory according to a mask.
- store_
masked_ m256d avx - Store data from a register into memory according to a mask.
- store_
reverse_ m128 sse - Stores the value to the reference given in reverse order.
- store_
reversed_ m128d sse2 - Stores the value to the reference given in reverse order.
- store_
splat_ m128 sse - Stores the low lane value to all lanes of the reference given.
- store_
splat_ m128d sse2 - Stores the low lane value to all lanes of the reference given.
- store_
unaligned_ hi_ lo_ m256 avx - Store data from a register into memory.
- store_
unaligned_ hi_ lo_ m256d avx - Store data from a register into memory.
- store_
unaligned_ hi_ lo_ m256i avx - Store data from a register into memory.
- store_
unaligned_ m128 sse - Stores the value to the reference given.
- store_
unaligned_ m256 avx - Store data from a register into memory.
- store_
unaligned_ m128d sse2 - Stores the value to the reference given.
- store_
unaligned_ m128i sse2 - Stores the value to the reference given.
- store_
unaligned_ m256d avx - Store data from a register into memory.
- store_
unaligned_ m256i avx - Store data from a register into memory.
- sub_
horizontal_ i16_ m128i ssse3 - Subtract horizontal pairs of
i16values, pack the outputs asathenb. - sub_
horizontal_ i16_ m256i avx2 - Horizontal
a - bwith lanes asi16. - sub_
horizontal_ i32_ m128i ssse3 - Subtract horizontal pairs of
i32values, pack the outputs asathenb. - sub_
horizontal_ i32_ m256i avx2 - Horizontal
a - bwith lanes asi32. - sub_
horizontal_ m128 sse3 - Subtract each lane horizontally, pack the outputs as
athenb. - sub_
horizontal_ m256 avx - Subtract adjacent
f32lanes. - sub_
horizontal_ m128d sse3 - Subtract each lane horizontally, pack the outputs as
athenb. - sub_
horizontal_ m256d avx - Subtract adjacent
f64lanes. - sub_
horizontal_ saturating_ i16_ m128i ssse3 - Subtract horizontal pairs of
i16values, saturating, pack the outputs asathenb. - sub_
horizontal_ saturating_ i16_ m256i avx2 - Horizontal saturating
a - bwith lanes asi16. - sub_
i8_ m128i sse2 - Lanewise
a - bwith lanes asi8. - sub_
i8_ m256i avx2 - Lanewise
a - bwith lanes asi8. - sub_
i16_ m128i sse2 - Lanewise
a - bwith lanes asi16. - sub_
i16_ m256i avx2 - Lanewise
a - bwith lanes asi16. - sub_
i32_ m128i sse2 - Lanewise
a - bwith lanes asi32. - sub_
i32_ m256i avx2 - Lanewise
a - bwith lanes asi32. - sub_
i64_ m128i sse2 - Lanewise
a - bwith lanes asi64. - sub_
i64_ m256i avx2 - Lanewise
a - bwith lanes asi64. - sub_
m128 sse - Lanewise
a - b. - sub_
m256 avx - Lanewise
a - bwithf32lanes. - sub_
m128_ s sse - Low lane
a - b, other lanes unchanged. - sub_
m128d sse2 - Lanewise
a - b. - sub_
m128d_ s sse2 - Lowest lane
a - b, high lane unchanged. - sub_
m256d avx - Lanewise
a - bwithf64lanes. - sub_
saturating_ i8_ m128i sse2 - Lanewise saturating
a - bwith lanes asi8. - sub_
saturating_ i8_ m256i avx2 - Lanewise saturating
a - bwith lanes asi8. - sub_
saturating_ i16_ m128i sse2 - Lanewise saturating
a - bwith lanes asi16. - sub_
saturating_ i16_ m256i avx2 - Lanewise saturating
a - bwith lanes asi16. - sub_
saturating_ u8_ m128i sse2 - Lanewise saturating
a - bwith lanes asu8. - sub_
saturating_ u8_ m256i avx2 - Lanewise saturating
a - bwith lanes asu8. - sub_
saturating_ u16_ m128i sse2 - Lanewise saturating
a - bwith lanes asu16. - sub_
saturating_ u16_ m256i avx2 - Lanewise saturating
a - bwith lanes asu16. - sum_
of_ u8_ abs_ diff_ m128i sse2 - Compute “sum of
u8absolute differences”. - sum_
of_ u8_ abs_ diff_ m256i avx2 - Compute “sum of
u8absolute differences”. - test_
all_ ones_ m128i sse4.1 - Tests if all bits are 1.
- test_
all_ zeroes_ m128i sse4.1 - Returns if all masked bits are 0,
(a & mask) as u128 == 0 - test_
mixed_ ones_ and_ zeroes_ m128i sse4.1 - Returns if, among the masked bits, there’s both 0s and 1s
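The three `test_*` entries above can be read as plain bit tests on a 128-bit value. A scalar sketch using `u128` in place of `m128i` follows. The helper names are hypothetical (not part of this crate), and returning `bool` is an assumption made for readability; the `test_all_zeroes` model is the `(a & mask) as u128 == 0` expression from the entry above.

```rust
/// Scalar models of the 128-bit bit tests, with `u128` standing in for `m128i`
/// (hypothetical helpers, not `safe_arch` functions).

/// True when every bit selected by `mask` is 0 in `a`.
fn test_all_zeroes_model(a: u128, mask: u128) -> bool {
    (a & mask) == 0
}

/// True when every bit of `a` is 1.
fn test_all_ones_model(a: u128) -> bool {
    a == u128::MAX
}

/// True when the bits selected by `mask` contain at least one 1 and at least one 0.
fn test_mixed_model(a: u128, mask: u128) -> bool {
    (a & mask) != 0 && (!a & mask) != 0
}
```

With `a = 0b1010`, a mask of `0b0110` selects one set bit and one clear bit, so the mixed test holds, while a mask of `0b1010` selects only set bits and a mask of `0b0101` selects only clear bits.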
- testc_
m128 avx - Compute the bitwise NOT of the sign bits of
a and then AND with the sign bits of b, returns 1 if the result is zero, otherwise 0. - testc_
m256 avx - Compute the bitwise NOT of the sign bits of
a and then AND with the sign bits of b, returns 1 if the result is zero, otherwise 0. - testc_
m128d avx - Compute the bitwise NOT of the sign bits of
a and then AND with the sign bits of b, returns 1 if the result is zero, otherwise 0. - testc_
m128i sse4.1 - Compute the bitwise NOT of
a and then AND with b, returns 1 if the result is zero, otherwise 0. - testc_
m256d avx - Compute the bitwise NOT of the sign bits of
a and then AND with the sign bits of b, returns 1 if the result is zero, otherwise 0. - testc_
m256i avx - Compute the bitwise NOT of
a and then AND with b, returns 1 if the result is zero, otherwise 0. - testz_
m128 avx - Computes the bitwise AND of the sign bits in
a and b, returns 1 if the result is zero, otherwise 0. - testz_
m256 avx - Computes the bitwise AND of the sign bits in
a and b, returns 1 if the result is zero, otherwise 0. - testz_
m128d avx - Computes the bitwise AND of the sign bits in
a and b, returns 1 if the result is zero, otherwise 0. - testz_
m128i sse4.1 - Computes the bitwise AND of 128 bits in
a and b, returns 1 if the result is zero, otherwise 0. - testz_
m256d avx - Computes the bitwise AND of the sign bits in
a and b, returns 1 if the result is zero, otherwise 0. - testz_
m256i avx - Computes the bitwise AND of 256 bits in
a and b, returns 1 if the result is zero, otherwise 0. - trailing_
zero_count_u32 bmi1 - Counts the number of trailing zero bits in a u32.
- trailing_zero_count_u64 bmi1 - Counts the number of trailing zero bits in a u64.
- transpose_four_m128 sse - Transpose four m128 as if they were a 4x4 matrix.
- truncate_m128_to_m128i sse2 - Truncate the f32 lanes to i32 lanes.
- truncate_m128d_to_m128i sse2 - Truncate the f64 lanes to the lower i32 lanes (upper i32 lanes 0).
- truncate_to_i32_m128d_s sse2 - Truncate the lower lane into an i32.
- truncate_to_i64_m128d_s sse2 - Truncate the lower lane into an i64.
- unpack_hi_m256 avx - Unpack and interleave the high lanes.
- unpack_hi_m256d avx - Unpack and interleave the high lanes.
- unpack_high_i8_m128i sse2 - Unpack and interleave high i8 lanes of a and b.
- unpack_high_i8_m256i avx2 - Unpack and interleave high i8 lanes of a and b.
- unpack_high_i16_m128i sse2 - Unpack and interleave high i16 lanes of a and b.
- unpack_high_i16_m256i avx2 - Unpack and interleave high i16 lanes of a and b.
- unpack_high_i32_m128i sse2 - Unpack and interleave high i32 lanes of a and b.
- unpack_high_i32_m256i avx2 - Unpack and interleave high i32 lanes of a and b.
- unpack_high_i64_m128i sse2 - Unpack and interleave high i64 lanes of a and b.
- unpack_high_i64_m256i avx2 - Unpack and interleave high i64 lanes of a and b.
- unpack_high_m128 sse - Unpack and interleave high lanes of a and b.
- unpack_high_m128d sse2 - Unpack and interleave high lanes of a and b.
- unpack_lo_m256 avx - Unpack and interleave the low lanes.
- unpack_lo_m256d avx - Unpack and interleave the low lanes.
- unpack_low_i8_m128i sse2 - Unpack and interleave low i8 lanes of a and b.
- unpack_low_i8_m256i avx2 - Unpack and interleave low i8 lanes of a and b.
- unpack_low_i16_m128i sse2 - Unpack and interleave low i16 lanes of a and b.
- unpack_low_i16_m256i avx2 - Unpack and interleave low i16 lanes of a and b.
- unpack_low_i32_m128i sse2 - Unpack and interleave low i32 lanes of a and b.
- unpack_low_i32_m256i avx2 - Unpack and interleave low i32 lanes of a and b.
- unpack_low_i64_m128i sse2 - Unpack and interleave low i64 lanes of a and b.
- unpack_low_i64_m256i avx2 - Unpack and interleave low i64 lanes of a and b.
- unpack_low_m128 sse - Unpack and interleave low lanes of a and b.
- unpack_low_m128d sse2 - Unpack and interleave low lanes of a and b.
- zero_extend_m128 avx - Zero extend an m128 to m256.
- zero_extend_m128d avx - Zero extend an m128d to m256d.
- zero_extend_m128i avx - Zero extend an m128i to m256i.
- zeroed_m128 sse - All lanes zero.
- zeroed_m256 avx - A zeroed m256.
- zeroed_m128d sse2 - Both lanes zero.
- zeroed_m128i sse2 - All lanes zero.
- zeroed_m256d avx - A zeroed m256d.
- zeroed_m256i avx - A zeroed m256i.
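
The unpack_* and testz_*/testc_* entries above describe lane and bit semantics that are easy to mix up, so here is a minimal scalar sketch of what they compute. It assumes a 128-bit register viewed either as four i32 lanes or as a single u128. The helper names (unpack_low_i32x4, unpack_high_i32x4, testz_u128, testc_u128) are hypothetical stand-ins written for illustration; they are not part of safe_arch and use no intrinsics.

```rust
/// Scalar model of a "low" unpack over four i32 lanes (hypothetical helper):
/// interleave the lower two lanes of `a` with the lower two lanes of `b`.
fn unpack_low_i32x4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [a[0], b[0], a[1], b[1]]
}

/// Scalar model of a "high" unpack over four i32 lanes (hypothetical helper):
/// interleave the upper two lanes of `a` with the upper two lanes of `b`.
fn unpack_high_i32x4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [a[2], b[2], a[3], b[3]]
}

/// Scalar model of an integer "testz" (hypothetical helper):
/// 1 if `a & b` has no bits set, otherwise 0.
fn testz_u128(a: u128, b: u128) -> i32 {
    ((a & b) == 0) as i32
}

/// Scalar model of an integer "testc" (hypothetical helper):
/// 1 if `!a & b` has no bits set, i.e. every bit set in `b` is also set in `a`.
fn testc_u128(a: u128, b: u128) -> i32 {
    ((!a & b) == 0) as i32
}
```

With a = [0, 1, 2, 3] and b = [10, 11, 12, 13], the low unpack gives [0, 10, 1, 11] and the high unpack gives [2, 12, 3, 13]. In this model the floating point variants of testz and testc differ only in that they look at the sign bit of each lane rather than at every bit.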