Module safe_arch::intel

Types and functions for safe x86 / x86_64 intrinsic usage.

x86_64 is essentially a superset of x86, so we just lump it all into one module.

Naming Conventions

The actual intrinsic names are a flaming dumpster, so we use easier-to-understand names.

  • The general naming scheme is that the operation of the function is followed by the name of the type it operates on, for example add_m128 or sqrt_m128d.
  • If the function affects only the lowest lane then it has _s on the end after the type, because that's a "scalar" operation.
  • Many functions with a "bool-ish" return value have _mask in their name. These are the comparison functions: the return value is all 0s in a lane for "false" in that lane, and all 1s in a lane for "true" in that lane. Because a floating point value with all bits set is NaN, the mask-making functions aren't generally useful on their own; the mask is just an intermediate value.
  • convert functions will round to an approximate numeric value.
  • cast functions will preserve the bit patterns involved.
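The _mask convention above can be sketched with a scalar model. This is only an illustration using plain arrays in place of registers; the names cmp_eq_mask_model and select_model are hypothetical and are not functions of this module. It shows why an all-1s "true" lane reads as NaN when viewed as f32, and how a mask is typically combined with bitwise operations to pick values per lane.

```rust
/// Scalar model of a lanewise `a == b` mask over four f32 lanes.
/// Each output lane holds the bit pattern 0xFFFF_FFFF (true) or 0 (false).
fn cmp_eq_mask_model(a: [f32; 4], b: [f32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = if a[i] == b[i] { u32::MAX } else { 0 };
    }
    out
}

/// Uses a mask to pick `t` where the mask is set and `f` elsewhere,
/// computed on the raw bits as (mask & t) | (!mask & f).
fn select_model(mask: [u32; 4], t: [f32; 4], f: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        let bits = (mask[i] & t[i].to_bits()) | (!mask[i] & f[i].to_bits());
        out[i] = f32::from_bits(bits);
    }
    out
}
```

In the real module the same pattern would presumably be built from a cmp_*_mask_* call followed by the and, andnot, and or functions.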

Structs

m128

The data for a 128-bit SSE register of four f32 lanes.

m128d

The data for a 128-bit SSE register of two f64 values.

m128i

The data for a 128-bit SSE register of integer data.

Functions

add_i16_m128i

Lanewise a + b with lanes as i16.

add_i32_m128i

Lanewise a + b with lanes as i32.

add_i64_m128i

Lanewise a + b with lanes as i64.

add_i8_m128i

Lanewise a + b with lanes as i8.

add_m128

Lanewise a + b.

add_m128_s

Low lane a + b, other lanes unchanged.

add_m128d

Lanewise a + b.

add_m128d_s

Lowest lane a + b, high lane unchanged.

add_saturating_i16_m128i

Lanewise saturating a + b with lanes as i16.

add_saturating_i8_m128i

Lanewise saturating a + b with lanes as i8.

add_saturating_u16_m128i

Lanewise saturating a + b with lanes as u16.

add_saturating_u8_m128i

Lanewise saturating a + b with lanes as u8.
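As a rough scalar model of the difference between the plain and the saturating adds (plain arrays stand in for the sixteen i8 lanes; both function names are hypothetical), the plain add wraps around on overflow while the saturating add clamps to the lane type's range:

```rust
/// Scalar model of a lanewise wrapping add with lanes as i8.
fn add_i8_model(a: [i8; 16], b: [i8; 16]) -> [i8; 16] {
    let mut out = [0i8; 16];
    for i in 0..16 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

/// Scalar model of a lanewise saturating add with lanes as i8.
/// Results are clamped to i8::MIN..=i8::MAX instead of wrapping.
fn add_saturating_i8_model(a: [i8; 16], b: [i8; 16]) -> [i8; 16] {
    let mut out = [0i8; 16];
    for i in 0..16 {
        out[i] = a[i].saturating_add(b[i]);
    }
    out
}
```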

and_m128

Bitwise a & b.

and_m128d

Bitwise a & b.

and_m128i

Bitwise a & b.

andnot_m128

Bitwise (!a) & b.

andnot_m128d

Bitwise (!a) & b.

andnot_m128i

Bitwise (!a) & b.

average_u16_m128i

Lanewise average of a and b with lanes as u16.

average_u8_m128i

Lanewise average of a and b with lanes as u8.
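The average functions presumably map to the SSE2 unsigned average instructions, which compute (a + b + 1) >> 1 in a wider intermediate so the sum cannot overflow. A minimal scalar sketch under that assumption (average_u8_model is a hypothetical name):

```rust
/// Scalar model of a lanewise rounded average with lanes as u8.
/// The sum is formed in u16 so it cannot overflow, then halved,
/// with the +1 making ties round up.
fn average_u8_model(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = ((a[i] as u16 + b[i] as u16 + 1) >> 1) as u8;
    }
    out
}
```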

cast_to_m128_from_m128d

Bit-preserving cast to m128 from m128d

cast_to_m128_from_m128i

Bit-preserving cast to m128 from m128i

cast_to_m128d_from_m128

Bit-preserving cast to m128d from m128

cast_to_m128d_from_m128i

Bit-preserving cast to m128d from m128i

cast_to_m128i_from_m128d

Bit-preserving cast to m128i from m128d

cast_to_m128i_from_m128

Bit-preserving cast to m128i from m128
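To make the cast versus convert distinction concrete, here is a single-lane scalar sketch. Both function names are hypothetical; the real functions operate on whole registers.

```rust
/// Scalar model of a `convert`: the numeric value is kept,
/// rounded to the nearest representable f32 if needed.
fn convert_i32_to_f32_model(x: i32) -> f32 {
    x as f32
}

/// Scalar model of a `cast`: the 32 bits are kept exactly and
/// simply reinterpreted as an f32.
fn cast_i32_to_f32_model(x: i32) -> f32 {
    f32::from_bits(x as u32)
}
```

For instance, converting the integer 1 gives 1.0, while casting it gives a tiny subnormal float whose bit pattern is 0x0000_0001.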

cmp_eq_i32_m128_s

Low lane equality.

cmp_eq_i32_m128d_s

Low lane f64 equal to.

cmp_eq_mask_i16_m128i

Lanewise a == b with lanes as i16.

cmp_eq_mask_i32_m128i

Lanewise a == b with lanes as i32.

cmp_eq_mask_i8_m128i

Lanewise a == b with lanes as i8.

cmp_eq_mask_m128

Lanewise a == b.

cmp_eq_mask_m128_s

Low lane a == b, other lanes unchanged.

cmp_eq_mask_m128d

Lanewise a == b, mask output.

cmp_eq_mask_m128d_s

Low lane a == b, other lane unchanged.

cmp_ge_i32_m128_s

Low lane greater than or equal to.

cmp_ge_i32_m128d_s

Low lane f64 greater than or equal to.

cmp_ge_mask_m128

Lanewise a >= b.

cmp_ge_mask_m128_s

Low lane a >= b, other lanes unchanged.

cmp_ge_mask_m128d

Lanewise a >= b.

cmp_ge_mask_m128d_s

Low lane a >= b, other lane unchanged.

cmp_gt_i32_m128_s

Low lane greater than.

cmp_gt_i32_m128d_s

Low lane f64 greater than.

cmp_gt_mask_i16_m128i

Lanewise a > b with lanes as i16.

cmp_gt_mask_i32_m128i

Lanewise a > b with lanes as i32.

cmp_gt_mask_i8_m128i

Lanewise a > b with lanes as i8.

cmp_gt_mask_m128

Lanewise a > b.

cmp_gt_mask_m128_s

Low lane a > b, other lanes unchanged.

cmp_gt_mask_m128d

Lanewise a > b.

cmp_gt_mask_m128d_s

Low lane a > b, other lane unchanged.

cmp_le_i32_m128_s

Low lane less than or equal to.

cmp_le_i32_m128d_s

Low lane f64 less than or equal to.

cmp_le_mask_m128

Lanewise a <= b.

cmp_le_mask_m128_s

Low lane a <= b, other lanes unchanged.

cmp_le_mask_m128d

Lanewise a <= b.

cmp_le_mask_m128d_s

Low lane a <= b, other lane unchanged.

cmp_lt_i32_m128_s

Low lane less than.

cmp_lt_i32_m128d_s

Low lane f64 less than.

cmp_lt_mask_i16_m128i

Lanewise a < b with lanes as i16.

cmp_lt_mask_i32_m128i

Lanewise a < b with lanes as i32.

cmp_lt_mask_i8_m128i

Lanewise a < b with lanes as i8.

cmp_lt_mask_m128

Lanewise a < b.

cmp_lt_mask_m128_s

Low lane a < b, other lanes unchanged.

cmp_lt_mask_m128d

Lanewise a < b.

cmp_lt_mask_m128d_s

Low lane a < b, other lane unchanged.

cmp_neq_i32_m128_s

Low lane not equal to.

cmp_neq_i32_m128d_s

Low lane f64 not equal to.

cmp_neq_mask_m128

Lanewise a != b.

cmp_neq_mask_m128_s

Low lane a != b, other lanes unchanged.

cmp_neq_mask_m128d

Lanewise a != b.

cmp_neq_mask_m128d_s

Low lane a != b, other lane unchanged.

cmp_nge_mask_m128

Lanewise !(a >= b).

cmp_nge_mask_m128_s

Low lane !(a >= b), other lanes unchanged.

cmp_nge_mask_m128d

Lanewise !(a >= b).

cmp_nge_mask_m128d_s

Low lane !(a >= b), other lane unchanged.

cmp_ngt_mask_m128

Lanewise !(a > b).

cmp_ngt_mask_m128_s

Low lane !(a > b), other lanes unchanged.

cmp_ngt_mask_m128d

Lanewise !(a > b).

cmp_ngt_mask_m128d_s

Low lane !(a > b), other lane unchanged.

cmp_nle_mask_m128

Lanewise !(a <= b).

cmp_nle_mask_m128_s

Low lane !(a <= b), other lanes unchanged.

cmp_nle_mask_m128d

Lanewise !(a <= b).

cmp_nle_mask_m128d_s

Low lane !(a <= b), other lane unchanged.

cmp_nlt_mask_m128

Lanewise !(a < b).

cmp_nlt_mask_m128_s

Low lane !(a < b), other lanes unchanged.

cmp_nlt_mask_m128d

Lanewise !(a < b).

cmp_nlt_mask_m128d_s

Low lane !(a < b), other lane unchanged.
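The negated comparisons are not redundant with their "opposite" comparison once NaN is involved. A single-lane scalar sketch (hypothetical names) shows that !(a >= b) and a < b agree for ordinary numbers but differ when either input is NaN, because every ordered comparison against NaN is false:

```rust
/// Scalar model of one lane of `a < b`, producing a mask value.
fn cmp_lt_mask_lane_model(a: f32, b: f32) -> u32 {
    if a < b { u32::MAX } else { 0 }
}

/// Scalar model of one lane of `!(a >= b)`, producing a mask value.
/// With a NaN input, `a >= b` is false, so the negation is true.
fn cmp_nge_mask_lane_model(a: f32, b: f32) -> u32 {
    if !(a >= b) { u32::MAX } else { 0 }
}
```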

cmp_ord_mask_m128

Lanewise (!a.is_nan()) & (!b.is_nan()).

cmp_ord_mask_m128_s

Low lane (!a.is_nan()) & (!b.is_nan()), other lanes unchanged.

cmp_ord_mask_m128d

Lanewise (!a.is_nan()) & (!b.is_nan()).

cmp_ord_mask_m128d_s

Low lane (!a.is_nan()) & (!b.is_nan()), other lane unchanged.

cmp_unord_mask_m128

Lanewise a.is_nan() | b.is_nan().

cmp_unord_mask_m128_s

Low lane a.is_nan() | b.is_nan(), other lanes unchanged.

cmp_unord_mask_m128d

Lanewise a.is_nan() | b.is_nan().

cmp_unord_mask_m128d_s

Low lane a.is_nan() | b.is_nan(), other lane unchanged.

convert_i32_replace_m128_s

Convert i32 to f32 and replace the low lane of the input.

convert_i32_replace_m128d_s

Convert i32 to f64 and replace the low lane of the input.

convert_i64_replace_m128d_s

Convert i64 to f64 and replace the low lane of the input.

convert_m128_s_replace_m128d_s

Converts the low f32 to f64 and replaces the low lane of the input.

convert_m128d_s_replace_m128_s

Converts the low f64 to f32 and replaces the low lane of the input.

convert_to_m128_from_m128i

Rounds the four i32 lanes to four f32 lanes.

convert_to_m128_from_m128d

Rounds the two f64 lanes to the low two f32 lanes.

convert_to_m128d_from_m128i

Rounds the lower two i32 lanes to two f64 lanes.

convert_to_m128d_from_m128

Converts the lower two f32 lanes to two f64 lanes.

convert_to_m128i_from_m128d

Rounds the two f64 lanes to the low two i32 lanes.

convert_to_m128i_from_m128

Rounds the four f32 lanes to four i32 lanes.

copy_i64_m128i_s

Copy the low i64 lane to a new register, upper bits 0.

copy_replace_low_f64_m128d

Copies the a value and replaces the low lane with the low b value.

div_m128

Lanewise a / b.

div_m128_s

Low lane a / b, other lanes unchanged.

div_m128d

Lanewise a / b.

div_m128d_s

Lowest lane a / b, high lane unchanged.

get_f32_from_m128_s

Gets the low lane as an individual f32 value.

get_f64_from_m128d_s

Gets the lower lane as an f64 value.

get_i32_from_m128_s

Converts the low lane to i32 and extracts as an individual value.

get_i32_from_m128d_s

Converts the lower lane to an i32 value.

get_i32_from_m128i_s

Converts the lower lane to an i32 value.

get_i64_from_m128d_s

Converts the lower lane to an i64 value.

get_i64_from_m128i_s

Converts the lower lane to an i64 value.

load_f32_m128_s

Loads the reference into the low lane of the register.

load_f64_m128d_s

Loads the reference into the low lane of the register.

load_i64_m128i_s

Loads the low i64 into a register.

load_m128

Loads the reference into a register.

load_m128d

Loads the reference into a register.

load_m128i

Loads the reference into a register.

load_replace_high_m128d

Loads the reference into a register, replacing the high lane.

load_replace_low_m128d

Loads the reference into a register, replacing the low lane.

load_reverse_m128

Loads the reference into a register with reversed order.

load_reverse_m128d

Loads the reference into a register with reversed order.

load_splat_m128

Loads the reference into all lanes of a register.

load_splat_m128d

Loads the reference into all lanes of a register.

load_unaligned_m128

Loads the reference into a register.

load_unaligned_m128d

Loads the reference into a register.

load_unaligned_m128i

Loads the reference into a register.

max_i16_m128i

Lanewise max(a, b) with lanes as i16.

max_m128

Lanewise max(a, b).

max_m128_s

Low lane max(a, b), other lanes unchanged.

max_m128d

Lanewise max(a, b).

max_m128d_s

Low lane max(a, b), other lane unchanged.

max_u8_m128i

Lanewise max(a, b) with lanes as u8.

min_i16_m128i

Lanewise min(a, b) with lanes as i16.

min_m128

Lanewise min(a, b).

min_m128_s

Low lane min(a, b), other lanes unchanged.

min_m128d

Lanewise min(a, b).

min_m128d_s

Low lane min(a, b), other lane unchanged.

min_u8_m128i

Lanewise min(a, b) with lanes as u8.

move_high_low_m128

Move the high lanes of b to the low lanes of a, other lanes unchanged.

move_low_high_m128

Move the low lanes of b to the high lanes of a, other lanes unchanged.

move_m128_s

Move the low lane of b to a, other lanes unchanged.

move_mask_i8_m128i

Gathers the i8 sign bit of each lane.

move_mask_m128

Gathers the sign bit of each lane.

move_mask_m128d

Gathers the sign bit of each lane.
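A scalar sketch of the move_mask idea for four f32 lanes, assuming the usual SSE layout in which lane i contributes bit i of the result (move_mask_m128_model is a hypothetical name). Since a "true" mask lane is all 1s, its sign bit is set, so this is a common way to turn a comparison mask into an ordinary integer.

```rust
/// Scalar model of gathering the sign bit of each f32 lane.
/// Bit `i` of the result is the sign bit of lane `i`.
fn move_mask_m128_model(a: [f32; 4]) -> i32 {
    let mut out = 0i32;
    for i in 0..4 {
        let sign = (a[i].to_bits() >> 31) as i32;
        out |= sign << i;
    }
    out
}
```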

mul_i16_horizontal_add_m128i

Multiply i16 lanes producing i32 values, horizontal add pairs of i32 values to produce the final output.
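The multiply-then-horizontal-add step can be sketched in scalar form as follows. This assumes the usual SSE2 behaviour where adjacent pairs of i32 products are summed; mul_i16_horizontal_add_model is a hypothetical name.

```rust
/// Scalar model: multiply eight i16 lanes into i32 products, then add
/// each adjacent pair of products to give four i32 lanes.
fn mul_i16_horizontal_add_model(a: [i16; 8], b: [i16; 8]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        let lo = a[2 * i] as i32 * b[2 * i] as i32;
        let hi = a[2 * i + 1] as i32 * b[2 * i + 1] as i32;
        // The pair sum can only overflow when all four inputs are
        // i16::MIN, so a wrapping add is used to model that case.
        out[i] = lo.wrapping_add(hi);
    }
    out
}
```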

mul_i16_keep_high_m128i

Lanewise a * b with lanes as i16, keep the high bits of the i32 intermediates.

mul_i16_keep_low_m128i

Lanewise a * b with lanes as i16, keep the low bits of the i32 intermediates.
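A single-lane scalar sketch of the keep_high and keep_low variants (hypothetical names): the full i32 product is formed, and then either its upper or its lower 16 bits are kept.

```rust
/// Scalar model of one lane: low 16 bits of the i32 product.
fn mul_i16_keep_low_lane_model(a: i16, b: i16) -> i16 {
    (a as i32 * b as i32) as i16
}

/// Scalar model of one lane: high 16 bits of the i32 product.
fn mul_i16_keep_high_lane_model(a: i16, b: i16) -> i16 {
    ((a as i32 * b as i32) >> 16) as i16
}
```

For example, 300 * 300 = 90000 = 0x1_5F90, so the low half is 0x5F90 and the high half is 1.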

mul_m128

Lanewise a * b.

mul_m128_s

Low lane a * b, other lanes unchanged.

mul_m128d

Lanewise a * b.

mul_m128d_s

Lowest lane a * b, high lane unchanged.

mul_u16_keep_high_m128i

Lanewise a * b with lanes as u16, keep the high bits of the u32 intermediates.

mul_u64_widen_low_bits_m128i

Multiplies the lower 32 bits (only) of each u64 lane into 64-bit u64 values.

or_m128

Bitwise a | b.

or_m128d

Bitwise a | b.

or_m128i

Bitwise a | b.

pack_i16_to_i8_m128i

Saturating convert i16 to i8, with the values from a in the low half and the values from b in the high half.

pack_i16_to_u8_m128i

Saturating convert i16 to u8, with the values from a in the low half and the values from b in the high half.

pack_i32_to_i16_m128i

Saturating convert i32 to i16, with the values from a in the low half and the values from b in the high half.
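A scalar sketch of the pack operation for the i16 to i8 case, assuming the usual SSE2 pack layout in which the narrowed lanes of a fill the low half of the output and the narrowed lanes of b fill the high half (pack_i16_to_i8_model is a hypothetical name):

```rust
/// Scalar model of a saturating pack from two sets of eight i16 lanes
/// into sixteen i8 lanes. Out-of-range values clamp to i8's range.
fn pack_i16_to_i8_model(a: [i16; 8], b: [i16; 8]) -> [i8; 16] {
    let mut out = [0i8; 16];
    for i in 0..8 {
        out[i] = a[i].clamp(-128, 127) as i8; // low half comes from a
        out[i + 8] = b[i].clamp(-128, 127) as i8; // high half comes from b
    }
    out
}
```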

reciprocal_m128

Lanewise 1.0 / a approximation.

reciprocal_m128_s

Low lane 1.0 / a approximation, other lanes unchanged.

reciprocal_sqrt_m128

Lanewise 1.0 / sqrt(a) approximation.

reciprocal_sqrt_m128_s

Low lane 1.0 / sqrt(a) approximation, other lanes unchanged.

set_i16_m128i

Sets the args into an m128i, first arg is the high lane.

set_i32_m128i_s

Set an i32 as the low 32-bit lane of an m128i, other lanes blank.

set_i32_m128i

Sets the args into an m128i, first arg is the high lane.

set_i64_m128i_s

Set an i64 as the low 64-bit lane of an m128i, other lanes blank.

set_i64_m128i

Sets the args into an m128i, first arg is the high lane.

set_i8_m128i

Sets the args into an m128i, first arg is the high lane.

set_m128

Sets the args into an m128, first arg is the high lane.

set_m128_s

Sets the arg into the low lane of an m128, other lanes zero.

set_m128d

Sets the args into an m128d, first arg is the high lane.

set_m128d_s

Sets the arg into the low lane of an m128d.

set_reversed_i16_m128i

Sets the args into an m128i, first arg is the low lane.

set_reversed_i32_m128i

Sets the args into an m128i, first arg is the low lane.

set_reversed_i8_m128i

Sets the args into an m128i, first arg is the low lane.

set_reversed_m128

Sets the args into an m128, first arg is the low lane.

set_reversed_m128d

Sets the args into an m128d, first arg is the low lane.
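The argument order of set versus set_reversed is easy to mix up, so here is a scalar sketch where array index 0 stands for the low lane. The function and parameter names are hypothetical.

```rust
/// Scalar model of `set`: the first argument lands in the high lane.
/// The returned array is indexed from the low lane (index 0) upward.
fn set_m128_model(e3: f32, e2: f32, e1: f32, e0: f32) -> [f32; 4] {
    [e0, e1, e2, e3]
}

/// Scalar model of `set_reversed`: the first argument lands in the low lane.
fn set_reversed_m128_model(e0: f32, e1: f32, e2: f32, e3: f32) -> [f32; 4] {
    [e0, e1, e2, e3]
}
```

With this model, set(4.0, 3.0, 2.0, 1.0) and set_reversed(1.0, 2.0, 3.0, 4.0) describe the same register contents.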

shift_left_i16_m128i

Shift each i16 lane to the left by the count in the lower i64 lane.

shift_left_i32_m128i

Shift each i32 lane to the left by the count in the lower i64 lane.

shift_left_i64_m128i

Shift each i64 lane to the left by the count in the lower i64 lane.

shift_right_i16_m128i

Shift each i16 lane to the right by the count in the lower i64 lane.

shift_right_i32_m128i

Shift each i32 lane to the right by the count in the lower i64 lane.

shift_right_u16_m128i

Shift each u16 lane to the right by the count in the lower i64 lane.

shift_right_u32_m128i

Shift each u32 lane to the right by the count in the lower i64 lane.

shift_right_u64_m128i

Shift each u64 lane to the right by the count in the lower i64 lane.
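The i and u variants of the right shifts differ in what fills the vacated bits. A single-lane scalar sketch for 16-bit lanes, assuming the usual SSE2 behaviour for oversized counts (names are hypothetical):

```rust
/// Scalar model of an arithmetic right shift of one i16 lane.
/// Sign bits are shifted in; counts above 15 behave like a shift by 15.
fn shift_right_i16_lane_model(x: i16, count: u64) -> i16 {
    let c = count.min(15) as u32;
    x >> c
}

/// Scalar model of a logical right shift of one u16 lane.
/// Zero bits are shifted in; counts above 15 give zero.
fn shift_right_u16_lane_model(x: u16, count: u64) -> u16 {
    if count > 15 { 0 } else { x >> (count as u32) }
}
```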

splat_i16_m128i

Splats the i16 to all lanes of the m128i.

splat_i32_m128i

Splats the i32 to all lanes of the m128i.

splat_i64_m128i

Splats the i64 to both lanes of the m128i.

splat_i8_m128i

Splats the i8 to all lanes of the m128i.

splat_m128

Splats the value to all lanes.

splat_m128d

Splats the value into both lanes of the m128d.

sqrt_m128

Lanewise sqrt(a).

sqrt_m128_s

Low lane sqrt(a), other lanes unchanged.

sqrt_m128d

Lanewise sqrt(a).

sqrt_m128d_s

Low lane sqrt(b), upper lane is unchanged from a.

store_high_m128d_s

Stores the high lane value to the reference given.

store_i64_m128i_s

Stores the value to the reference given.

store_m128

Stores the value to the reference given.

store_m128_s

Stores the low lane value to the reference given.

store_m128d

Stores the value to the reference given.

store_m128d_s

Stores the low lane value to the reference given.

store_m128i

Stores the value to the reference given.

store_reverse_m128

Stores the value to the reference given in reverse order.

store_reversed_m128d

Stores the value to the reference given in reverse order.

store_splat_m128

Stores the low lane value to all lanes of the reference given.

store_splat_m128d

Stores the low lane value to all lanes of the reference given.

store_unaligned_m128

Stores the value to the reference given.

store_unaligned_m128d

Stores the value to the reference given.

store_unaligned_m128i

Stores the value to the reference given.

sub_i16_m128i

Lanewise a - b with lanes as i16.

sub_i32_m128i

Lanewise a - b with lanes as i32.

sub_i64_m128i

Lanewise a - b with lanes as i64.

sub_i8_m128i

Lanewise a - b with lanes as i8.

sub_m128

Lanewise a - b.

sub_m128_s

Low lane a - b, other lanes unchanged.

sub_m128d

Lanewise a - b.

sub_m128d_s

Lowest lane a - b, high lane unchanged.

sub_saturating_i16_m128i

Lanewise saturating a - b with lanes as i16.

sub_saturating_i8_m128i

Lanewise saturating a - b with lanes as i8.

sub_saturating_u16_m128i

Lanewise saturating a - b with lanes as u16.

sub_saturating_u8_m128i

Lanewise saturating a - b with lanes as u8.

sum_of_u8_abs_diff_m128i

Compute "sum of u8 absolute differences".
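"Sum of u8 absolute differences" is terse, so here is a scalar sketch of what the underlying SSE2 operation is understood to do: the sixteen u8 lanes are split into two groups of eight, and each group's absolute differences are summed into the corresponding 64-bit lane. The name sum_of_u8_abs_diff_model is hypothetical.

```rust
/// Scalar model: for each half of the register (eight u8 lanes),
/// sum |a[i] - b[i]| and place the total in that half's 64-bit lane.
fn sum_of_u8_abs_diff_model(a: [u8; 16], b: [u8; 16]) -> [u64; 2] {
    let mut out = [0u64; 2];
    for i in 0..16 {
        let diff = (a[i] as i16 - b[i] as i16).abs() as u64;
        out[i / 8] += diff;
    }
    out
}
```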

transpose_four_m128

Transpose four m128 as if they were a 4x4 matrix.
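As a scalar picture of the transpose, treat each of the four registers as one row of a 4x4 matrix and swap across the diagonal. This sketch uses a nested array and a hypothetical function name; the module's own function works on four registers instead.

```rust
/// Scalar model of transposing four rows of four f32 lanes in place.
fn transpose_four_model(rows: &mut [[f32; 4]; 4]) {
    for r in 0..4 {
        for c in (r + 1)..4 {
            let tmp = rows[r][c];
            rows[r][c] = rows[c][r];
            rows[c][r] = tmp;
        }
    }
}
```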

truncate_m128_to_m128i

Truncate the f32 lanes to i32 lanes.

truncate_m128d_to_m128i

Truncate the f64 lanes to the lower i32 lanes (upper i32 lanes 0).

truncate_to_i32_m128d_s

Truncate the lower lane into an i32.

truncate_to_i64_m128d_s

Truncate the lower lane into an i64.
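Truncation drops the fractional part (rounds toward zero). A single-lane scalar sketch for the f64 to i32 case follows; it assumes the usual x86 behaviour where NaN and out-of-range inputs produce i32::MIN, which differs from Rust's saturating `as` cast. The name truncate_to_i32_model is hypothetical.

```rust
/// Scalar model of truncating one f64 lane to i32.
/// In-range values round toward zero; NaN and out-of-range values
/// give i32::MIN (the x86 "integer indefinite" value).
fn truncate_to_i32_model(x: f64) -> i32 {
    let t = x.trunc();
    if t >= -2147483648.0 && t <= 2147483647.0 {
        t as i32
    } else {
        i32::MIN
    }
}
```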

unpack_high_i16_m128i

Unpack and interleave high i16 lanes of a and b.

unpack_high_i32_m128i

Unpack and interleave high i32 lanes of a and b.

unpack_high_i64_m128i

Unpack and interleave high i64 lanes of a and b.

unpack_high_i8_m128i

Unpack and interleave high i8 lanes of a and b.

unpack_high_m128

Unpack and interleave high lanes of a and b.

unpack_high_m128d

Unpack and interleave high lanes of a and b.

unpack_low_i16_m128i

Unpack and interleave low i16 lanes of a and b.

unpack_low_i32_m128i

Unpack and interleave low i32 lanes of a and b.

unpack_low_i64_m128i

Unpack and interleave low i64 lanes of a and b.

unpack_low_i8_m128i

Unpack and interleave low i8 lanes of a and b.

unpack_low_m128

Unpack and interleave low lanes of a and b.

unpack_low_m128d

Unpack and interleave low lanes of a and b.
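"Unpack and interleave" can be sketched for four i32 lanes like this, with array index 0 as the low lane and assuming the usual SSE2 ordering where the lane from a comes first in each pair (function names are hypothetical):

```rust
/// Scalar model: interleave the two low lanes of `a` and `b`.
fn unpack_low_i32_model(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [a[0], b[0], a[1], b[1]]
}

/// Scalar model: interleave the two high lanes of `a` and `b`.
fn unpack_high_i32_model(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [a[2], b[2], a[3], b[3]]
}
```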

xor_m128

Bitwise a ^ b.

xor_m128d

Bitwise a ^ b.

xor_m128i

Bitwise a ^ b.

zeroed_m128

All lanes zero.

zeroed_m128i

All lanes zero.

zeroed_m128d

Both lanes zero.