Module typed_arch::x86_64

x86_64 intrinsics

Structs

CpuidResult	[ Experimental ] Result of the `cpuid` instruction.
Round
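
As an illustration of how a `cpuid` result is typically consumed, here is a minimal sketch. It assumes the standard library's `std::arch::x86_64::__cpuid` and its `CpuidResult` fields (`eax`, `ebx`, `ecx`, `edx`) rather than this crate's own items, whose exact signatures are not shown on this page. The helper name `cpu_vendor` is hypothetical.

```rust
/// Reads the 12-byte vendor string from `cpuid` leaf 0.
/// Assumption: uses `std::arch::x86_64`, not `typed_arch`.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
pub fn cpu_vendor() -> Option<String> {
    use std::arch::x86_64::__cpuid;
    // Leaf 0 returns the vendor string in EBX, EDX, ECX (in that order).
    let r = unsafe { __cpuid(0) };
    let mut bytes = Vec::with_capacity(12);
    for reg in [r.ebx, r.edx, r.ecx] {
        bytes.extend_from_slice(&reg.to_le_bytes());
    }
    Some(String::from_utf8_lossy(&bytes).into_owned())
}

/// Fallback for other architectures, where `cpuid` does not exist.
#[cfg(not(target_arch = "x86_64"))]
pub fn cpu_vendor() -> Option<String> {
    None
}
```

Calling `cpu_vendor()` yields `Some(..)` on x86_64 hosts and `None` elsewhere.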

Functions

__cpuid ^⚠	[ Experimental ] See `__cpuid_count`.
__cpuid_count ^⚠	[ Experimental ] Returns the result of the `cpuid` instruction for a given `leaf` (`EAX`) and `sub_leaf` (`ECX`).
__get_cpuid_max ^⚠	[ Experimental ] Returns the highest-supported `leaf` (`EAX`) and sub-leaf (`ECX`) `cpuid` values.
__readeflags ^⚠	[ Experimental ] Reads EFLAGS.
__writeeflags ^⚠	[ Experimental ] Writes EFLAGS.
_bswap ^⚠	[ Experimental ] Return an integer with the byte order of `x` reversed.
_bswap64 ^⚠	[ Experimental ] Return an integer with the byte order of `x` reversed.
_lzcnt_u64 ^⚠	[ Experimental ] Counts the leading most significant zero bits.
_mm256_add_pd ^⚠	Add
_mm256_blendv_epi8 ^⚠	Blend packed 8-bit integers from `a` and `b` using `mask`.
_mm256_insert_epi64 ^⚠	Copy `a` to result, and insert the 64-bit integer `i` into result at the location specified by `index`.
_mm256_sqrt_pd ^⚠	Square root
_mm_add_epi8 ^⚠	Add
_mm_add_epi16 ^⚠	Add
_mm_add_epi32 ^⚠	Add
_mm_add_epi64 ^⚠	Add
_mm_add_pd ^⚠	Add packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm_add_sd ^⚠	Return a new vector with the low element of `a` replaced by the sum of the low elements of `a` and `b`.
_mm_add_si64 ^⚠	[ Experimental ] Adds two signed or unsigned 64-bit integer values, returning the lower 64 bits of the sum.
_mm_adds_epi8 ^⚠	Saturated add
_mm_adds_epi16 ^⚠	Saturated add
_mm_adds_epu8 ^⚠	Saturated add
_mm_adds_epu16 ^⚠	Saturated add
_mm_addsub_pd ^⚠	Alternately add and subtract packed double-precision (64-bit) floating-point elements in `a` to/from packed elements in `b`.
_mm_addsub_ps ^⚠	Alternately add and subtract packed single-precision (32-bit) floating-point elements in `a` to/from packed elements in `b`.
_mm_and_pd ^⚠	Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm_and_ps ^⚠	Bitwise AND of packed single-precision (32-bit) floating-point elements.
_mm_and_si128 ^⚠	[ Experimental ] Compute the bitwise AND of 128 bits (representing integer data) in `a` and `b`.
_mm_andnot_pd ^⚠	Compute the bitwise NOT of `a` and then AND with `b`.
_mm_andnot_si128 ^⚠	[ Experimental ] Compute the bitwise NOT of 128 bits (representing integer data) in `a` and then AND with `b`.
_mm_avg_epu8 ^⚠	Average
_mm_avg_epu16 ^⚠	Average
_mm_blendv_epi8 ^⚠	Blend packed 8-bit integers from `a` and `b` using `mask`.
_mm_bslli_si128 ^⚠	[ Experimental ] Shift `a` left by `imm8` bytes while shifting in zeros.
_mm_bsrli_si128 ^⚠	[ Experimental ] Shift `a` right by `imm8` bytes while shifting in zeros.
_mm_castpd_ps ^⚠	Casts a 128-bit floating-point vector of [2 x double] into a 128-bit floating-point vector of [4 x float].
_mm_castpd_si128 ^⚠	Casts a 128-bit floating-point vector of [2 x double] into a 128-bit integer vector.
_mm_castps_pd ^⚠	Casts a 128-bit floating-point vector of [4 x float] into a 128-bit floating-point vector of [2 x double].
_mm_castps_si128 ^⚠	Casts a 128-bit floating-point vector of [4 x float] into a 128-bit integer vector.
_mm_castsi128_pd ^⚠	Casts a 128-bit integer vector into a 128-bit floating-point vector of [2 x double].
_mm_castsi128_ps ^⚠	Casts a 128-bit integer vector into a 128-bit floating-point vector of [4 x float].
_mm_ceil_ps ^⚠	Round the elements in `a` up to an integer value.
_mm_clflush ^⚠	[ Experimental ] Invalidate and flush the cache line that contains `p` from all levels of the cache hierarchy.
_mm_cmpeq_epi8 ^⚠	Equal
_mm_cmpeq_epi16 ^⚠	Equal
_mm_cmpeq_epi32 ^⚠	Equal
_mm_cmpeq_pd ^⚠	Compare corresponding elements in `a` and `b` for equality.
_mm_cmpeq_sd ^⚠	Return a new vector with the low element of `a` replaced by the equality comparison of the lower elements of `a` and `b`.
_mm_cmpge_pd ^⚠	Compare corresponding elements in `a` and `b` for greater-than-or-equal.
_mm_cmpge_sd ^⚠	Return a new vector with the low element of `a` replaced by the greater-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmpgt_epi8 ^⚠	Greater-than
_mm_cmpgt_epi16 ^⚠	Greater-than
_mm_cmpgt_epi32 ^⚠	Greater-than
_mm_cmpgt_pd ^⚠	Compare corresponding elements in `a` and `b` for greater-than.
_mm_cmpgt_sd ^⚠	Return a new vector with the low element of `a` replaced by the greater-than comparison of the lower elements of `a` and `b`.
_mm_cmple_pd ^⚠	Compare corresponding elements in `a` and `b` for less-than-or-equal.
_mm_cmple_sd ^⚠	Return a new vector with the low element of `a` replaced by the less-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmplt_epi8 ^⚠	Less-than
_mm_cmplt_epi16 ^⚠	Less-than
_mm_cmplt_epi32 ^⚠	Less-than
_mm_cmplt_pd ^⚠	Compare corresponding elements in `a` and `b` for less-than.
_mm_cmplt_sd ^⚠	Return a new vector with the low element of `a` replaced by the less-than comparison of the lower elements of `a` and `b`.
_mm_cmpneq_pd ^⚠	Compare corresponding elements in `a` and `b` for not-equal.
_mm_cmpneq_sd ^⚠	Return a new vector with the low element of `a` replaced by the not-equal comparison of the lower elements of `a` and `b`.
_mm_cmpnge_pd ^⚠	Compare corresponding elements in `a` and `b` for not-greater-than-or-equal.
_mm_cmpnge_sd ^⚠	Return a new vector with the low element of `a` replaced by the not-greater-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmpngt_pd ^⚠	Compare corresponding elements in `a` and `b` for not-greater-than.
_mm_cmpngt_sd ^⚠	Return a new vector with the low element of `a` replaced by the not-greater-than comparison of the lower elements of `a` and `b`.
_mm_cmpnle_pd ^⚠	Compare corresponding elements in `a` and `b` for not-less-than-or-equal.
_mm_cmpnle_sd ^⚠	Return a new vector with the low element of `a` replaced by the not-less-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmpnlt_pd ^⚠	Compare corresponding elements in `a` and `b` for not-less-than.
_mm_cmpnlt_sd ^⚠	Return a new vector with the low element of `a` replaced by the not-less-than comparison of the lower elements of `a` and `b`.
_mm_cmpord_pd ^⚠	Compare corresponding elements in `a` and `b` to see if neither is `NaN`.
_mm_cmpord_sd ^⚠	Return a new vector with the low element of `a` replaced by the result of comparing both of the lower elements of `a` and `b` to `NaN`. If neither is equal to `NaN` then `0xFFFFFFFFFFFFFFFF` is used and `0` otherwise.
_mm_cmpunord_pd ^⚠	Compare corresponding elements in `a` and `b` to see if either is `NaN`.
_mm_cmpunord_sd ^⚠	Return a new vector with the low element of `a` replaced by the result of comparing both of the lower elements of `a` and `b` to `NaN`. If either is equal to `NaN` then `0xFFFFFFFFFFFFFFFF` is used and `0` otherwise.
_mm_comieq_sd ^⚠	Compare the lower element of `a` and `b` for equality.
_mm_comige_sd ^⚠	Compare the lower element of `a` and `b` for greater-than-or-equal.
_mm_comigt_sd ^⚠	Compare the lower element of `a` and `b` for greater-than.
_mm_comile_sd ^⚠	Compare the lower element of `a` and `b` for less-than-or-equal.
_mm_comilt_sd ^⚠	Compare the lower element of `a` and `b` for less-than.
_mm_comineq_sd ^⚠	Compare the lower element of `a` and `b` for not-equal.
_mm_cvtepi32_pd ^⚠	Converts lower two packed 32-bit integers in `a` to `f64`s.
_mm_cvtepi32_ps ^⚠	Conversion
_mm_cvtepu8_epi16 ^⚠	Zero extend packed unsigned 8-bit integers in `a` to packed 16-bit integers.
_mm_cvtpd_epi32 ^⚠	Convert packed double-precision (64-bit) floating-point elements in `a` to packed 32-bit integers.
_mm_cvtpd_pi32 ^⚠	Converts the two double-precision floating-point elements of a 128-bit vector of [2 x double] into two signed 32-bit integer values, returned in a 64-bit vector of [2 x i32].
_mm_cvtpd_ps ^⚠	Convert packed double-precision (64-bit) floating-point elements in `a` to packed single-precision (32-bit) floating-point elements.
_mm_cvtpi32_pd ^⚠	Converts the two signed 32-bit integer elements of a 64-bit vector of [2 x i32] into two double-precision floating-point values, returned in a 128-bit vector of [2 x double].
_mm_cvtps_epi32 ^⚠	Conversion
_mm_cvtps_pd ^⚠	Convert packed single-precision (32-bit) floating-point elements in `a` to packed double-precision (64-bit) floating-point elements.
_mm_cvtsd_f64 ^⚠	Return the lower double-precision (64-bit) floating-point element of `a`.
_mm_cvtsd_si32 ^⚠	Convert the lower double-precision (64-bit) floating-point element in `a` to a 32-bit integer.
_mm_cvtsd_si64 ^⚠	Convert the lower double-precision (64-bit) floating-point element in `a` to a 64-bit integer.
_mm_cvtsd_si64x ^⚠	Alias for `_mm_cvtsd_si64`.
_mm_cvtsd_ss ^⚠	Convert the lower double-precision (64-bit) floating-point element in `b` to a single-precision (32-bit) floating-point element, store the result in the lower element of the return value, and copy the upper element from `a` to the upper element of the return value.
_mm_cvtsi128_si32 ^⚠	Extracts lowest element of `a`.
_mm_cvtsi128_si64 ^⚠	Return the lowest element of `a`.
_mm_cvtsi128_si64x ^⚠	Return the lowest element of `a`.
_mm_cvtsi32_sd ^⚠	Replaces lower element of `a` with `b`.
_mm_cvtsi32_si128 ^⚠	Instantiates `[a, 0, 0, 0]`
_mm_cvtsi64_sd ^⚠	Return `a` with its lower element replaced by `b` after converting it to an `f64`.
_mm_cvtsi64_si128 ^⚠	Return a vector whose lowest element is `a` and all higher elements are `0`.
_mm_cvtsi64x_sd ^⚠	Return `a` with its lower element replaced by `b` after converting it to an `f64`.
_mm_cvtsi64x_si128 ^⚠	Return a vector whose lowest element is `a` and all higher elements are `0`.
_mm_cvtss_sd ^⚠	Convert the lower single-precision (32-bit) floating-point element in `b` to a double-precision (64-bit) floating-point element, store the result in the lower element of the return value, and copy the upper element from `a` to the upper element of the return value.
_mm_cvttpd_epi32 ^⚠	Convert packed double-precision (64-bit) floating-point elements in `a` to packed 32-bit integers with truncation.
_mm_cvttpd_pi32 ^⚠	Converts the two double-precision floating-point elements of a 128-bit vector of [2 x double] into two signed 32-bit integer values, returned in a 64-bit vector of [2 x i32]. If the result of either conversion is inexact, the result is truncated (rounded towards zero) regardless of the current MXCSR setting.
_mm_cvttps_epi32 ^⚠	Convert packed single-precision (32-bit) floating-point elements in `a` to packed 32-bit integers with truncation.
_mm_cvttsd_si32 ^⚠	Convert the lower double-precision (64-bit) floating-point element in `a` to a 32-bit integer with truncation.
_mm_cvttsd_si64 ^⚠	Convert the lower double-precision (64-bit) floating-point element in `a` to a 64-bit integer with truncation.
_mm_cvttsd_si64x ^⚠	Alias for `_mm_cvttsd_si64`.
_mm_div_pd ^⚠	Divide packed double-precision (64-bit) floating-point elements in `a` by packed elements in `b`.
_mm_div_sd ^⚠	Return a new vector with the low element of `a` replaced by the result of dividing the lower element of `a` by the lower element of `b`.
_mm_extract_epi16 ^⚠	Return the `i`-th element of `a`.
_mm_floor_ps ^⚠	Round the elements in `a` down to an integer value.
_mm_hadd_pd ^⚠	Horizontally add adjacent pairs of double-precision (64-bit) floating-point elements in `a` and `b`, and pack the results.
_mm_hadd_ps ^⚠	Horizontally add adjacent pairs of single-precision (32-bit) floating-point elements in `a` and `b`, and pack the results.
_mm_hadds_epi16 ^⚠	Horizontally add the adjacent pairs of values contained in 2 packed 128-bit vectors of [8 x i16]. Positive sums greater than 7FFFh are saturated to 7FFFh. Negative sums less than 8000h are saturated to 8000h.
_mm_hsub_pd ^⚠	Horizontally subtract adjacent pairs of double-precision (64-bit) floating-point elements in `a` and `b`, and pack the results.
_mm_hsub_ps ^⚠	Horizontally subtract adjacent pairs of single-precision (32-bit) floating-point elements in `a` and `b`, and pack the results.
_mm_hsubs_epi16 ^⚠	Horizontally subtract the adjacent pairs of values contained in 2 packed 128-bit vectors of [8 x i16]. Positive differences greater than 7FFFh are saturated to 7FFFh. Negative differences less than 8000h are saturated to 8000h.
_mm_insert_epi16 ^⚠	Return a new vector where the `i`-th element of `a` is replaced with `v`.
_mm_lfence ^⚠	[ Experimental ] Perform a serializing operation on all load-from-memory instructions that were issued prior to this instruction.
_mm_load1_pd ^⚠	Load a double-precision (64-bit) floating-point element from memory into both elements of returned vector.
_mm_load_pd ^⚠	Load 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) from memory into the returned vector. `mem_addr` must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_load_pd1 ^⚠	Load a double-precision (64-bit) floating-point element from memory into both elements of returned vector.
_mm_load_sd ^⚠	Loads a 64-bit double-precision value into the low element of a 128-bit vector of [2 x double] and clears the upper element.
_mm_load_si128 ^⚠	[ Experimental ] Load 128-bits of integer data from memory into a new vector.
_mm_loadh_pd ^⚠	Loads a double-precision value into the high-order bits of a 128-bit vector of [2 x double]. The low-order bits are copied from the low-order bits of the first operand.
_mm_loadl_epi64 ^⚠	Load 64-bit integer from memory into first element of returned vector.
_mm_loadl_pd ^⚠	Loads a double-precision value into the low-order bits of a 128-bit vector of [2 x double]. The high-order bits are copied from the high-order bits of the first operand.
_mm_loadr_pd ^⚠	Load 2 double-precision (64-bit) floating-point elements from memory into the returned vector in reverse order. `mem_addr` must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_loadu_pd ^⚠	Load 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) from memory into the returned vector. `mem_addr` does not need to be aligned on any particular boundary.
_mm_loadu_si128 ^⚠	[ Experimental ] Load 128-bits of integer data from memory into a new vector.
_mm_madd_epi16 ^⚠	Multiply and horizontally add
_mm_maskmoveu_si128 ^⚠	Conditionally store elements from `a` into memory using `mask`.
_mm_max_epi16 ^⚠	Max
_mm_max_epu8 ^⚠	Max
_mm_max_pd ^⚠	Return a new vector with the maximum values from corresponding elements in `a` and `b`.
_mm_max_sd ^⚠	Return a new vector with the low element of `a` replaced by the maximum of the lower elements of `a` and `b`.
_mm_mfence ^⚠	[ Experimental ] Perform a serializing operation on all load-from-memory and store-to-memory instructions that were issued prior to this instruction.
_mm_min_epi16 ^⚠	Min
_mm_min_epu8 ^⚠	Min
_mm_min_pd ^⚠	Return a new vector with the minimum values from corresponding elements in `a` and `b`.
_mm_min_sd ^⚠	Return a new vector with the low element of `a` replaced by the minimum of the lower elements of `a` and `b`.
_mm_move_epi64 ^⚠	Return a vector with the low element copied from `a` and the upper element set to zero.
_mm_move_sd ^⚠	Constructs a 128-bit floating-point vector of [2 x double]. The lower 64 bits are set to the lower 64 bits of the second parameter. The upper 64 bits are set to the upper 64 bits of the first parameter.
_mm_movemask_epi8 ^⚠	Return a mask of the most significant bit of each element in `a`.
_mm_movemask_pd ^⚠	Return a mask of the most significant bit of each element in `a`.
_mm_movepi64_pi64 ^⚠	Returns the lower 64 bits of a 128-bit integer vector as a 64-bit integer.
_mm_movpi64_epi64 ^⚠	Moves the 64-bit operand to a 128-bit integer vector, zeroing the upper bits.
_mm_mul_epu32 ^⚠	Multiply the low unsigned 32-bit integers from each packed 64-bit element.
_mm_mul_pd ^⚠	Multiply packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm_mul_sd ^⚠	Return a new vector with the low element of `a` replaced by multiplying the low elements of `a` and `b`.
_mm_mul_su32 ^⚠	Multiplies 32-bit unsigned integer values contained in the lower bits of the two 64-bit integer vectors and returns the 64-bit unsigned product.
_mm_mulhi_epi16 ^⚠	Multiply returning high 16 bits of the result.
_mm_mulhi_epu16 ^⚠	Multiply returning high 16 bits of the result.
_mm_mullo_epi16 ^⚠	Multiply returning low 16 bits of the result.
_mm_or_pd ^⚠	Compute the bitwise OR of `a` and `b`.
_mm_or_si128 ^⚠	[ Experimental ] Compute the bitwise OR of 128 bits (representing integer data) in `a` and `b`.
_mm_packs_epi16 ^⚠	Convert elements of `a` and `b` to 8-bit integers using signed saturation.
_mm_packs_epi32 ^⚠	Convert elements of `a` and `b` to 16-bit integers using signed saturation.
_mm_packus_epi16 ^⚠	Convert elements of `a` and `b` to 8-bit integers using unsigned saturation.
_mm_pause ^⚠	[ Experimental ] Provide a hint to the processor that the code sequence is a spin-wait loop.
_mm_rcp_ps ^⚠	Reciprocal (approximate)
_mm_round_ps ^⚠	Round the elements in `a` using the `rounding` parameter.
_mm_rsqrt_ps ^⚠	Reciprocal square root (approximate).
_mm_sad_epu8 ^⚠	Sum absolute differences
_mm_set1_epi8 ^⚠	Broadcast `a` to all elements.
_mm_set1_epi16 ^⚠	Broadcast `a` to all elements.
_mm_set1_epi32 ^⚠	Broadcast `a` to all elements.
_mm_set1_epi64 ^⚠	Initializes both values in a 128-bit vector of [2 x i64] with the specified 64-bit value.
_mm_set1_epi64x ^⚠	Broadcast `a` to all elements.
_mm_set1_pd ^⚠	Broadcast double-precision (64-bit) floating-point value `a` to all elements of the return value.
_mm_set_epi8 ^⚠	Instantiate
_mm_set_epi16 ^⚠	Instantiate
_mm_set_epi32 ^⚠	Instantiate
_mm_set_epi64 ^⚠	Initializes both 64-bit values in a 128-bit vector of [2 x i64] with the specified 64-bit integer values.
_mm_set_epi64x ^⚠	Instantiate
_mm_set_pd ^⚠	Set packed double-precision (64-bit) floating-point elements in the return value with the supplied values.
_mm_set_pd1 ^⚠	Broadcast double-precision (64-bit) floating-point value `a` to all elements of the return value.
_mm_set_sd ^⚠	Copy double-precision (64-bit) floating-point element `a` to the lower element of the packed 64-bit return value.
_mm_setr_epi8 ^⚠	Instantiate with values in reverse order
_mm_setr_epi16 ^⚠	Instantiate with values in reverse order
_mm_setr_epi32 ^⚠	Instantiate with values in reverse order
_mm_setr_epi64 ^⚠	Constructs a 128-bit integer vector, initialized in reverse order with the specified 64-bit integral values.
_mm_setr_pd ^⚠	Set packed double-precision (64-bit) floating-point elements in the return value with the supplied values in reverse order.
_mm_setzero_pd ^⚠	Returns packed double-precision (64-bit) floating-point elements with all zeros.
_mm_setzero_si128 ^⚠	[ Experimental ] Returns a vector with all elements set to zero.
_mm_shuffle_epi8 ^⚠	Shuffle bytes from `a` according to the content of `b`.
_mm_shuffle_epi32 ^⚠	Shuffle `a` using the `control`.
_mm_shuffle_pd ^⚠	Constructs a 128-bit floating-point vector of [2 x double] from two 128-bit vector parameters of [2 x double], using the `control`.
_mm_shuffle_ps ^⚠	Shuffle elements in `a` and `b` using `mask`.
_mm_shufflehi_epi16 ^⚠	Shuffle integers in the high 64 bits of `a` using the `control`.
_mm_shufflelo_epi16 ^⚠	Shuffle integers in the low 64 bits of `a` using the `control`.
_mm_sll_epi16 ^⚠	Left shift (shifting in zeros).
_mm_sll_epi32 ^⚠	Left shift (shifting in zeros).
_mm_sll_epi64 ^⚠	Left shift (shifting in zeros).
_mm_slli_epi16 ^⚠	Left shift by `n` while shifting in zeros.
_mm_slli_epi32 ^⚠	Left shift by `n` while shifting in zeros.
_mm_slli_epi64 ^⚠	Left shift by `n` while shifting in zeros.
_mm_slli_si128 ^⚠	[ Experimental ] Shift `a` left by `imm8` bytes while shifting in zeros.
_mm_sqrt_pd ^⚠	Return a new vector with the square root of each of the values in `a`.
_mm_sqrt_ps ^⚠	Square root.
_mm_sqrt_sd ^⚠	Return a new vector with the low element of `a` replaced by the square root of the lower element of `b`.
_mm_sra_epi16 ^⚠	Right shift (shifting in sign bits).
_mm_sra_epi32 ^⚠	Right shift (shifting in sign bits).
_mm_srai_epi16 ^⚠	Right shift by `n` while shifting in sign bits.
_mm_srai_epi32 ^⚠	Right shift by `n` while shifting in sign bits.
_mm_srl_epi16 ^⚠	Right shift (shifting in zeros).
_mm_srl_epi32 ^⚠	Right shift (shifting in zeros).
_mm_srl_epi64 ^⚠	Right shift (shifting in zeros).
_mm_srli_epi16 ^⚠	Right shift by `n` while shifting in zeros.
_mm_srli_epi32 ^⚠	Right shift by `n` while shifting in zeros.
_mm_srli_epi64 ^⚠	Right shift by `n` while shifting in zeros.
_mm_srli_si128 ^⚠	[ Experimental ] Shift `a` right by `imm8` bytes while shifting in zeros.
_mm_store1_pd ^⚠	Store the lower double-precision (64-bit) floating-point element from `a` into 2 contiguous elements in memory. `mem_addr` must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_store_pd ^⚠	Store 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) from `a` into memory. `mem_addr` must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_store_pd1 ^⚠	Store the lower double-precision (64-bit) floating-point element from `a` into 2 contiguous elements in memory. `mem_addr` must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_store_sd ^⚠	Stores the lower 64 bits of a 128-bit vector of [2 x double] to a memory location.
_mm_store_si128 ^⚠	[ Experimental ] Store 128-bits of integer data from `a` into memory.
_mm_storeh_pd ^⚠	Stores the upper 64 bits of a 128-bit vector of [2 x double] to a memory location.
_mm_storel_epi64 ^⚠	Store the lower integer of `a` to a memory location.
_mm_storel_pd ^⚠	Stores the lower 64 bits of a 128-bit vector of [2 x double] to a memory location.
_mm_storer_pd ^⚠	Store 2 double-precision (64-bit) floating-point elements from `a` into memory in reverse order. `mem_addr` must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_storeu_pd ^⚠	Store 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
_mm_storeu_si128 ^⚠	[ Experimental ] Store 128-bits of integer data from `a` into memory.
_mm_stream_pd ^⚠	Stores a 128-bit floating point vector of [2 x double] to a 128-bit aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm_stream_si32 ^⚠	[ Experimental ] Stores a 32-bit integer value in the specified memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm_stream_si64 ^⚠	[ Experimental ] Stores a 64-bit integer value in the specified memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm_stream_si128 ^⚠	[ Experimental ] Stores a 128-bit integer vector to a 128-bit aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm_sub_epi8 ^⚠	Subtract
_mm_sub_epi16 ^⚠	Subtract
_mm_sub_epi32 ^⚠	Subtract
_mm_sub_epi64 ^⚠	Subtract
_mm_sub_pd ^⚠	Subtract packed double-precision (64-bit) floating-point elements in `b` from `a`.
_mm_sub_sd ^⚠	Return a new vector with the low element of `a` replaced by subtracting the low element of `b` from the low element of `a`.
_mm_sub_si64 ^⚠	[ Experimental ] Subtracts signed or unsigned 64-bit integer values and writes the difference to the corresponding bits in the destination.
_mm_subs_epi8 ^⚠	Saturated subtract
_mm_subs_epi16 ^⚠	Saturated subtract
_mm_subs_epu8 ^⚠	Saturated subtract
_mm_subs_epu16 ^⚠	Saturated subtract
_mm_ucomieq_sd ^⚠	Compare the lower element of `a` and `b` for equality.
_mm_ucomige_sd ^⚠	Compare the lower element of `a` and `b` for greater-than-or-equal.
_mm_ucomigt_sd ^⚠	Compare the lower element of `a` and `b` for greater-than.
_mm_ucomile_sd ^⚠	Compare the lower element of `a` and `b` for less-than-or-equal.
_mm_ucomilt_sd ^⚠	Compare the lower element of `a` and `b` for less-than.
_mm_ucomineq_sd ^⚠	Compare the lower element of `a` and `b` for not-equal.
_mm_undefined_pd ^⚠	Return vector of type f64x2 with undefined elements.
_mm_undefined_si128 ^⚠	[ Experimental ] Return vector of type __m128i with undefined elements.
_mm_unpackhi_epi8 ^⚠	Unpack and interleave integers from the high half of `a` and `b`.
_mm_unpackhi_epi16 ^⚠	Unpack and interleave integers from the high half of `a` and `b`.
_mm_unpackhi_epi32 ^⚠	Unpack and interleave integers from the high half of `a` and `b`.
_mm_unpackhi_epi64 ^⚠	Unpack and interleave integers from the high half of `a` and `b`.
_mm_unpackhi_pd ^⚠	The resulting `f64x2` is composed of the high-order elements of the two `f64x2` inputs, interleaved: the first input's high element goes to the low lane and the second input's high element to the high lane.
_mm_unpacklo_epi8 ^⚠	Unpack and interleave integers from the low half of `a` and `b`.
_mm_unpacklo_epi16 ^⚠	Unpack and interleave integers from the low half of `a` and `b`.
_mm_unpacklo_epi32 ^⚠	Unpack and interleave integers from the low half of `a` and `b`.
_mm_unpacklo_epi64 ^⚠	Unpack and interleave integers from the low half of `a` and `b`.
_mm_unpacklo_pd ^⚠	The resulting `f64x2` is composed of the low-order elements of the two `f64x2` inputs, interleaved: the first input's low element goes to the low lane and the second input's low element to the high lane.
_mm_xor_pd ^⚠	Compute the bitwise XOR of `a` and `b`.
_mm_xor_si128 ^⚠	[ Experimental ] Compute the bitwise XOR of 128 bits (representing integer data) in `a` and `b`.
_popcnt64 ^⚠	[ Experimental ] Counts the bits that are set.
_rdrand16_step ^⚠	Returns a hardware-generated 16-bit random value.
_rdrand32_step ^⚠	Read a hardware-generated 32-bit random value.
_rdrand64_step ^⚠	Returns a hardware-generated 64-bit random value.
_rdseed16_step ^⚠	Returns a 16-bit NIST SP800-90B and SP800-90C compliant random value.
_rdseed32_step ^⚠	Returns a 32-bit NIST SP800-90B and SP800-90C compliant random value.
_rdseed64_step ^⚠	Returns a 64-bit NIST SP800-90B and SP800-90C compliant random value.
has_cpuid	[ Experimental ] Does the host support the `cpuid` instruction?
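
Many of the one-word summaries above ("Saturated add", "Equal", "Return a mask of the most significant bit of each element") are easier to read next to a concrete call. The sketch below is illustrative only: it assumes the standard library's `std::arch::x86_64` intrinsics of the same names rather than this crate's typed signatures, which are not shown on this page. The helper names `adds_u8x16` and `eq_mask_u8x16` are hypothetical, and a scalar fallback is included so the code also builds on other architectures.

```rust
/// Lane-wise unsigned saturating add of two 16-byte vectors.
#[cfg(target_arch = "x86_64")]
pub fn adds_u8x16(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    use std::arch::x86_64::{__m128i, _mm_adds_epu8, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0u8; 16];
    // SAFETY: SSE2 is part of the x86_64 baseline, the unaligned load/store
    // intrinsics accept any address, and each array is exactly 16 bytes.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_adds_epu8(va, vb));
    }
    out
}

/// Bit `i` of the result is set when `a[i] == b[i]`.
#[cfg(target_arch = "x86_64")]
pub fn eq_mask_u8x16(a: [u8; 16], b: [u8; 16]) -> u16 {
    use std::arch::x86_64::{__m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8};
    // SAFETY: same reasoning as in `adds_u8x16`.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // cmpeq sets equal lanes to 0xFF; movemask collects each lane's top bit.
        _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) as u16
    }
}

/// Scalar fallback with the same semantics for non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn adds_u8x16(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = a[i].saturating_add(b[i]);
    }
    out
}

/// Scalar fallback with the same semantics for non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn eq_mask_u8x16(a: [u8; 16], b: [u8; 16]) -> u16 {
    (0..16).fold(0u16, |m, i| m | (((a[i] == b[i]) as u16) << i))
}
```

For example, `adds_u8x16([250; 16], [10; 16])` clamps every lane to 255 instead of wrapping, and `eq_mask_u8x16` clears exactly the bits whose lanes differ.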