�{setnbe4Set byte if not below or equal (CF == 0 and ZF == 0)setnbeSETHI setnbeSETHI # vaesdeclast,Perform Last Round of an AES Decryption Flow
vaesdeclast vaesdeclast K vaesdeclast / vaesdeclast K / vaesdeclast vaesdeclast K vaesdeclast 2 vaesdeclast K 2 vaesdeclast H vaesdeclast H 5 jbJump if below (CF == 1)jbJCS N jbJCS O vfnmsub213phOFused Negative Multiply-Subtract of Packed Half-Precision Floating-Point Valuesvfnmsub213ph K < vfnmsub213ph K vfnmsub213ph K > vfnmsub213ph K vfnmsub213ph R @ vfnmsub213ph R vfnmsub213ph K < vfnmsub213ph K vfnmsub213ph K > vfnmsub213ph K vfnmsub213ph R @ vfnmsub213ph R vfnmsub213ph R Q vfnmsub213ph R Q vmulpd6Multiply Packed Double-Precision Floating-Point Valuesvmulpd H = vmulpd H vmulpd H ? vmulpd H vmulpd H A vmulpd H vmulpd H = vmulpd vmulpd H vmulpd / vmulpd H ? vmulpd vmulpd H vmulpd 2 vmulpd H A vmulpd H vmulpd H Q vmulpd H Q vaddsubpdPacked Double-FP Add/Subtract vaddsubpd vaddsubpd / vaddsubpd vaddsubpd 2 vpscatterdq=Scatter Packed Quadword Values with Signed Doubleword Indicesvpscatterdq HC vpscatterdq HC vpscatterdq HG vscatterdpdTScatter Packed Double-Precision Floating-Point Values with Signed Doubleword Indicesvscatterdpd HC vscatterdpd HC vscatterdpd HG stcSet Carry FlagstcSTC cmova#Move if above (CF == 0 and ZF == 0)cmovaw cmovaw $ cmoval cmoval ' vpandq/Bitwise Logical AND of Packed Quadword Integersvpandq H = vpandq H vpandq H ? vpandq H vpandq H A vpandq H vpandq H = vpandq H vpandq H ? vpandq H vpandq H A vpandq H
vcvttss2siIConvert with Truncation Scalar Single-Precision FP Value to Dword Integer
vcvttss2si
vcvttss2si H
vcvttss2si '
vcvttss2si H '
vcvttss2si H R cmovnoMove if not overflow (OF == 0)cmovnow cmovnow $ cmovnol cmovnol ' ktestw#Bit Test 16-bit Masks and Set Flagsktestw J minsd;Return Minimum Scalar Double-Precision Floating-Point ValueminsdMINSD minsdMINSD +
aesdeclast,Perform Last Round of an AES Decryption Flow
aesdeclast '
aesdeclast ' / cmppd5Compare Packed Double-Precision Floating-Point ValuescmppdCMPPD cmppdCMPPD / pmovzxwqBMove Packed Word Integers to Quadword Integers with Zero Extensionpmovzxwq pmovzxwq ' vmovshdup(Move Packed Single-FP High and Duplicate vmovshdup H vmovshdup H vmovshdup H vmovshdup H / vmovshdup H 2 vmovshdup H 5 vmovshdup vmovshdup H vmovshdup / vmovshdup H / vmovshdup vmovshdup H vmovshdup 2 vmovshdup H 2 vmovshdup H vmovshdup H 5 vpmovusdwMDown Convert Packed Doubleword Values to Word Values with Unsigned Saturation vpmovusdw H vpmovusdw H, vpmovusdw H vpmovusdw H0 vpmovusdw H vpmovusdw H3 vpmovusdw H vpmovusdw H vpmovusdw H vpmovusdw H+ vpmovusdw H/ vpmovusdw H2 cwdConvert Word to Doublewordcwtd vexpandpdKLoad Sparse Packed Double-Precision Floating-Point Values from Dense Memory vexpandpd K vexpandpd H vexpandpd H vexpandpd K / vexpandpd H 2 vexpandpd H 5 vexpandpd K vexpandpd K / vexpandpd H vexpandpd H 2 vexpandpd H vexpandpd H 5 vpshawPacked Shift Arithmetic Wordsvpshaw " vpshaw " / vpshaw " / vpminuw(Minimum of Packed Unsigned Word Integersvpminuw I vpminuw I / vpminuw I vpminuw I 2 vpminuw I vpminuw I 5 vpminuw vpminuw I vpminuw / vpminuw I / vpminuw ! vpminuw I vpminuw ! 2 vpminuw I 2 vpminuw I vpminuw I 5 vrcpssOCompute Approximate Reciprocal of Scalar Single-Precision Floating-Point Valuesvrcpss vrcpss ' setneSet byte if not equal (ZF == 0)setneSETNE setneSETNE # vcvtph2pdLConvert Packed Half-Precision FP Values to Packed Double-Precision FP Values vcvtph2pd K * vcvtph2pd K . vcvtph2pd R < vcvtph2pd K vcvtph2pd K vcvtph2pd R vcvtph2pd K * vcvtph2pd K vcvtph2pd K . vcvtph2pd K vcvtph2pd R < vcvtph2pd R vcvtph2pd R R vcvtph2pd R R
prefetcht0'Prefetch Data Into Caches using T0 Hint
prefetcht0
PREFETCHT0
# blsic%Isolate Lowest Set Bit and Complementblsic 6 blsic 6 ' vptestmq:Logical AND of Packed Quadword Integer Values and Set Maskvptestmq H = vptestmq H = vptestmq H vptestmq H vptestmq H ? vptestmq H ? vptestmq H vptestmq H vptestmq H A vptestmq H A vptestmq H vptestmq H vroundss3Round Scalar Single Precision Floating-Point Valuesvroundss vroundss ' roundss3Round Scalar Single Precision Floating-Point Valuesroundss roundss ' cmovnbe0Move if not below or equal (CF == 0 and ZF == 0)cmovnbew cmovnbew $ cmovnbel cmovnbel ' vmovq
Move Quadwordvmovq vmovq H vmovq + vmovq H + vmovq + vmovq H+ vpblendmb*Blend Byte Vectors Using an OpMask Control vpblendmb I vpblendmb I / vpblendmb I vpblendmb I 2 vpblendmb I vpblendmb I 5 vpblendmb I vpblendmb I / vpblendmb I vpblendmb I 2 vpblendmb I vpblendmb I 5 jge#Jump if greater or equal (SF == OF)jgeJGE N jgeJGE O cvtps2dqBConvert Packed Single-Precision FP Values to Packed Dword Integerscvtps2dq cvtps2dq / movntpsKStore Packed Single-Precision Floating-Point Values Using Non-Temporal HintmovntpsMOVNTPS / vfmsubsdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Valuesvfmsubsd $ vfmsubsd $ + vfmsubsd $ + movdir64bMOVe to DIRect store 64 Bytes movdir64b 1 5 prefetchw4Prefetch Data into Caches in Anticipation of a Write prefetchw B# setnp Set byte if not parity (PF == 0)setnpSETPC setnpSETPC #
vcvtph2psx>Convert Half-Precision FP Values to Single-Precision FP Values
vcvtph2psx K .
vcvtph2psx K <
vcvtph2psx R >
vcvtph2psx K
vcvtph2psx K
vcvtph2psx R
vcvtph2psx K .
vcvtph2psx K
vcvtph2psx K <
vcvtph2psx K
vcvtph2psx R >
vcvtph2psx R
vcvtph2psx R R
vcvtph2psx R R vpmulld?Multiply Packed Signed Doubleword Integers and Store Low Resultvpmulld H 9 vpmulld H vpmulld H : vpmulld H vpmulld H ; vpmulld H vpmulld H 9 vpmulld vpmulld H vpmulld / vpmulld H : vpmulld ! vpmulld H vpmulld ! 2 vpmulld H ; vpmulld H
vrsqrt14ssaCompute Approximate Reciprocal of a Square Root of a Scalar Single-Precision Floating-Point Value
vrsqrt14ss H
vrsqrt14ss H '
vrsqrt14ss H
vrsqrt14ss H ' vsm3msg2=Perform Final Calculation for the Next Four SM3 Message Wordsvsm3msg2 vsm3msg2 / pslld)Shift Packed Doubleword Data Left Logicalpslld pslld pslld + pslld pslld pslld / vrcpshMCompute Approximate Reciprocal of Scalar Half-Precision Floating-Point Valuesvrcpsh R vrcpsh R $ vrcpsh R vrcpsh R $
vcvtsd2usiSConvert Scalar Double-Precision Floating-Point Value to Unsigned Doubleword Integer
vcvtsd2usi H
vcvtsd2usi H +
vcvtsd2usi H Q vcvtneobf162ps9Convert Odd Elements of Packed BF16 Values to FP32 Valuesvcvtneobf162ps Z / vcvtneobf162ps Z 2 vpaddsw6Add Packed Signed Word Integers with Signed Saturationvpaddsw I vpaddsw I / vpaddsw I vpaddsw I 2 vpaddsw I vpaddsw I 5 vpaddsw vpaddsw I vpaddsw / vpaddsw I / vpaddsw ! vpaddsw I vpaddsw ! 2 vpaddsw I 2 vpaddsw I vpaddsw I 5 vblendps4 Blend Packed Single Precision Floating-Point Valuesvblendps vblendps / vblendps vblendps 2 rolRotate LeftrolbROLB rolbROLB rolbROLB rolwROLW rolwROLW rolwROLW rollROLL rollROLL rollROLL rolbROLB # rolbROLB # rolbROLB # rolwROLW $ rolwROLW $ rolwROLW $ rollROLL ' rollROLL ' rollROLL ' pmovzxwdDMove Packed Word Integers to Doubleword Integers with Zero Extensionpmovzxwd pmovzxwd + vpdpbsudHPacked Dot Product of Signed-by-Unsinged Byte subvectors into Doublewordvpdpbsud X vpdpbsud X / vpdpbsud X vpdpbsud X 2 vgf2p8affineinvqb0Galois Field (2^8) Affine Inverse Transformationvgf2p8affineinvqb K = vgf2p8affineinvqb K vgf2p8affineinvqb K ? vgf2p8affineinvqb K vgf2p8affineinvqb H A vgf2p8affineinvqb H vgf2p8affineinvqb K = vgf2p8affineinvqb vgf2p8affineinvqb K vgf2p8affineinvqb / vgf2p8affineinvqb K ? vgf2p8affineinvqb vgf2p8affineinvqb K vgf2p8affineinvqb 2 vgf2p8affineinvqb H A vgf2p8affineinvqb H cvtpd2psNConvert Packed Double-Precision FP Values to Packed Single-Precision FP Valuescvtpd2psCVTPD2PS cvtpd2psCVTPD2PS / vfmaddsub132pdXFused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Valuesvfmaddsub132pd H = vfmaddsub132pd H vfmaddsub132pd H ? vfmaddsub132pd H vfmaddsub132pd H A vfmaddsub132pd H vfmaddsub132pd H = vfmaddsub132pd # vfmaddsub132pd H vfmaddsub132pd # / vfmaddsub132pd H ? vfmaddsub132pd # vfmaddsub132pd H vfmaddsub132pd # 2 vfmaddsub132pd H A vfmaddsub132pd H vfmaddsub132pd H Q vfmaddsub132pd H Q
vpcmpestrm3Packed Compare Explicit Length Strings, Return Maskvpcmpestrml vpcmpestrml / vreducesdRPerform Reduction Transformation on a Scalar Double-Precision Floating-Point Value vreducesd J vreducesd J + vreducesd J vreducesd J + vshufpd5Shuffle Packed Double-Precision Floating-Point Valuesvshufpd H = vshufpd H vshufpd H ? vshufpd H vshufpd H A vshufpd H vshufpd H = vshufpd vshufpd H vshufpd / vshufpd H ? vshufpd vshufpd H vshufpd 2 vshufpd H A vshufpd H
vpermil2pd:Permute Two-Source Double-Precision Floating-Point Vectors
vpermil2pd "
vpermil2pd " /
vpermil2pd " /
vpermil2pd "
vpermil2pd " 2
vpermil2pd " 2 vmovsd1Move Scalar Double-Precision Floating-Point Value vmovsd H, vmovsd H + vmovsd + vmovsd H + vmovsd + vmovsd H+ vmovsd H vmovsd vmovsd H vfmadd231sdCFused Multiply-Add of Scalar Double-Precision Floating-Point Valuesvfmadd231sd H vfmadd231sd H + vfmadd231sd # vfmadd231sd H vfmadd231sd # + vfmadd231sd H + vfmadd231sd H Q vfmadd231sd H Q pmovzxbdDMove Packed Byte Integers to Doubleword Integers with Zero Extensionpmovzxbd pmovzxbd ' vfrczss7Extract Fraction Scalar Single-Precision Floating Pointvfrczss " vfrczss " '
vrsqrt28sd�Approximation to the Reciprocal Square Root of a Scalar Double-Precision Floating-Point Value with Less Than 2^-28 Relative Error
vrsqrt28sd M
vrsqrt28sd M +
vrsqrt28sd M
vrsqrt28sd M +
vrsqrt28sd M R
vrsqrt28sd M R vaddsubpsPacked Single-FP Add/Subtract vaddsubps vaddsubps / vaddsubps vaddsubps 2 vandpsDBitwise Logical AND of Packed Single-Precision Floating-Point Valuesvandps J 9 vandps J vandps J : vandps J vandps J ; vandps J vandps J 9 vandps vandps J vandps / vandps J : vandps vandps J vandps 2 vandps J ; vandps J vpshldw3Concatenate and Shift Packed Word Data Left Logicalvpshldw K vpshldw K / vpshldw K vpshldw K 2 vpshldw U vpshldw U 5 vpshldw K vpshldw K / vpshldw K vpshldw K 2 vpshldw U vpshldw U 5 pabsb&Packed Absolute Value of Byte Integerspabsb pabsb + pabsb pabsb / vblendpd3Blend Packed Double Precision Floating-Point Valuesvblendpd vblendpd / vblendpd vblendpd 2 vpmuldqDMultiply Packed Signed Doubleword Integers and Store Quadword Resultvpmuldq H = vpmuldq H vpmuldq H ? vpmuldq H vpmuldq H A vpmuldq H vpmuldq H = vpmuldq vpmuldq H vpmuldq / vpmuldq H ? vpmuldq ! vpmuldq H vpmuldq ! 2 vpmuldq H A vpmuldq H vsm4rnds4&Performs Four Rounds of SM4 Encryption vsm4rnds4 vsm4rnds4 / vsm4rnds4 vsm4rnds4 2 vpmacsdqlCPacked Multiply Accumulate Signed Low Doubleword to Signed Quadword vpmacsdql " vpmacsdql " / bsfBit Scan ForwardbsfwBSFW bsfwBSFW $ bsflBSFL bsflBSFL ' vprotwPacked Rotate Wordsvprotw " vprotw " vprotw " / vprotw " / vprotw " / vfnmadd231psLFused Negative Multiply-Add of Packed Single-Precision Floating-Point Valuesvfnmadd231ps H 9 vfnmadd231ps H vfnmadd231ps H : vfnmadd231ps H vfnmadd231ps H ; vfnmadd231ps H vfnmadd231ps H 9 vfnmadd231ps # vfnmadd231ps H vfnmadd231ps # / vfnmadd231ps H : vfnmadd231ps # vfnmadd231ps H vfnmadd231ps # 2 vfnmadd231ps H ; vfnmadd231ps H vfnmadd231ps H Q vfnmadd231ps H Q xaddExchange and AddxaddbXADDB xaddwXADDW xaddlXADDL xaddbXADDB # xaddwXADDW $ xaddlXADDL ' xorps>Bitwise Logical XOR for Single-Precision Floating-Point ValuesxorpsXORPS xorpsXORPS / setnlSet byte if not less (SF == OF)setnlSETGE setnlSETGE # kandnq$Bitwise Logical AND NOT 64-bit Maskskandnq I cmovbMove if below (CF == 1)cmovbw cmovbw $ cmovbl cmovbl ' vinsertf64x4@Insert 256 Bits of Packed Double-Precision Floating-Point Valuesvinsertf64x4 H vinsertf64x4 H 2 vinsertf64x4 H vinsertf64x4 H 2 ud2Undefined Instructionud2 jpoJump if parity odd (PF == 0)jpoJPC N jpoJPC O clcClear Carry FlagclcCLC pshufhwShuffle Packed High WordspshufhwPSHUFHW pshufhwPSHUFHW / kortestwOR 16-bit Masks and Set Flagskortestw H shrLogical Shift RightshrbSHRB shrbSHRB shrbSHRB shrwSHRW shrwSHRW shrwSHRW shrlSHRL shrlSHRL shrlSHRL shrbSHRB # shrbSHRB # shrbSHRB # shrwSHRW $ shrwSHRW $ shrwSHRW $ shrlSHRL ' shrlSHRL ' shrlSHRL ' vpermi2ps\Full Permute of Single-Precision Floating-Point Values From Two Tables Overwriting the Index vpermi2ps H 9 vpermi2ps H vpermi2ps H : vpermi2ps H vpermi2ps H ; vpermi2ps H vpermi2ps H 9 vpermi2ps H vpermi2ps H : vpermi2ps H vpermi2ps H ; vpermi2ps H vporPacked Bitwise Logical ORvpor vpor / vpor ! vpor ! 2 vpermilpd.Permute Double-Precision Floating-Point Values vpermilpd H = vpermilpd H ? vpermilpd H A vpermilpd H = vpermilpd H vpermilpd H vpermilpd H ? vpermilpd H vpermilpd H vpermilpd H A vpermilpd H vpermilpd H vpermilpd H = vpermilpd H = vpermilpd vpermilpd H vpermilpd vpermilpd H vpermilpd / vpermilpd / vpermilpd H ? vpermilpd H ? vpermilpd vpermilpd H vpermilpd vpermilpd H vpermilpd 2 vpermilpd 2 vpermilpd H A vpermilpd H A vpermilpd H vpermilpd H vphadddq:Packed Horizontal Add Signed Doubleword to Signed Quadwordvphadddq " vphadddq " / vfnmsub231pdQFused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Valuesvfnmsub231pd H = vfnmsub231pd H vfnmsub231pd H ? vfnmsub231pd H vfnmsub231pd H A vfnmsub231pd H vfnmsub231pd H = vfnmsub231pd # vfnmsub231pd H vfnmsub231pd # / vfnmsub231pd H ? vfnmsub231pd # vfnmsub231pd H vfnmsub231pd # 2 vfnmsub231pd H A vfnmsub231pd H vfnmsub231pd H Q vfnmsub231pd H Q vmovlhps>Move Packed Single-Precision Floating-Point Values Low to Highvmovlhps vmovlhps H
sha256msg2HPerform a Final Calculation for the Next Four SHA256 Message Doublewords
sha256msg2 (
sha256msg2 ( / setnsSet byte if not sign (SF == 0)setnsSETPL setnsSETPL # vucomisdNUnordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGSvucomisd vucomisd H vucomisd + vucomisd H + vucomisd H R xorpd>Bitwise Logical XOR for Double-Precision Floating-Point ValuesxorpdXORPD xorpdXORPD / vpcmpeqw%Compare Packed Word Data for Equalityvpcmpeqw I vpcmpeqw I vpcmpeqw I / vpcmpeqw I / vpcmpeqw I vpcmpeqw I vpcmpeqw I 2 vpcmpeqw I 2 vpcmpeqw I vpcmpeqw I vpcmpeqw I 5 vpcmpeqw I 5 vpcmpeqw vpcmpeqw / vpcmpeqw ! vpcmpeqw ! 2 vfmadd213sdCFused Multiply-Add of Scalar Double-Precision Floating-Point Valuesvfmadd213sd H vfmadd213sd H + vfmadd213sd # vfmadd213sd H vfmadd213sd # + vfmadd213sd H + vfmadd213sd H Q vfmadd213sd H Q jg&Jump if greater (ZF == 0 and SF == OF)jgJGT N jgJGT O insertps3Insert Packed Single Precision Floating-Point Valueinsertps insertps ' vfmsub231psHFused Multiply-Subtract of Packed Single-Precision Floating-Point Valuesvfmsub231ps H 9 vfmsub231ps H vfmsub231ps H : vfmsub231ps H vfmsub231ps H ; vfmsub231ps H vfmsub231ps H 9 vfmsub231ps # vfmsub231ps H vfmsub231ps # / vfmsub231ps H : vfmsub231ps # vfmsub231ps H vfmsub231ps # 2 vfmsub231ps H ; vfmsub231ps H vfmsub231ps H Q vfmsub231ps H Q vfmadd132sdCFused Multiply-Add of Scalar Double-Precision Floating-Point Valuesvfmadd132sd H vfmadd132sd H + vfmadd132sd # vfmadd132sd H vfmadd132sd # + vfmadd132sd H + vfmadd132sd H Q vfmadd132sd H Q setbe/Set byte if below or equal (CF == 1 or ZF == 1)setbeSETLS setbeSETLS #
vpclmulqdq"Carry-Less Quadword Multiplication
vpclmulqdq
vpclmulqdq K
vpclmulqdq /
vpclmulqdq K /
vpclmulqdq
vpclmulqdq K
vpclmulqdq 2
vpclmulqdq K 2
vpclmulqdq H
vpclmulqdq H 5 vpcmpeqd+Compare Packed Doubleword Data for Equalityvpcmpeqd H 9 vpcmpeqd H 9 vpcmpeqd H vpcmpeqd H vpcmpeqd H : vpcmpeqd H : vpcmpeqd H vpcmpeqd H vpcmpeqd H ; vpcmpeqd H ; vpcmpeqd H vpcmpeqd H vpcmpeqd vpcmpeqd / vpcmpeqd ! vpcmpeqd ! 2 cmovnsMove if not sign (SF == 0)cmovnsw cmovnsw $ cmovnsl cmovnsl ' vmovdqu8Move Unaligned Byte Valuesvmovdqu8 I0 vmovdqu8 I vmovdqu8 I3 vmovdqu8 I vmovdqu8 I6 vmovdqu8 I vmovdqu8 I / vmovdqu8 I 2 vmovdqu8 I 5 vmovdqu8 I vmovdqu8 I / vmovdqu8 I vmovdqu8 I 2 vmovdqu8 I vmovdqu8 I 5 vmovdqu8 I/ vmovdqu8 I2 vmovdqu8 I5 vpshabPacked Shift Arithmetic Bytesvpshab " vpshab " / vpshab " / vsqrtssCCompute Square Root of Scalar Single-Precision Floating-Point Valuevsqrtss H vsqrtss H ' vsqrtss vsqrtss H vsqrtss ' vsqrtss H ' vsqrtss H Q vsqrtss H Q vpermbPermute Byte Integersvpermb T vpermb T / vpermb T vpermb T 2 vpermb T vpermb T 5 vpermb T vpermb T / vpermb T vpermb T 2 vpermb T vpermb T 5 vcmpss5Compare Scalar Single-Precision Floating-Point Valuesvcmpss H vcmpss H vcmpss H ' vcmpss H ' vcmpss vcmpss ' vcmpss H R vcmpss H R vprotdPacked Rotate Doublewordsvprotd " vprotd " vprotd " / vprotd " / vprotd " /
vcvtusi2shFConvert Unsigned Integer to Scalar Half-Precision Floating-Point Valuevcvtusi2shl R vcvtusi2shl R ' vcvtusi2shl R Q paddsb6Add Packed Signed Byte Integers with Signed Saturationpaddsb paddsb + paddsb paddsb / vpmuludq,Multiply Packed Unsigned Doubleword Integersvpmuludq H = vpmuludq H vpmuludq H ? vpmuludq H vpmuludq H A vpmuludq H vpmuludq H = vpmuludq vpmuludq H vpmuludq / vpmuludq H ? vpmuludq ! vpmuludq H vpmuludq ! 2 vpmuludq H A vpmuludq H vinserti64x41Insert 256 Bits of Packed Quadword Integer Valuesvinserti64x4 H vinserti64x4 H 2 vinserti64x4 H vinserti64x4 H 2 vrndscalesh[Round Scalar Half-Precision Floating-Point Value To Include A Given Number Of Fraction Bitsvrndscalesh R vrndscalesh R $ vrndscalesh R vrndscalesh R $ vrndscalesh R R vrndscalesh R R
vpmacssdqlSPacked Multiply Accumulate with Saturation Signed Low Doubleword to Signed Quadword
vpmacssdql "
vpmacssdql " / vpmadd52luqdPacked Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Quadword Accumulatorsvpmadd52luq K = vpmadd52luq K vpmadd52luq K ? vpmadd52luq K vpmadd52luq O A vpmadd52luq O vpmadd52luq K = vpmadd52luq K vpmadd52luq [ vpmadd52luq [ / vpmadd52luq K ? vpmadd52luq K vpmadd52luq [ vpmadd52luq [ 2 vpmadd52luq O A vpmadd52luq O movss2Move Scalar Single-Precision Floating-Point ValuesmovssMOVSS movssMOVSS ' movssMOVSS ' vfmaddcshIFused Multiply-Add of Complex Scalar Half-Precision Floating-Point Values vfmaddcsh R vfmaddcsh R ' vfmaddcsh R vfmaddcsh R ' vfmaddcsh R Q vfmaddcsh R Q vpshldvq@Concatenate and Variable Shift Packed Quadword Data Left Logicalvpshldvq K = vpshldvq K vpshldvq K ? vpshldvq K vpshldvq U A vpshldvq U vpshldvq K = vpshldvq K vpshldvq K ? vpshldvq K vpshldvq U A vpshldvq U vpermpd0Permute Double-Precision Floating-Point Elementsvpermpd H ? vpermpd H A vpermpd H ? vpermpd H vpermpd H vpermpd H A vpermpd H vpermpd H vpermpd H ? vpermpd H ? vpermpd ! vpermpd H vpermpd H vpermpd ! 2 vpermpd H A vpermpd H A vpermpd H vpermpd H vbroadcastss1Broadcast Single-Precision Floating-Point Elementvbroadcastss H vbroadcastss H vbroadcastss H ' vbroadcastss H ' vbroadcastss ! vbroadcastss ' vbroadcastss ! vbroadcastss H vbroadcastss ' vbroadcastss H ' vbroadcastss H vbroadcastss H ' setno"Set byte if not overflow (OF == 0)setnoSETOC setnoSETOC # blcmskMask From Lowest Clear Bitblcmsk 6 blcmsk 6 ' pfacc Packed Floating-Point Accumulatepfacc pfacc + pmaxub(Maximum of Packed Unsigned Byte IntegerspmaxubPMAXUB
pmaxubPMAXUB
+ pmaxubPMAXUB pmaxubPMAXUB / vdppd<Dot Product of Packed Double Precision Floating-Point Valuesvdppd vdppd / das#Decimal Adjust AL after SubtractiondasDAS cvtdq2pdBConvert Packed Dword Integers to Packed Double-Precision FP Valuescvtdq2pd cvtdq2pd + shlx*Logical Shift Left Without Affecting Flagsshlxl 5 shlxl 5 ' vexp2pdyApproximation to the Exponential 2^x of Packed Double-Precision Floating-Point Values with Less Than 2^-23 Relative Errorvexp2pd M A vexp2pd M vexp2pd M A vexp2pd M vexp2pd M R vexp2pd M R punpckhwd7Unpack and Interleave High-Order Words into Doublewords punpckhwd punpckhwd + punpckhwd punpckhwd / vfixupimmss;Fix Up Special Scalar Single-Precision Floating-Point Valuevfixupimmss H vfixupimmss H ' vfixupimmss H vfixupimmss H ' vfixupimmss H R vfixupimmss H R vfmsub132pdHFused Multiply-Subtract of Packed Double-Precision Floating-Point Valuesvfmsub132pd H = vfmsub132pd H vfmsub132pd H ? vfmsub132pd H vfmsub132pd H A vfmsub132pd H vfmsub132pd H = vfmsub132pd # vfmsub132pd H vfmsub132pd # / vfmsub132pd H ? vfmsub132pd # vfmsub132pd H vfmsub132pd # 2 vfmsub132pd H A vfmsub132pd H vfmsub132pd H Q vfmsub132pd H Q vfnmadd132sdLFused Negative Multiply-Add of Scalar Double-Precision Floating-Point Valuesvfnmadd132sd H vfnmadd132sd H + vfnmadd132sd # vfnmadd132sd H vfnmadd132sd # + vfnmadd132sd H + vfnmadd132sd H Q vfnmadd132sd H Q vfrczpd7Extract Fraction Packed Double-Precision Floating-Pointvfrczpd " vfrczpd " / vfrczpd " vfrczpd " 2 vpaddqAdd Packed Quadword Integersvpaddq H = vpaddq H vpaddq H ? vpaddq H vpaddq H A vpaddq H vpaddq H = vpaddq vpaddq H vpaddq / vpaddq H ? vpaddq ! vpaddq H vpaddq ! 2 vpaddq H A vpaddq H psignwPacked Sign of Word Integerspsignw psignw + psignw psignw / vpbroadcastwBroadcast Word Integervpbroadcastw I vpbroadcastw I vpbroadcastw I vpbroadcastw I vpbroadcastw I vpbroadcastw I vpbroadcastw I $ vpbroadcastw I $ vpbroadcastw I $ vpbroadcastw I vpbroadcastw ! vpbroadcastw I vpbroadcastw ! $ vpbroadcastw I $ vpbroadcastw I vpbroadcastw ! vpbroadcastw I vpbroadcastw ! $ vpbroadcastw I $ vpbroadcastw I vpbroadcastw I vpbroadcastw I $ vfnmadd231ssLFused Negative Multiply-Add of Scalar Single-Precision Floating-Point Valuesvfnmadd231ss H vfnmadd231ss H ' vfnmadd231ss # vfnmadd231ss H vfnmadd231ss # ' vfnmadd231ss H ' vfnmadd231ss H Q vfnmadd231ss H Q paddwAdd Packed Word Integerspaddw paddw + paddw paddw / vpbroadcastmw2d?Broadcast Low Word of Mask Register to Packed Doubleword Valuesvpbroadcastmw2d N vpbroadcastmw2d N vpbroadcastmw2d N vpermt2b9Full Permute of Bytes From Two Tables Overwriting a Tablevpermt2b T vpermt2b T / vpermt2b T vpermt2b T 2 vpermt2b T vpermt2b T 5 vpermt2b T vpermt2b T / vpermt2b T vpermt2b T 2 vpermt2b T vpermt2b T 5 vpmovusdbMDown Convert Packed Doubleword Values to Byte Values with Unsigned Saturation vpmovusdb H vpmovusdb H( vpmovusdb H vpmovusdb H, vpmovusdb H vpmovusdb H0 vpmovusdb H vpmovusdb H vpmovusdb H vpmovusdb H' vpmovusdb H+ vpmovusdb H/ vmovntpsKStore Packed Single-Precision Floating-Point Values Using Non-Temporal Hintvmovntps / vmovntps H/ vmovntps 2 vmovntps H2 vmovntps H5
vpternlogd6Bitwise Ternary Logical Operation on Doubleword Values
vpternlogd H 9
vpternlogd H
vpternlogd H :
vpternlogd H
vpternlogd H ;
vpternlogd H
vpternlogd H 9
vpternlogd H
vpternlogd H :
vpternlogd H
vpternlogd H ;
vpternlogd H cvtps2piBConvert Packed Single-Precision FP Values to Packed Dword Integerscvtps2piCVTPS2PL cvtps2piCVTPS2PL + vptestnmw7Logical NAND of Packed Word Integer Values and Set Mask vptestnmw I vptestnmw I vptestnmw I / vptestnmw I / vptestnmw I vptestnmw I vptestnmw I 2 vptestnmw I 2 vptestnmw I vptestnmw I vptestnmw I 5 vptestnmw I 5 vpscatterdd?Scatter Packed Doubleword Values with Signed Doubleword Indicesvpscatterdd HC vpscatterdd HG vpscatterdd HK btBit TestbtwBTW btwBTW btlBTL btlBTL btwBTW $ btwBTW $ btlBTL ' btlBTL ' rdpmc#Read Performance-Monitoring Counterrdpmc - kunpckwd"Unpack and Interleave 16-bit Maskskunpckwd I popPop a Value from the StackpopwPOPW poplPOPL popwPOPW $ poplPOPL ' vphaddbq4Packed Horizontal Add Signed Byte to Signed Quadwordvphaddbq " vphaddbq " / vbroadcastsd1Broadcast Double-Precision Floating-Point Element
vbroadcastsd H vbroadcastsd H vbroadcastsd H + vbroadcastsd H + vbroadcastsd ! vbroadcastsd H vbroadcastsd + vbroadcastsd H + vbroadcastsd H vbroadcastsd H + kortestqOR 64-bit Masks and Set Flagskortestq I vfnmsub231psQFused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Valuesvfnmsub231ps H 9 vfnmsub231ps H vfnmsub231ps H : vfnmsub231ps H vfnmsub231ps H ; vfnmsub231ps H vfnmsub231ps H 9 vfnmsub231ps # vfnmsub231ps H vfnmsub231ps # / vfnmsub231ps H : vfnmsub231ps # vfnmsub231ps H vfnmsub231ps # 2 vfnmsub231ps H ; vfnmsub231ps H vfnmsub231ps H Q vfnmsub231ps H Q pinsrbInsert Bytepinsrb pinsrb # vmovmskpd8Extract Packed Double-Precision Floating-Point Sign Mask vmovmskpd vmovmskpd
vrsqrt28ss�Approximation to the Reciprocal Square Root of a Scalar Single-Precision Floating-Point Value with Less Than 2^-28 Relative Error
vrsqrt28ss M
vrsqrt28ss M '
vrsqrt28ss M
vrsqrt28ss M '
vrsqrt28ss M R
vrsqrt28ss M R vstmxcsrStore MXCSR Register Statevstmxcsr ' adcAdd with CarryadcbADCB adcbADCB adcbADCB adcbADCB # adcwADCW adcwADCW adcwADCW adcwADCW adcwADCW $ adclADCL adclADCL adclADCL adclADCL adclADCL ' adcbADCB # adcbADCB # adcwADCW $ adcwADCW $ adcwADCW $ adclADCL ' adclADCL ' adclADCL '
vcvtudq2ps\Convert Packed Unsigned Doubleword Integers to Packed Single-Precision Floating-Point Values
vcvtudq2ps H 9
vcvtudq2ps H :
vcvtudq2ps H ;
vcvtudq2ps H
vcvtudq2ps H
vcvtudq2ps H
vcvtudq2ps H 9
vcvtudq2ps H
vcvtudq2ps H :
vcvtudq2ps H
vcvtudq2ps H ;
vcvtudq2ps H
vcvtudq2ps H Q
vcvtudq2ps H Q vdivpd4Divide Packed Double-Precision Floating-Point Valuesvdivpd H = vdivpd H vdivpd H ? vdivpd H vdivpd H A vdivpd H vdivpd H = vdivpd vdivpd H vdivpd / vdivpd H ? vdivpd vdivpd H vdivpd 2 vdivpd H A vdivpd H vdivpd H Q vdivpd H Q vprorvq%Variable Rotate Packed Quadword Rightvprorvq H = vprorvq H vprorvq H ? vprorvq H vprorvq H A vprorvq H vprorvq H = vprorvq H vprorvq H ? vprorvq H vprorvq H A vprorvq H cldClear Direction FlagcldCLD mulpd6Multiply Packed Double-Precision Floating-Point ValuesmulpdMULPD mulpdMULPD / pfrcp.Packed Floating-Point Reciprocal Approximationpfrcp pfrcp + prefetchPrefetch Data into Cachesprefetch @# psubq!Subtract Packed Quadword IntegerspsubqPSUBQ psubqPSUBQ + psubqPSUBQ psubqPSUBQ / vdpps<Dot Product of Packed Single Precision Floating-Point Valuesvdpps vdpps / vdpps vdpps 2 vfmsub132sdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Valuesvfmsub132sd H vfmsub132sd H + vfmsub132sd # vfmsub132sd H vfmsub132sd # + vfmsub132sd H + vfmsub132sd H Q vfmsub132sd H Q vmovsh0Move Scalar Half-Precision Floating-Point Valuesvmovsh R% vmovsh R $ vmovsh R $ vmovsh R$ vmovsh R vmovsh R vpdpwsudsXPacked Dot Product of Signed-by-Unsigned Word subvectors into Doubleword with Saturation vpdpwsuds Y vpdpwsuds Y / vpdpwsuds Y vpdpwsuds Y 2 vpmacsdqhDPacked Multiply Accumulate Signed High Doubleword to Signed Quadword vpmacsdqh " vpmacsdqh " / vpshrdvqAConcatenate and Variable Shift Packed Quadword Data Right Logicalvpshrdvq K = vpshrdvq K vpshrdvq K ? vpshrdvq K vpshrdvq U A vpshrdvq U vpshrdvq K = vpshrdvq K vpshrdvq K ? vpshrdvq K vpshrdvq U A vpshrdvq U
vgetmantpdOExtract Normalized Mantissas from Packed Double-Precision Floating-Point Values
vgetmantpd H =
vgetmantpd H ?
vgetmantpd H A
vgetmantpd H
vgetmantpd H
vgetmantpd H
vgetmantpd H =
vgetmantpd H
vgetmantpd H ?
vgetmantpd H
vgetmantpd H A
vgetmantpd H
vgetmantpd H R
vgetmantpd H R vmulph4Multiply Packed Half-Precision Floating-Point Valuesvmulph K < vmulph K vmulph K > vmulph K vmulph R @ vmulph R vmulph K < vmulph K vmulph K > vmulph K vmulph R @ vmulph R vmulph R Q vmulph R Q vpbroadcastdBroadcast Doubleword Integervpbroadcastd H vpbroadcastd H vpbroadcastd H vpbroadcastd H vpbroadcastd H vpbroadcastd H vpbroadcastd H ' vpbroadcastd H ' vpbroadcastd H ' vpbroadcastd H vpbroadcastd ! vpbroadcastd H vpbroadcastd ! ' vpbroadcastd H ' vpbroadcastd H vpbroadcastd ! vpbroadcastd H vpbroadcastd ! ' vpbroadcastd H ' vpbroadcastd H vpbroadcastd H vpbroadcastd H '
vshufi32x40Shuffle 128-Bit Packed Doubleword Integer Values
vshufi32x4 H :
vshufi32x4 H
vshufi32x4 H ;
vshufi32x4 H
vshufi32x4 H :
vshufi32x4 H
vshufi32x4 H ;
vshufi32x4 H blsrReset Lowest Set Bitblsrl 4 blsrl 4 ' blsfillFill From Lowest Set Bitblsfill 6 blsfill 6 ' vphsubd.Packed Horizontal Subtract Doubleword Integersvphsubd vphsubd / vphsubd ! vphsubd ! 2 vmulsh:Fused Multiply Scalar Half-Precision Floating-Point Valuesvmulsh R vmulsh R $ vmulsh R vmulsh R $ vmulsh R Q vmulsh R Q femmsFast Exit Multimedia Statefemms vpsadbw#Compute Sum of Absolute Differences
vpsadbw vpsadbw I vpsadbw / vpsadbw I / vpsadbw ! vpsadbw I vpsadbw ! 2 vpsadbw I 2 vpsadbw I vpsadbw I 5 vpblendmw*Blend Word Vectors Using an OpMask Control vpblendmw I vpblendmw I / vpblendmw I vpblendmw I 2 vpblendmw I vpblendmw I 5 vpblendmw I vpblendmw I / vpblendmw I vpblendmw I 2 vpblendmw I vpblendmw I 5 maxsd;Return Maximum Scalar Double-Precision Floating-Point ValuemaxsdMAXSD maxsdMAXSD + vmovntdq-Store Double Quadword Using Non-Temporal Hintvmovntdq / vmovntdq H/ vmovntdq 2 vmovntdq H2 vmovntdq H5 psllw#Shift Packed Word Data Left Logicalpsllw psllw psllw + psllw psllw psllw /
vgatherqpdRGather Packed Double-Precision Floating-Point Values Using Signed Quadword Indices
vgatherqpd H D
vgatherqpd H H
vgatherqpd H L
vgatherqpd ! D
vgatherqpd ! H vpinsrbInsert Bytevpinsrb vpinsrb I vpinsrb # vpinsrb I # jnae$Jump if not above or equal (CF == 1)jnaeJCS N jnaeJCS O vminsd;Return Minimum Scalar Double-Precision Floating-Point Valuevminsd H vminsd H + vminsd vminsd H vminsd + vminsd H + vminsd H R vminsd H R vshufps5Shuffle Packed Single-Precision Floating-Point Valuesvshufps H 9 vshufps H vshufps H : vshufps H vshufps H ; vshufps H vshufps H 9 vshufps vshufps H vshufps / vshufps H : vshufps vshufps H vshufps 2 vshufps H ; vshufps H maxpd<Return Maximum Packed Double-Precision Floating-Point ValuesmaxpdMAXPD maxpdMAXPD / pmovzxbw>Move Packed Byte Integers to Word Integers with Zero Extensionpmovzxbw pmovzxbw + vroundsd3Round Scalar Double Precision Floating-Point Valuesvroundsd vroundsd + pcmpgtq$Compare Packed Data for Greater Thanpcmpgtq pcmpgtq / kxnord!Bitwise Logical XNOR 32-bit Maskskxnord I
vgatherpf1dpsoSparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Doubleword Indices Using T1 Hint
vgatherpf1dps LK vpcompresswBStore Sparse Packed Word Integer Values into Dense Memory/Registervpcompressw K0 vpcompressw K vpcompressw K3 vpcompressw K vpcompressw U6 vpcompressw U vpcompressw K vpcompressw K vpcompressw U vpcompressw K/ vpcompressw K2 vpcompressw U5 vpmovsxwqBMove Packed Word Integers to Quadword Integers with Sign Extension vpmovsxwq H vpmovsxwq H vpmovsxwq H vpmovsxwq H ' vpmovsxwq H + vpmovsxwq H / vpmovsxwq vpmovsxwq H vpmovsxwq ' vpmovsxwq H ' vpmovsxwq ! vpmovsxwq H vpmovsxwq ! + vpmovsxwq H + vpmovsxwq H vpmovsxwq H / vpmacsww5Packed Multiply Accumulate Signed Word to Signed Wordvpmacsww " vpmacsww " / vphaddubq/Packed Horizontal Add Unsigned Byte to Quadword vphaddubq " vphaddubq " / vsubph4Subtract Packed Half-Precision Floating-Point Valuesvsubph K < vsubph K vsubph K > vsubph K vsubph R @ vsubph R vsubph K < vsubph K vsubph K > vsubph K vsubph R @ vsubph R vsubph R Q vsubph R Q xgetbv&Get Value of Extended Control Registerxgetbv psrlq(Shift Packed Quadword Data Right Logicalpsrlq psrlq psrlq + psrlq psrlq psrlq / vfixupimmsd;Fix Up Special Scalar Double-Precision Floating-Point Valuevfixupimmsd H vfixupimmsd H + vfixupimmsd H vfixupimmsd H + vfixupimmsd H R vfixupimmsd H R pminsd,Minimum of Packed Signed Doubleword Integerspminsd pminsd / kandw Bitwise Logical AND 16-bit Maskskandw H kshiftldShift Left 32-bit Maskskshiftld I tzcnt&Count the Number of Trailing Zero Bitstzcntw 4 tzcntw 4 $ tzcntl 4 tzcntl 4 ' vrangesdYRange Restriction Calculation For a pair of Scalar Double-Precision Floating-Point Valuesvrangesd J vrangesd J + vrangesd J vrangesd J + vrangesd J R vrangesd J R vpermdPermute Doubleword Integers
vpermd H : vpermd H vpermd H ; vpermd H vpermd H : vpermd ! vpermd H vpermd ! 2 vpermd H ; vpermd H vsubpd6Subtract Packed Double-Precision Floating-Point Valuesvsubpd H = vsubpd H vsubpd H ? vsubpd H vsubpd H A vsubpd H vsubpd H = vsubpd vsubpd H vsubpd / vsubpd H ? vsubpd vsubpd H vsubpd 2 vsubpd H A vsubpd H vsubpd H Q vsubpd H Q vpshrdvw=Concatenate and Variable Shift Packed Word Data Right Logicalvpshrdvw K vpshrdvw K / vpshrdvw K vpshrdvw K 2 vpshrdvw U vpshrdvw U 5 vpshrdvw K vpshrdvw K / vpshrdvw K vpshrdvw K 2 vpshrdvw U vpshrdvw U 5 jnlJump if not less (SF == OF)jnlJGE N jnlJGE O movdiriMOVe to DIRect store Integermovdiri 0' vcvtph2uwZConvert Packed Half-Precision Floating-Point Values to Packed Unsigned Word Integer Values vcvtph2uw K < vcvtph2uw K > vcvtph2uw R @ vcvtph2uw K vcvtph2uw K vcvtph2uw R vcvtph2uw K < vcvtph2uw K vcvtph2uw K > vcvtph2uw K vcvtph2uw R @ vcvtph2uw R vcvtph2uw R Q vcvtph2uw R Q cmovneMove if not equal (ZF == 0)cmovnew cmovnew $ cmovnel cmovnel ' extrq
Extract Fieldextrq extrq vfmsub231sdHFused Multiply-Subtract of Scalar Double-Precision Floating-Point Valuesvfmsub231sd H vfmsub231sd H + vfmsub231sd # vfmsub231sd H vfmsub231sd # + vfmsub231sd H + vfmsub231sd H Q vfmsub231sd H Q vpdpbuudsZPacked Dot Product of Unsigned-by-Unsinged Byte subvectors into Doubleword with Saturation vpdpbuuds X vpdpbuuds X / vpdpbuuds X vpdpbuuds X 2 vpexpandbALoad Sparse Packed Byte Integer Values from Dense Memory/Register vpexpandb K vpexpandb K vpexpandb U vpexpandb K / vpexpandb K 2 vpexpandb U 5 vpexpandb K vpexpandb K / vpexpandb K vpexpandb K 2 vpexpandb U vpexpandb U 5 pmaddubsw9Multiply and Add Packed Signed and Unsigned Byte Integers pmaddubsw pmaddubsw + pmaddubsw pmaddubsw / vfmadd231pdCFused Multiply-Add of Packed Double-Precision Floating-Point Valuesvfmadd231pd H = vfmadd231pd H vfmadd231pd H ? vfmadd231pd H vfmadd231pd H A vfmadd231pd H vfmadd231pd H = vfmadd231pd # vfmadd231pd H vfmadd231pd # / vfmadd231pd H ? vfmadd231pd # vfmadd231pd H vfmadd231pd # 2 vfmadd231pd H A vfmadd231pd H vfmadd231pd H Q vfmadd231pd H Q vphsubw(Packed Horizontal Subtract Word Integersvphsubw vphsubw / vphsubw ! vphsubw ! 2 vscatterpf0qps�Sparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Quadword Indices Using T0 Hint with Intent to Writevscatterpf0qps LM vaesenclast,Perform Last Round of an AES Encryption Flow
vaesenclast vaesenclast K vaesenclast / vaesenclast K / vaesenclast vaesenclast K vaesenclast 2 vaesenclast K 2 vaesenclast H vaesenclast H 5 subSubtractsubbSUBB subbSUBB subbSUBB subbSUBB # subwSUBW subwSUBW subwSUBW subwSUBW subwSUBW $ sublSUBL sublSUBL sublSUBL sublSUBL sublSUBL ' subbSUBB # subbSUBB # subwSUBW $ subwSUBW $ subwSUBW $ sublSUBL ' sublSUBL ' sublSUBL ' movhps7Move High Packed Single-Precision Floating-Point ValuesmovhpsMOVHPS + movhpsMOVHPS + vbcstnebf162ps;Load BF16 Element and Convert to FP32 Element With Broadcasvbcstnebf162ps Z $ vbcstnebf162ps Z $ rorx,Rotate Right Logical Without Affecting Flagsrorxl 5 rorxl 5 ' vcvtps2qq^Convert Packed Single Precision Floating-Point Values to Packed Singed Quadword Integer Values vcvtps2qq J 8 vcvtps2qq J 9 vcvtps2qq J : vcvtps2qq J vcvtps2qq J vcvtps2qq J vcvtps2qq J 8 vcvtps2qq J vcvtps2qq J 9 vcvtps2qq J vcvtps2qq J : vcvtps2qq J vcvtps2qq J Q vcvtps2qq J Q vfmsub132shFFused Multiply-Subtract of Scalar Half-Precision Floating-Point Valuesvfmsub132sh R vfmsub132sh R $ vfmsub132sh R vfmsub132sh R $ vfmsub132sh R Q vfmsub132sh R Q vpabsw&Packed Absolute Value of Word Integersvpabsw I vpabsw I vpabsw I vpabsw I / vpabsw I 2 vpabsw I 5 vpabsw vpabsw I vpabsw / vpabsw I / vpabsw ! vpabsw I vpabsw ! 2 vpabsw I 2 vpabsw I vpabsw I 5 vpmulhw:Multiply Packed Signed Word Integers and Store High Resultvpmulhw I vpmulhw I / vpmulhw I vpmulhw I 2 vpmulhw I vpmulhw I 5 vpmulhw vpmulhw I vpmulhw / vpmulhw I / vpmulhw ! vpmulhw I vpmulhw ! 2 vpmulhw I 2 vpmulhw I vpmulhw I 5 vpopcntb)Packed Population Count for Byte Integersvpopcntb K vpopcntb K vpopcntb S vpopcntb K / vpopcntb K 2 vpopcntb S 5 vpopcntb K vpopcntb K / vpopcntb K vpopcntb K 2 vpopcntb S vpopcntb S 5 pmaddwd,Multiply and Add Packed Signed Word Integerspmaddwd pmaddwd + pmaddwd pmaddwd / vpscatterqq;Scatter Packed Quadword Values with Signed Quadword Indicesvpscatterqq HE vpscatterqq HI vpscatterqq HM vinserti64x21Insert 128 Bits of Packed Quadword Integer Valuesvinserti64x2 J vinserti64x2 J / vinserti64x2 J vinserti64x2 J / vinserti64x2 J vinserti64x2 J / vinserti64x2 J vinserti64x2 J / orps<Bitwise Logical OR of Single-Precision Floating-Point ValuesorpsORPS orpsORPS / vmaskmovdqu'Store Selected Bytes of Double Quadwordvmaskmovdqu vcvtneebf162ps:Convert Even Elements of Packed BF16 Values to FP32 Valuesvcvtneebf162ps Z / vcvtneebf162ps Z 2 movnti(Store Doubleword Using Non-Temporal Hintmovntil ' psrad-Shift Packed Doubleword Data Right Arithmeticpsrad psrad psrad + psrad psrad psrad / vblendmpdLBlend Packed Double-Precision Floating-Point Vectors Using an OpMask Control vblendmpd H = vblendmpd H vblendmpd H ? vblendmpd H vblendmpd H A vblendmpd H vblendmpd H = vblendmpd H vblendmpd H ? vblendmpd H vblendmpd H A vblendmpd H vbroadcastf32x26Broadcast Two Single-Precision Floating-Point Elementsvbroadcastf32x2 J vbroadcastf32x2 J vbroadcastf32x2 J + vbroadcastf32x2 J + vbroadcastf32x2 J vbroadcastf32x2 J + vbroadcastf32x2 J vbroadcastf32x2 J +
vpunpcklwd6Unpack and Interleave Low-Order Words into Doublewords
vpunpcklwd I
vpunpcklwd I /
vpunpcklwd I
vpunpcklwd I 2
vpunpcklwd I
vpunpcklwd I 5
vpunpcklwd
vpunpcklwd I
vpunpcklwd /
vpunpcklwd I /
vpunpcklwd !
vpunpcklwd I
vpunpcklwd ! 2
vpunpcklwd I 2
vpunpcklwd I
vpunpcklwd I 5 vpermwPermute Word Integersvpermw I vpermw I / vpermw I vpermw I 2 vpermw I vpermw I 5 vpermw I vpermw I / vpermw I vpermw I 2 vpermw I vpermw I 5 incIncrement by 1incbINCB incwINCW inclINCL incbINCB # incwINCW $ inclINCL ' vmovdqu16Move Unaligned Word Values vmovdqu16 I0 vmovdqu16 I vmovdqu16 I3 vmovdqu16 I vmovdqu16 I6 vmovdqu16 I vmovdqu16 I / vmovdqu16 I 2 vmovdqu16 I 5 vmovdqu16 I vmovdqu16 I / vmovdqu16 I vmovdqu16 I 2 vmovdqu16 I vmovdqu16 I 5 vmovdqu16 I/ vmovdqu16 I2 vmovdqu16 I5 vbroadcastf64x47Broadcast Four Double-Precision Floating-Point Elementsvbroadcastf64x4 H 2 vbroadcastf64x4 H 2 vpdpwuudsZPacked Dot Product of Unsigned-by-Unsigned Word subvectors into Doubleword with Saturation vpdpwuuds Y vpdpwuuds Y / vpdpwuuds Y vpdpwuuds Y 2
vgf2p8mulbGalois Field Multiply Bytes
vgf2p8mulb
vgf2p8mulb /
vgf2p8mulb
vgf2p8mulb 2
vgf2p8mulb
vgf2p8mulb 5
vgf2p8mulb
vgf2p8mulb
vgf2p8mulb /
vgf2p8mulb /
vgf2p8mulb
vgf2p8mulb
vgf2p8mulb 2
vgf2p8mulb 2
vgf2p8mulb
vgf2p8mulb 5 vgetexppslExtract Exponents of Packed Single-Precision Floating-Point Values as Single-Precision Floating-Point Values vgetexpps H 9 vgetexpps H : vgetexpps H ; vgetexpps H vgetexpps H vgetexpps H vgetexpps H 9 vgetexpps H vgetexpps H : vgetexpps H vgetexpps H ; vgetexpps H vgetexpps H R vgetexpps H R vcvtqq2pdQConvert Packed Quadword Integers to Packed Double-Precision Floating-Point Values vcvtqq2pd J = vcvtqq2pd J ? vcvtqq2pd J A vcvtqq2pd J vcvtqq2pd J vcvtqq2pd J vcvtqq2pd J = vcvtqq2pd J vcvtqq2pd J ? vcvtqq2pd J vcvtqq2pd J A vcvtqq2pd J vcvtqq2pd J Q vcvtqq2pd J Q vpermi2q?Full Permute of Quadwords From Two Tables Overwriting the Indexvpermi2q H = vpermi2q H vpermi2q H ? vpermi2q H vpermi2q H A vpermi2q H vpermi2q H = vpermi2q H vpermi2q H ? vpermi2q H vpermi2q H A vpermi2q H paddusb:Add Packed Unsigned Byte Integers with Unsigned Saturationpaddusb paddusb + paddusb paddusb / vcvtsi2sh7Convert Dword Integer to Scalar Half-Precision FP Value
vcvtsi2shl R
vcvtsi2shl R '
vcvtsi2shl R Q
vgetmantssMExtract Normalized Mantissa from Scalar Single-Precision Floating-Point Value
vgetmantss H
vgetmantss H '
vgetmantss H
vgetmantss H '
vgetmantss H R
vgetmantss H R vmulsd6Multiply Scalar Double-Precision Floating-Point Valuesvmulsd H vmulsd H + vmulsd vmulsd H vmulsd + vmulsd H + vmulsd H Q vmulsd H Q phaddd(Packed Horizontal Add Doubleword Integerphaddd phaddd + phaddd phaddd /
phminposuw3Packed Horizontal Minimum of Unsigned Word Integers
phminposuw
phminposuw / vpsubsw;Subtract Packed Signed Word Integers with Signed Saturationvpsubsw I vpsubsw I / vpsubsw I vpsubsw I 2 vpsubsw I vpsubsw I 5 vpsubsw vpsubsw I vpsubsw / vpsubsw I / vpsubsw ! vpsubsw I vpsubsw ! 2 vpsubsw I 2 vpsubsw I vpsubsw I 5 haddpsPacked Single-FP Horizontal Addhaddps haddps / minss;Return Minimum Scalar Single-Precision Floating-Point ValueminssMINSS minssMINSS ' vpexpandwALoad Sparse Packed Word Integer Values from Dense Memory/Register vpexpandw K vpexpandw K vpexpandw U vpexpandw K / vpexpandw K 2 vpexpandw U 5 vpexpandw K vpexpandw K / vpexpandw K vpexpandw K 2 vpexpandw U vpexpandw U 5 vprotbPacked Rotate Bytesvprotb " vprotb " vprotb " / vprotb " / vprotb " / pmovzxbqBMove Packed Byte Integers to Quadword Integers with Zero Extensionpmovzxbq pmovzxbq $ vprordRotate Packed Doubleword Rightvprord H 9 vprord H : vprord H ; vprord H vprord H vprord H vprord H 9 vprord H vprord H : vprord H vprord H ; vprord H vfrczps7Extract Fraction Packed Single-Precision Floating-Pointvfrczps " vfrczps " / vfrczps " vfrczps " 2 cmpsd5Compare Scalar Double-Precision Floating-Point ValuescmpsdCMPSD cmpsdCMPSD + pfmaxPacked Floating-Point Maximumpfmax pfmax + vplzcntq@Count the Number of Leading Zero Bits for Packed Quadword Valuesvplzcntq N = vplzcntq N ? vplzcntq N A vplzcntq N vplzcntq N vplzcntq N vplzcntq N = vplzcntq N vplzcntq N ? vplzcntq N vplzcntq N A vplzcntq N vcvtsh2ssJConvert Scalar Half-Precision FP Value to Scalar Double-Precision FP Value vcvtsh2ss R vcvtsh2ss R $ vcvtsh2ss R vcvtsh2ss R $ vcvtsh2ss R R vcvtsh2ss R R vfnmadd132psLFused Negative Multiply-Add of Packed Single-Precision Floating-Point Valuesvfnmadd132ps H 9 vfnmadd132ps H vfnmadd132ps H : vfnmadd132ps H vfnmadd132ps H ; vfnmadd132ps H vfnmadd132ps H 9 vfnmadd132ps # vfnmadd132ps H vfnmadd132ps # / vfnmadd132ps H : vfnmadd132ps # vfnmadd132ps H vfnmadd132ps # 2 vfnmadd132ps H ; vfnmadd132ps H vfnmadd132ps H Q vfnmadd132ps H Q vreducessRPerform Reduction Transformation on a Scalar Single-Precision Floating-Point Value vreducess J vreducess J ' vreducess J vreducess J ' vpshrdw4Concatenate and Shift Packed Word Data Right Logicalvpshrdw K vpshrdw K / vpshrdw K vpshrdw K 2 vpshrdw U vpshrdw U 5 vpshrdw K vpshrdw K / vpshrdw K vpshrdw K 2 vpshrdw U vpshrdw U 5 vpmovq2m7Move Signs of Packed Quadword Integers to Mask Registervpmovq2m J vpmovq2m J vpmovq2m J vfmsub213pdHFused Multiply-Subtract of Packed Double-Precision Floating-Point Valuesvfmsub213pd H = vfmsub213pd H vfmsub213pd H ? vfmsub213pd H vfmsub213pd H A vfmsub213pd H vfmsub213pd H = vfmsub213pd # vfmsub213pd H vfmsub213pd # / vfmsub213pd H ? vfmsub213pd # vfmsub213pd H vfmsub213pd # 2 vfmsub213pd H A vfmsub213pd H vfmsub213pd H Q vfmsub213pd H Q sqrtssCCompute Square Root of Scalar Single-Precision Floating-Point ValuesqrtssSQRTSS sqrtssSQRTSS ' vpermt2d?Full Permute of Doublewords From Two Tables Overwriting a Tablevpermt2d H 9 vpermt2d H vpermt2d H : vpermt2d H vpermt2d H ; vpermt2d H vpermt2d H 9 vpermt2d H vpermt2d H : vpermt2d H vpermt2d H ; vpermt2d H vpdpbuudJPacked Dot Product of Unsigned-by-Unsinged Byte subvectors into Doublewordvpdpbuud X vpdpbuud X / vpdpbuud X vpdpbuud X 2 punpcklbw0Unpack and Interleave Low-Order Bytes into Words punpcklbw punpcklbw ' punpcklbw punpcklbw / vphaddswAPacked Horizontal Add Signed Word Integers with Signed Saturationvphaddsw vphaddsw / vphaddsw ! vphaddsw ! 2 vpinsrdInsert Doublewordvpinsrd vpinsrd J vpinsrd ' vpinsrd J ' vxorpd>Bitwise Logical XOR for Double-Precision Floating-Point Valuesvxorpd J = vxorpd J vxorpd J ? vxorpd J vxorpd J A vxorpd J vxorpd J = vxorpd vxorpd J vxorpd / vxorpd J ? vxorpd vxorpd J vxorpd 2 vxorpd J A vxorpd J vpslld)Shift Packed Doubleword Data Left Logicalvpslld H 9 vpslld H : vpslld H ; vpslld H vpslld H vpslld H / vpslld H vpslld H vpslld H / vpslld H vpslld H vpslld H / vpslld H 9 vpslld vpslld H vpslld vpslld H vpslld / vpslld H / vpslld H : vpslld ! vpslld H vpslld ! vpslld H vpslld ! / vpslld H / vpslld H ; vpslld H vpslld H vpslld H / addpd1Add Packed Double-Precision Floating-Point ValuesaddpdADDPD addpdADDPD / vprolvq$Variable Rotate Packed Quadword Leftvprolvq H = vprolvq H vprolvq H ? vprolvq H vprolvq H A vprolvq H vprolvq H = vprolvq H vprolvq H ? vprolvq H vprolvq H A vprolvq H kmovbMove 8-bit Maskkmovb J kmovb J kmovb J # kmovb J kmovb J# movddup Move One Double-FP and Duplicatemovddup movddup + vpmovswbEDown Convert Packed Word Values to Byte Values with Signed Saturationvpmovswb I vpmovswb I, vpmovswb I vpmovswb I0 vpmovswb I vpmovswb I3 vpmovswb I vpmovswb I vpmovswb I vpmovswb I+ vpmovswb I/ vpmovswb I2 vpsignbPacked Sign of Byte Integersvpsignb vpsignb / vpsignb ! vpsignb ! 2 movsldup'Move Packed Single-FP Low and Duplicatemovsldup movsldup /
vcvttph2uwjConvert with Truncation Packed Half-Precision Floating-Point Values to Packed Unsigned Word Integer Values
vcvttph2uw K <
vcvttph2uw K >
vcvttph2uw R @
vcvttph2uw K
vcvttph2uw K
vcvttph2uw R
vcvttph2uw K <
vcvttph2uw K
vcvttph2uw K >
vcvttph2uw K
vcvttph2uw R @
vcvttph2uw R
vcvttph2uw R R
vcvttph2uw R R shld#Integer Double Precision Shift Leftshldw shldw shldl shldl shldw $ shldw $ shldl ' shldl ' vorps<Bitwise Logical OR of Single-Precision Floating-Point Valuesvorps J 9 vorps J vorps J : vorps J vorps J ; vorps J vorps J 9 vorps vorps J vorps / vorps J : vorps vorps J vorps 2 vorps J ; vorps J vfnmadd132pdLFused Negative Multiply-Add of Packed Double-Precision Floating-Point Valuesvfnmadd132pd H = vfnmadd132pd H vfnmadd132pd H ? vfnmadd132pd H vfnmadd132pd H A vfnmadd132pd H vfnmadd132pd H = vfnmadd132pd # vfnmadd132pd H vfnmadd132pd # / vfnmadd132pd H ? vfnmadd132pd # vfnmadd132pd H vfnmadd132pd # 2 vfnmadd132pd H A vfnmadd132pd H vfnmadd132pd H Q vfnmadd132pd H Q rorRotate RightrorbRORB rorbRORB rorbRORB rorwRORW rorwRORW rorwRORW rorlRORL rorlRORL rorlRORL rorbRORB # rorbRORB # rorbRORB # rorwRORW $ rorwRORW $ rorwRORW $ rorlRORL ' rorlRORL ' rorlRORL ' movdquMove Unaligned Double QuadwordmovdquMOVOU movdquMOVOU / movdquMOVOU / pblendwBlend Packed Wordspblendw pblendw / kmovdMove 32-bit Maskkmovd I kmovd I kmovd I ' kmovd I kmovd I' jnbe0Jump if not below or equal (CF == 0 and ZF == 0)jnbeJHI N jnbeJHI O kshiftrbShift Right 8-bit Maskskshiftrb J vsm4key4(Perform Four Rounds of SM4 Key Expansionvsm4key4 vsm4key4 / vsm4key4 vsm4key4 2 vpcomub%Compare Packed Unsigned Byte Integersvpcomub " vpcomub " / vfmaddsub213psXFused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Valuesvfmaddsub213ps H 9 vfmaddsub213ps H vfmaddsub213ps H : vfmaddsub213ps H vfmaddsub213ps H ; vfmaddsub213ps H vfmaddsub213ps H 9 vfmaddsub213ps # vfmaddsub213ps H vfmaddsub213ps # / vfmaddsub213ps H : vfmaddsub213ps # vfmaddsub213ps H vfmaddsub213ps # 2 vfmaddsub213ps H ; vfmaddsub213ps H vfmaddsub213ps H Q vfmaddsub213ps H Q vpminsw&Minimum of Packed Signed Word Integersvpminsw I vpminsw I / vpminsw I vpminsw I 2 vpminsw I vpminsw I 5 vpminsw vpminsw I vpminsw / vpminsw I / vpminsw ! vpminsw I vpminsw ! 2 vpminsw I 2 vpminsw I vpminsw I 5 mwaitMonitor Waitmwait D psubusw?Subtract Packed Unsigned Word Integers with Unsigned SaturationpsubuswPSUBUSW psubuswPSUBUSW + psubuswPSUBUSW psubuswPSUBUSW /
vgatherpf0dpdoSparse Prefetch Packed Double-Precision Floating-Point Data Values with Signed Doubleword Indices Using T0 Hint
vgatherpf0dpd LG ptestPacked Logical Compareptest ptest / vpmaxsd,Maximum of Packed Signed Doubleword Integersvpmaxsd H 9 vpmaxsd H vpmaxsd H : vpmaxsd H vpmaxsd H ; vpmaxsd H vpmaxsd H 9 vpmaxsd vpmaxsd H vpmaxsd / vpmaxsd H : vpmaxsd ! vpmaxsd H vpmaxsd ! 2 vpmaxsd H ; vpmaxsd H vpbroadcastbBroadcast Byte Integervpbroadcastb I vpbroadcastb I vpbroadcastb I vpbroadcastb I vpbroadcastb I vpbroadcastb I vpbroadcastb I # vpbroadcastb I # vpbroadcastb I # vpbroadcastb I vpbroadcastb ! vpbroadcastb I vpbroadcastb ! # vpbroadcastb I # vpbroadcastb I vpbroadcastb ! vpbroadcastb I vpbroadcastb ! # vpbroadcastb I # vpbroadcastb I vpbroadcastb I vpbroadcastb I # kxord Bitwise Logical XOR 32-bit Maskskxord I pauseSpin Loop HintpausePAUSE subpd6Subtract Packed Double-Precision Floating-Point ValuessubpdSUBPD subpdSUBPD / vfnmsubsdQFused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values vfnmsubsd $ vfnmsubsd $ + vfnmsubsd $ + vmaxsd;Return Maximum Scalar Double-Precision Floating-Point Valuevmaxsd H vmaxsd H + vmaxsd vmaxsd H vmaxsd + vmaxsd H + vmaxsd H R vmaxsd H R vpmovsdwKDown Convert Packed Doubleword Values to Word Values with Signed Saturationvpmovsdw H vpmovsdw H, vpmovsdw H vpmovsdw H0 vpmovsdw H vpmovsdw H3 vpmovsdw H vpmovsdw H vpmovsdw H vpmovsdw H+ vpmovsdw H/ vpmovsdw H2 vfnmadd213ssLFused Negative Multiply-Add of Scalar Single-Precision Floating-Point Valuesvfnmadd213ss H vfnmadd213ss H ' vfnmadd213ss # vfnmadd213ss H vfnmadd213ss # ' vfnmadd213ss H ' vfnmadd213ss H Q vfnmadd213ss H Q vcmppd5Compare Packed Double-Precision Floating-Point Valuesvcmppd H = vcmppd H = vcmppd H vcmppd H vcmppd H ? vcmppd H ? vcmppd H vcmppd H vcmppd H A vcmppd H A vcmppd H vcmppd H vcmppd vcmppd / vcmppd vcmppd 2 vcmppd H R vcmppd H R vscalefph[Scale Packed Half-Precision Floating-Point Values With Half-Precision Floating-Point Values vscalefph K < vscalefph K vscalefph K > vscalefph K vscalefph R @ vscalefph R vscalefph K < vscalefph K vscalefph K > vscalefph K vscalefph R @ vscalefph R vscalefph R Q vscalefph R Q
vshufi64x2.Shuffle 128-Bit Packed Quadword Integer Values
vshufi64x2 H ?
vshufi64x2 H
vshufi64x2 H A
vshufi64x2 H
vshufi64x2 H ?
vshufi64x2 H
vshufi64x2 H A
vshufi64x2 H roundsd3Round Scalar Double Precision Floating-Point Valuesroundsd roundsd + vpbroadcastqBroadcast Quadword Integervpbroadcastq H vpbroadcastq H vpbroadcastq H vpbroadcastq H + vpbroadcastq H + vpbroadcastq H + vpbroadcastq ! vpbroadcastq H vpbroadcastq ! + vpbroadcastq H + vpbroadcastq ! vpbroadcastq H vpbroadcastq ! + vpbroadcastq H + vpbroadcastq H vpbroadcastq H + vfnmsubpdQFused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values vfnmsubpd $ vfnmsubpd $ / vfnmsubpd $ / vfnmsubpd $ vfnmsubpd $ 2 vfnmsubpd $ 2 vphadduwd1Packed Horizontal Add Unsigned Word to Doubleword vphadduwd " vphadduwd " / psubsb;Subtract Packed Signed Byte Integers with Signed SaturationpsubsbPSUBSB psubsbPSUBSB + psubsbPSUBSB psubsbPSUBSB / cvtpi2pdBConvert Packed Dword Integers to Packed Double-Precision FP Valuescvtpi2pdCVTPL2PD cvtpi2pdCVTPL2PD + pmovsxwqBMove Packed Word Integers to Quadword Integers with Sign Extensionpmovsxwq pmovsxwq ' punpckldq:Unpack and Interleave Low-Order Doublewords into Quadwords punpckldq punpckldq ' punpckldq punpckldq / adcx9Unsigned Integer Addition of Two Operands with Carry Flagadcxl 7 adcxl 7 ' pmulld?Multiply Packed Signed Doubleword Integers and Store Low Resultpmulld pmulld / vinsertf128#Insert Packed Floating-Point Valuesvinsertf128 vinsertf128 / vpackssdw2Pack Doublewords into Words with Signed Saturation vpackssdw I 9 vpackssdw I vpackssdw I : vpackssdw I vpackssdw I ; vpackssdw I vpackssdw I 9 vpackssdw vpackssdw I vpackssdw / vpackssdw I : vpackssdw ! vpackssdw I vpackssdw ! 2 vpackssdw I ; vpackssdw I vpermt2q=Full Permute of Quadwords From Two Tables Overwriting a Tablevpermt2q H = vpermt2q H vpermt2q H ? vpermt2q H vpermt2q H A vpermt2q H vpermt2q H = vpermt2q H vpermt2q H ? vpermt2q H vpermt2q H A vpermt2q H btrBit Test and ResetbtrwBTRW btrwBTRW btrlBTRL btrlBTRL btrwBTRW $ btrwBTRW $ btrlBTRL ' btrlBTRL ' vrcp28pstApproximation to the Reciprocal of Packed Single-Precision Floating-Point Values with Less Than 2^-28 Relative Errorvrcp28ps M ; vrcp28ps M vrcp28ps M ; vrcp28ps M vrcp28ps M R vrcp28ps M R pcmpistri4Packed Compare Implicit Length Strings, Return Index pcmpistri pcmpistri /
vcvtudq2phZConvert Packed Unsigned Doubleword Integers to Packed Half-Precision Floating-Point Valuesvcvtudq2phx K 9 vcvtudq2phy K :
vcvtudq2ph R ; vcvtudq2phx K vcvtudq2phy K
vcvtudq2ph R vcvtudq2phx K 9 vcvtudq2phy K : vcvtudq2phx K vcvtudq2phy K
vcvtudq2ph R ;
vcvtudq2ph R
vcvtudq2ph R Q
vcvtudq2ph R Q vpmovm2w4Expand Bits of Mask Register to Packed Word Integersvpmovm2w I vpmovm2w I vpmovm2w I packusdw4Pack Doublewords into Words with Unsigned Saturationpackusdw packusdw / aamASCII Adjust AX After MultiplyaamAAM aamAAM vphaddd(Packed Horizontal Add Doubleword Integervphaddd vphaddd / vphaddd ! vphaddd ! 2 movntssKStore Scalar Single-Precision Floating-Point Values Using Non-Temporal Hintmovntss ' vpmacssddQPacked Multiply Accumulate with Saturation Signed Doubleword to Signed Doubleword vpmacssdd " vpmacssdd " / pcmpestri4Packed Compare Explicit Length Strings, Return Index
pcmpestril
pcmpestril / vunpckhpdHUnpack and Interleave High Packed Double-Precision Floating-Point Values vunpckhpd H = vunpckhpd H vunpckhpd H ? vunpckhpd H vunpckhpd H A vunpckhpd H vunpckhpd H = vunpckhpd vunpckhpd H vunpckhpd / vunpckhpd H ? vunpckhpd vunpckhpd H vunpckhpd 2 vunpckhpd H A vunpckhpd H pmullw9Multiply Packed Signed Word Integers and Store Low Resultpmullw pmullw + pmullw pmullw / rdpru$Read Processor Register in User moderdpru . vfnmadd231phJFused Negative Multiply-Add of Packed Half-Precision Floating-Point Valuesvfnmadd231ph K < vfnmadd231ph K vfnmadd231ph K > vfnmadd231ph K vfnmadd231ph R @ vfnmadd231ph R vfnmadd231ph K < vfnmadd231ph K vfnmadd231ph K > vfnmadd231ph K vfnmadd231ph R @ vfnmadd231ph R vfnmadd231ph R Q vfnmadd231ph R Q vpblendmq.Blend Quadword Vectors Using an OpMask Control vpblendmq H = vpblendmq H vpblendmq H ? vpblendmq H vpblendmq H A vpblendmq H vpblendmq H = vpblendmq H vpblendmq H ? vpblendmq H vpblendmq H A vpblendmq H vpdpwssdsVPacked Dot Product of Signed-by-Signed Word subvectors into Doubleword with Saturation vpdpwssds K 9 vpdpwssds K vpdpwssds K : vpdpwssds K vpdpwssds V ; vpdpwssds V vpdpwssds K 9 vpdpwssds W vpdpwssds K vpdpwssds W / vpdpwssds K : vpdpwssds W vpdpwssds K vpdpwssds W 2 vpdpwssds V ; vpdpwssds V
vpunpckldq:Unpack and Interleave Low-Order Doublewords into Quadwords
vpunpckldq H 9
vpunpckldq H
vpunpckldq H :
vpunpckldq H
vpunpckldq H ;
vpunpckldq H
vpunpckldq H 9
vpunpckldq
vpunpckldq H
vpunpckldq /
vpunpckldq H :
vpunpckldq !
vpunpckldq H
vpunpckldq ! 2
vpunpckldq H ;
vpunpckldq H vscalefsd_Scale Scalar Double-Precision Floating-Point Value With a Double-Precision Floating-Point Value vscalefsd H vscalefsd H + vscalefsd H vscalefsd H + vscalefsd H Q vscalefsd H Q
vmaskmovps>Conditional Move Packed Single-Precision Floating-Point Values
vmaskmovps /
vmaskmovps 2
vmaskmovps /
vmaskmovps 2 vpmultishiftqb3Select Packed Unaligned Bytes from Quadword Sourcesvpmultishiftqb K = vpmultishiftqb K vpmultishiftqb K ? vpmultishiftqb K vpmultishiftqb T A vpmultishiftqb T vpmultishiftqb K = vpmultishiftqb K vpmultishiftqb K ? vpmultishiftqb K vpmultishiftqb T A vpmultishiftqb T vfnmadd231pdLFused Negative Multiply-Add of Packed Double-Precision Floating-Point Valuesvfnmadd231pd H = vfnmadd231pd H vfnmadd231pd H ? vfnmadd231pd H vfnmadd231pd H A vfnmadd231pd H vfnmadd231pd H = vfnmadd231pd # vfnmadd231pd H vfnmadd231pd # / vfnmadd231pd H ? vfnmadd231pd # vfnmadd231pd H vfnmadd231pd # 2 vfnmadd231pd H A vfnmadd231pd H vfnmadd231pd H Q vfnmadd231pd H Q setpo Set byte if parity odd (PF == 0)setpoSETPC setpoSETPC # vpextrdExtract Doublewordvpextrd vpextrd J vpextrd ' vpextrd J' vpermqPermute Quadword Integersvpermq H ? vpermq H A vpermq H ? vpermq H vpermq H vpermq H A vpermq H vpermq H vpermq H ? vpermq H ? vpermq ! vpermq H vpermq H vpermq ! 2 vpermq H A vpermq H A vpermq H vpermq H vdivph2Divide Packed Half-Precision Floating-Point Valuesvdivph K < vdivph K vdivph K > vdivph K vdivph R @ vdivph R vdivph K < vdivph K vdivph K > vdivph K vdivph R @ vdivph R vdivph R Q vdivph R Q vfnmsub231ssQFused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Valuesvfnmsub231ss H vfnmsub231ss H ' vfnmsub231ss # vfnmsub231ss H vfnmsub231ss # ' vfnmsub231ss H ' vfnmsub231ss H Q vfnmsub231ss H Q vpextrwExtract Wordvpextrw vpextrw I vpextrw $ vpextrw I$ vfmsub213ssHFused Multiply-Subtract of Scalar Single-Precision Floating-Point Valuesvfmsub213ss H vfmsub213ss H ' vfmsub213ss # vfmsub213ss H vfmsub213ss # ' vfmsub213ss H ' vfmsub213ss H Q vfmsub213ss H Q
vpgatherqd=Gather Packed Doubleword Values Using Signed Quadword Indices
vpgatherqd H D
vpgatherqd H H
vpgatherqd H L
vpgatherqd ! D
vpgatherqd ! H vfmulcphKFused Fused Multiply of Complex Packed Half-Precision Floating-Point Valuesvfmulcph K 9 vfmulcph K vfmulcph K : vfmulcph K vfmulcph R ; vfmulcph R vfmulcph K 9 vfmulcph K vfmulcph K : vfmulcph K vfmulcph R ; vfmulcph R vfmulcph R Q vfmulcph R Q pandnPacked Bitwise Logical AND NOTpandn pandn + pandn pandn / vldmxcsrLoad MXCSR Registervldmxcsr ' movntq)Store of Quadword Using Non-Temporal Hintmovntq
+ kandq Bitwise Logical AND 64-bit Maskskandq I pmovzxdqHMove Packed Doubleword Integers to Quadword Integers with Zero Extensionpmovzxdq pmovzxdq + vpinsrwInsert Wordvpinsrw vpinsrw I vpinsrw $ vpinsrw I $ vpmacswd;Packed Multiply Accumulate Signed Word to Signed Doublewordvpmacswd " vpmacswd " / ktestb"Bit Test 8-bit Masks and Set Flagsktestb J btsBit Test and SetbtswBTSW btswBTSW btslBTSL btslBTSL btswBTSW $ btswBTSW $ btslBTSL ' btslBTSL ' vpermi2b;Full Permute of Bytes From Two Tables Overwriting the Indexvpermi2b T vpermi2b T / vpermi2b T vpermi2b T 2 vpermi2b T vpermi2b T 5 vpermi2b T vpermi2b T / vpermi2b T vpermi2b T 2 vpermi2b T vpermi2b T 5 vscalefss_Scale Scalar Single-Precision Floating-Point Value With a Single-Precision Floating-Point Value vscalefss H vscalefss H ' vscalefss H vscalefss H ' vscalefss H Q vscalefss H Q vdivps4Divide Packed Single-Precision Floating-Point Valuesvdivps H 9 vdivps H vdivps H : vdivps H vdivps H ; vdivps H vdivps H 9 vdivps vdivps H vdivps / vdivps H : vdivps vdivps H vdivps 2 vdivps H ; vdivps H vdivps H Q vdivps H Q pshuflwShuffle Packed Low WordspshuflwPSHUFLW pshuflwPSHUFLW / movdqaMove Aligned Double QuadwordmovdqaMOVO movdqaMOVO / movdqaMOVO /
vpcmpestri4Packed Compare Explicit Length Strings, Return Indexvpcmpestril vpcmpestril / addAddaddbADDB addbADDB addbADDB addbADDB # addwADDW addwADDW addwADDW addwADDW addwADDW $ addlADDL addlADDL addlADDL addlADDL addlADDL ' addbADDB # addbADDB # addwADDW $ addwADDW $ addwADDW $ addlADDL ' addlADDL ' addlADDL ' vfnmsub213ssQFused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Valuesvfnmsub213ss H vfnmsub213ss H ' vfnmsub213ss # vfnmsub213ss H vfnmsub213ss # ' vfnmsub213ss H ' vfnmsub213ss H Q vfnmsub213ss H Q vfnmsub132psQFused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Valuesvfnmsub132ps H 9 vfnmsub132ps H vfnmsub132ps H : vfnmsub132ps H vfnmsub132ps H ; vfnmsub132ps H vfnmsub132ps H 9 vfnmsub132ps # vfnmsub132ps H vfnmsub132ps # / vfnmsub132ps H : vfnmsub132ps # vfnmsub132ps H vfnmsub132ps # 2 vfnmsub132ps H ; vfnmsub132ps H vfnmsub132ps H Q vfnmsub132ps H Q knotwNOT 16-bit Mask Registerknotw H vlddquLoad Unaligned Integer 128 Bitsvlddqu / vlddqu 2 vmovsldup'Move Packed Single-FP Low and Duplicate vmovsldup H vmovsldup H vmovsldup H vmovsldup H / vmovsldup H 2 vmovsldup H 5 vmovsldup vmovsldup H vmovsldup / vmovsldup H / vmovsldup vmovsldup H vmovsldup 2 vmovsldup H 2 vmovsldup H vmovsldup H 5 movhpd6Move High Packed Double-Precision Floating-Point ValuemovhpdMOVHPD + movhpdMOVHPD + vaddph/Add Packed Half-Precision Floating-Point Valuesvaddph K < vaddph K vaddph K > vaddph K vaddph R @ vaddph R vaddph K < vaddph K vaddph K > vaddph K vaddph R @ vaddph R vaddph R Q vaddph R Q cmovnbMove if not below (CF == 0)cmovnbw cmovnbw $ cmovnbl cmovnbl ' kordBitwise Logical OR 32-bit Maskskord I sha1rnds4%Perform Four Rounds of SHA1 Operation sha1rnds4 ( sha1rnds4 ( / vfnmsub132sdQFused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Valuesvfnmsub132sd H vfnmsub132sd H + vfnmsub132sd # vfnmsub132sd H vfnmsub132sd # + vfnmsub132sd H + vfnmsub132sd H Q vfnmsub132sd H Q aaddAtomically ADDaadd ' vdbpsadbw>Double Block Packed Sum-Absolute-Differences on Unsigned Bytes vdbpsadbw I vdbpsadbw I / vdbpsadbw I vdbpsadbw I 2 vdbpsadbw I vdbpsadbw I 5 vdbpsadbw I vdbpsadbw I / vdbpsadbw I vdbpsadbw I 2 vdbpsadbw I vdbpsadbw I 5 sha1msg1NPerform an Intermediate Calculation for the Next Four SHA1 Message Doublewordssha1msg1 ( sha1msg1 ( / vsubps6Subtract Packed Single-Precision Floating-Point Valuesvsubps H 9 vsubps H vsubps H : vsubps H vsubps H ; vsubps H vsubps H 9 vsubps vsubps H vsubps / vsubps H : vsubps vsubps H vsubps 2 vsubps H ; vsubps H vsubps H Q vsubps H Q andnpdHBitwise Logical AND NOT of Packed Double-Precision Floating-Point ValuesandnpdANDNPD andnpdANDNPD / pinsrdInsert DoublewordpinsrdPINSRD pinsrdPINSRD ' vfmaddpsCFused Multiply-Add of Packed Single-Precision Floating-Point Valuesvfmaddps $ vfmaddps $ / vfmaddps $ / vfmaddps $ vfmaddps $ 2 vfmaddps $ 2 vcvtph2wQConvert Packed Half-Precision Floating-Point Values to Packed Word Integer Valuesvcvtph2w K < vcvtph2w K > vcvtph2w R @ vcvtph2w K vcvtph2w K vcvtph2w R vcvtph2w K < vcvtph2w K vcvtph2w K > vcvtph2w K vcvtph2w R @ vcvtph2w R vcvtph2w R Q vcvtph2w R Q
vfpclasssd:Test Class of Scalar Double-Precision Floating-Point Value
vfpclasssd J
vfpclasssd J
vfpclasssd J +
vfpclasssd J +
vgatherpf1qpsmSparse Prefetch Packed Single-Precision Floating-Point Data Values with Signed Quadword Indices Using T1 Hint
vgatherpf1qps LM sha1msg2FPerform a Final Calculation for the Next Four SHA1 Message Doublewordssha1msg2 ( sha1msg2 ( / andLogical ANDandbANDB andbANDB andbANDB andbANDB # andwANDW andwANDW andwANDW andwANDW andwANDW $ andlANDL andlANDL andlANDL andlANDL andlANDL ' andbANDB # andbANDB # andwANDW $ andwANDW $ andwANDW $ andlANDL ' andlANDL ' andlANDL ' jncJump if not carry (CF == 0)jncJCC N jncJCC O pxor#Packed Bitwise Logical Exclusive ORpxorPXOR pxorPXOR + pxorPXOR pxorPXOR / unpcklpdGUnpack and Interleave Low Packed Double-Precision Floating-Point ValuesunpcklpdUNPCKLPD unpcklpdUNPCKLPD / paddsw6Add Packed Signed Word Integers with Signed Saturationpaddsw paddsw + paddsw paddsw / mcommit
Memory COMMITmcommit >