#[repr(i32)]
pub enum Algorithm {
AlgUnset = 0,
AlgDotAnyF8AnyF8F32 = 1,
AlgDotAnyF8AnyF8F32FastAccum = 2,
AlgDotF16F16F16 = 3,
AlgDotF16F16F32 = 4,
AlgDotBf16Bf16Bf16 = 5,
AlgDotBf16Bf16F32 = 6,
AlgDotBf16Bf16F32X3 = 7,
AlgDotBf16Bf16F32X6 = 8,
AlgDotTf32Tf32F32 = 9,
AlgDotTf32Tf32F32X3 = 10,
AlgDotF32F32F32 = 11,
AlgDotF64F64F64 = 12,
}
The algorithm used to evaluate the instruction.
The naming convention for the dot instruction is ALG_DOT_{A_TYPE}{B_TYPE}{ACCUM_TYPE}[_X{NUM_OPS}] where A_TYPE, B_TYPE and ACCUM_TYPE correspond to the types in the “primitive dot operations” (such as TensorCore operations) and NUM_OPS is the number of such operations used per “primitive tile”. When the NUM_OPS field is skipped, it is assumed to be 1. The types mentioned in the name are independent of the storage types.
In general, A_TYPE and B_TYPE are the precisions that the LHS and RHS of the operation are rounded to, and ACCUM_TYPE is the accumulation type. If a backend does not support the given algorithm, an error is raised. The Algorithm enum is intended to eventually replace the Precision enum.
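The optional `_X{NUM_OPS}` suffix described above can be read back from a name with a few lines of string handling. The sketch below is illustrative only: `num_ops` is a hypothetical helper that is not part of this crate, and it assumes the ProtoBuf names follow the stated convention exactly (for example `ALG_DOT_BF16_BF16_F32_X3`).

```rust
/// Returns NUM_OPS for a ProtoBuf-style algorithm name.
/// Hypothetical helper for illustration; not part of the generated API.
fn num_ops(name: &str) -> u32 {
    match name.rsplit_once("_X") {
        // A trailing `_X<digits>` carries the operation count.
        // Anything that does not parse as a number falls back to 1.
        Some((_, digits)) => digits.parse().unwrap_or(1),
        // No suffix: the convention says NUM_OPS is assumed to be 1.
        None => 1,
    }
}
```

Under these assumptions, `num_ops("ALG_DOT_BF16_BF16_F32_X6")` yields 6, while a name without the suffix, such as `ALG_DOT_TF32_TF32_F32`, yields 1.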
Variants§
AlgUnset = 0
If the algorithm is ALG_UNSET, the algorithm is chosen based on the operand_precision values (for now).
AlgDotAnyF8AnyF8F32 = 1
The storage type can be any 8-bit floating point type.
AlgDotAnyF8AnyF8F32FastAccum = 2
The storage type can be any 8-bit floating point type. Intermediate results will not periodically be promoted to a higher precision. This corresponds to CUBLASLT_MATMUL_DESC_FAST_ACCUM. Triton’s maxNumImpreciseAcc=32 setting may be similar.
AlgDotF16F16F16 = 3
AlgDotF16F16F32 = 4
AlgDotBf16Bf16Bf16 = 5
AlgDotBf16Bf16F32 = 6
AlgDotBf16Bf16F32X3 = 7
An algorithm which uses 3 BF16_BF16_F32 matmuls to achieve better precision.
AlgDotBf16Bf16F32X6 = 8
An algorithm which uses 6 BF16_BF16_F32 matmuls to achieve better precision (similar to F32).
AlgDotTf32Tf32F32 = 9
AlgDotTf32Tf32F32X3 = 10
An algorithm which uses 3 TF32_TF32_F32 matmuls to achieve better precision (similar to F32).
AlgDotF32F32F32 = 11
AlgDotF64F64F64 = 12
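To give some intuition for why the X3 variants can improve precision, the sketch below uses scalars instead of matrices. It rests on assumptions not stated in this page: each f32 input is split into a high and a low BF16 part, and the three products hi*hi, hi*lo and lo*hi are accumulated in f32. BF16 is emulated by truncating the low 16 bits of an f32, whereas real hardware typically rounds. The helper names are hypothetical, and a backend's actual decomposition may differ.

```rust
/// Emulates BF16 by zeroing the low 16 bits of an f32 (truncation,
/// not round-to-nearest; an assumption made to keep the sketch short).
fn to_bf16(x: f32) -> f32 {
    f32::from_bits(x.to_bits() & 0xFFFF_0000)
}

/// A single BF16 x BF16 product accumulated in f32.
fn mul_x1(a: f32, b: f32) -> f32 {
    to_bf16(a) * to_bf16(b)
}

/// Three BF16 x BF16 products accumulated in f32.
/// Each input is split as x = hi + lo, and the lo*lo term is dropped.
fn mul_x3(a: f32, b: f32) -> f32 {
    let (a_hi, b_hi) = (to_bf16(a), to_bf16(b));
    let (a_lo, b_lo) = (to_bf16(a - a_hi), to_bf16(b - b_hi));
    // Sum the small cross terms first, then add the dominant term.
    (a_hi * b_lo + a_lo * b_hi) + a_hi * b_hi
}
```

For inputs that are not exactly representable in BF16, `mul_x3` should land noticeably closer to the full-precision product than `mul_x1`, at the cost of three multiplications instead of one.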
Implementations§
impl Algorithm
pub fn as_str_name(&self) -> &'static str
String value of the enum field names used in the ProtoBuf definition.
The values are not transformed in any way and thus are considered stable (if the ProtoBuf definition does not change) and safe for programmatic use.
pub fn from_str_name(value: &str) -> Option<Self>
Creates an enum from field names used in the ProtoBuf definition.
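A round trip through the two name functions might look like the sketch below. To stay self-contained it declares a trimmed, hypothetical stand-in for the enum with only three variants, and it assumes the ProtoBuf field names follow the `ALG_...` convention described above. `parse_or_unset` is an illustrative helper, not part of this crate.

```rust
// Trimmed stand-in so the sketch compiles on its own.
// The real generated enum has 13 variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(i32)]
enum Algorithm {
    AlgUnset = 0,
    AlgDotBf16Bf16F32X3 = 7,
    AlgDotF32F32F32 = 11,
}

impl Algorithm {
    /// Maps a variant to its (assumed) ProtoBuf field name.
    fn as_str_name(&self) -> &'static str {
        match self {
            Algorithm::AlgUnset => "ALG_UNSET",
            Algorithm::AlgDotBf16Bf16F32X3 => "ALG_DOT_BF16_BF16_F32_X3",
            Algorithm::AlgDotF32F32F32 => "ALG_DOT_F32_F32_F32",
        }
    }

    /// Maps an (assumed) ProtoBuf field name back to a variant.
    fn from_str_name(value: &str) -> Option<Self> {
        match value {
            "ALG_UNSET" => Some(Self::AlgUnset),
            "ALG_DOT_BF16_BF16_F32_X3" => Some(Self::AlgDotBf16Bf16F32X3),
            "ALG_DOT_F32_F32_F32" => Some(Self::AlgDotF32F32F32),
            _ => None,
        }
    }
}

/// Hypothetical helper: resolve a name read from configuration,
/// falling back to `AlgUnset` when the name is not recognized.
fn parse_or_unset(name: &str) -> Algorithm {
    Algorithm::from_str_name(name).unwrap_or(Algorithm::AlgUnset)
}
```

Because `from_str_name` returns an `Option`, unknown or misspelled names surface as `None` rather than a panic, so the caller decides whether to fall back, as `parse_or_unset` does, or to report an error.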
Trait Implementations§
impl Ord for Algorithm
impl PartialOrd for Algorithm
impl Copy for Algorithm
impl Eq for Algorithm
impl StructuralPartialEq for Algorithm
Auto Trait Implementations§
impl Freeze for Algorithm
impl RefUnwindSafe for Algorithm
impl Send for Algorithm
impl Sync for Algorithm
impl Unpin for Algorithm
impl UnwindSafe for Algorithm
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API. (clone_to_uninit)