Enum ElementKind 

pub enum ElementKind {
    F16,
    Bf16,
    F32,
    F32Strict,
    F64,
    S8,
    U8,
    I32,
    I64,
    Bool,
    Fp8E4M3,
    Fp8E5M2,
    S4,
    U4,
    Bin,
    Complex32,
    Complex64,
}

Runtime tag for an Element or IntElement.

Unified across the float and integer kernel families so that a single kernel-SKU descriptor can describe any baracuda kernel.
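As a rough illustration of the single-descriptor idea, the sketch below uses a trimmed local stand-in for ElementKind (four variants only) together with a hypothetical SkuDescriptor struct and is_integer_sku helper. None of these names or fields are taken from baracuda; they only show how one runtime tag can cover both kernel families.

```rust
// Local stand-in with a subset of the variants; the real enum lives in baracuda.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum ElementKind {
    F16,
    F32,
    S8,
    I32,
}

// Hypothetical descriptor: the same struct shape describes float and integer kernels.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SkuDescriptor {
    pub input: ElementKind,
    pub accumulator: ElementKind,
}

// Hypothetical helper: dispatch on the runtime tag instead of on a generic type parameter.
pub fn is_integer_sku(sku: SkuDescriptor) -> bool {
    matches!(sku.input, ElementKind::S8 | ElementKind::I32)
}
```

Because the tag is a plain Copy value, a descriptor like this can be stored, compared, and hashed without carrying any generic parameters.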

Variants

F16

IEEE 754 binary16.

Bf16

Brain-float 16.

F32

IEEE 754 binary32 inputs reduced through TF32 tensor cores (10-bit mantissa). Maps to the f32 Rust type.

F32Strict

IEEE 754 binary32 inputs reduced through SIMT CUDA cores at full f32 precision. Maps to the F32Strict wrapper type. Bit-stable on the same hardware.
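To make the precision gap between F32 and F32Strict concrete, the sketch below emulates a TF32 operand on the CPU by keeping only the top 10 of the 23 mantissa bits of an f32. The function name truncate_to_tf32 is hypothetical, and plain truncation is an assumption: the rounding actually applied by tensor-core hardware may differ.

```rust
/// Emulates a TF32 operand by clearing the low 13 of f32's 23 mantissa bits,
/// leaving a 10-bit mantissa. Sign and exponent bits are untouched.
/// Assumption: truncation toward zero; hardware rounding may differ.
pub fn truncate_to_tf32(x: f32) -> f32 {
    f32::from_bits(x.to_bits() & 0xFFFF_E000)
}
```

A value such as 1 + 2^-11 fits in f32 but collapses to 1.0 under this emulation, whereas 1 + 2^-10 survives. This is the kind of difference the F32Strict path avoids.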

F64

IEEE 754 binary64. Maps to the f64 Rust type.

S8

Signed 8-bit integer. Maps to the S8 wrapper type. Routed through Ampere int8 tensor cores (mma.sync m16n8k32 integer variant) with int32 accumulation; the float alpha / beta scalars let the kernel dequantize in its epilogue.
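The accumulate-then-scale behaviour can be sketched as a CPU reference for a single output element. The function name s8_dot_epilogue is hypothetical, and the epilogue form alpha * acc + beta * c is assumed from the usual GEMM convention rather than taken from the baracuda source.

```rust
/// CPU reference for one output element: i8 operands are multiplied and
/// summed in i32, then a float epilogue computes `alpha * acc + beta * c`.
/// Assumption: standard GEMM epilogue convention.
pub fn s8_dot_epilogue(a: &[i8], b: &[i8], alpha: f32, beta: f32, c: f32) -> f32 {
    // Widen each operand to i32 before multiplying so products cannot overflow i8.
    let acc: i32 = a.iter().zip(b).map(|(&x, &y)| x as i32 * y as i32).sum();
    alpha * acc as f32 + beta * c
}
```

Setting alpha to the product of the two quantization scales turns the integer accumulator back into a real-valued result, which is what dequantizing in the epilogue amounts to.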

U8

Unsigned 8-bit integer. Maps to the U8 wrapper type. Same kernel family as S8 with unsigned operands.

I32

Signed 32-bit integer. Maps to the i32 Rust type via the Element impl. Two roles:

  1. Accumulator marker for integer GEMM SKUs (reported by crate::PrecisionGuarantee::accumulator).
  2. Input element for elementwise integer arithmetic (bitwise / comparison / scan ops). The same plan shapes used for floating-point inputs extend to i32 via the Element impl.

I64

Signed 64-bit integer. Maps to the i64 Rust type via the Element impl. Used as an input element for the elementwise integer arithmetic family (bitwise / comparison / scan ops). This is also PyTorch’s default integer tensor dtype.

Bool

Boolean (1-byte storage). Maps to the Bool wrapper type via the Element impl. Used as the input element for the logical-op family (logical_and / logical_or / logical_xor) and as the output element for the comparison-op family (eq / ne / gt / ge / lt / le). Truthiness convention follows PyTorch: 0 = false, any non-zero byte = true.
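The truthiness convention can be sketched on raw bytes as follows. Both function names are hypothetical stand-ins rather than baracuda APIs, and writing a canonical 1 for true on output is an assumption.

```rust
/// Reads a stored Bool byte: 0 is false, any non-zero byte is true.
pub fn byte_is_true(b: u8) -> bool {
    b != 0
}

/// Logical XOR on stored bytes. The inputs are normalized first, so 2 and 255
/// both count as true. Assumption: results are written as 0 or 1.
pub fn logical_xor(a: u8, b: u8) -> u8 {
    (byte_is_true(a) != byte_is_true(b)) as u8
}
```

Normalizing before the operation matters: a bitwise XOR of 2 and 255 would be non-zero, while the logical XOR of two true values is false.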

Fp8E4M3

8-bit floating-point, E4M3 encoding (1 sign + 4 exponent + 3 mantissa, bias 7, max-finite 448, no infinities). Maps to the Fp8E4M3 wrapper type. Routed through Ada / Hopper FP8 tensor cores (mma.sync m16n8k32 FP8 variant) with F32 accumulation.

Fp8E5M2

8-bit floating-point, E5M2 encoding (1 sign + 5 exponent + 2 mantissa, bias 15, IEEE-754-compatible inf / NaN). Maps to the Fp8E5M2 wrapper type. Same FP8 tensor-core path as Fp8E4M3 with the alternate operand tag (.e5m2.e5m2.f32).
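The two FP8 layouts can be compared with a pair of software decoders. This is a minimal sketch: decode_e4m3 and decode_e5m2 are hypothetical names, and the E4M3 NaN pattern (all exponent and mantissa bits set) is assumed from the common "E4M3FN" layout, since the text above only states that the format has no infinities.

```rust
/// Decodes an E4M3 byte (1 sign + 4 exponent + 3 mantissa, bias 7, no infinities).
/// Assumption: S.1111.111 is the only NaN pattern.
pub fn decode_e4m3(b: u8) -> f32 {
    let sign = if b & 0x80 != 0 { -1.0f32 } else { 1.0 };
    let exp = ((b >> 3) & 0x0F) as i32;
    let man = (b & 0x07) as f32;
    if exp == 0x0F && man == 7.0 {
        return f32::NAN;
    }
    if exp == 0 {
        sign * (man / 8.0) * 2f32.powi(-6) // subnormal: no implicit leading 1
    } else {
        sign * (1.0 + man / 8.0) * 2f32.powi(exp - 7)
    }
}

/// Decodes an E5M2 byte (1 sign + 5 exponent + 2 mantissa, bias 15,
/// IEEE-style inf / NaN when the exponent field is all ones).
pub fn decode_e5m2(b: u8) -> f32 {
    let sign = if b & 0x80 != 0 { -1.0f32 } else { 1.0 };
    let exp = ((b >> 2) & 0x1F) as i32;
    let man = (b & 0x03) as f32;
    if exp == 0x1F {
        return if man == 0.0 { sign * f32::INFINITY } else { f32::NAN };
    }
    if exp == 0 {
        sign * (man / 4.0) * 2f32.powi(-14) // subnormal: no implicit leading 1
    } else {
        sign * (1.0 + man / 4.0) * 2f32.powi(exp - 15)
    }
}
```

Under these assumptions the byte 0x7E decodes to 448 in E4M3, matching the max-finite value quoted above, while 0x7C decodes to positive infinity in E5M2.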

S4

Signed 4-bit integer with packed-pair storage (two values per byte). Maps to the S4 wrapper type. Routed through Ada Lovelace int4 tensor cores (mma.sync.aligned.m16n8k64.row.col.satfinite.s32.s4.s4.s32) with int32 accumulation; the float alpha / beta scalars let the kernel dequantize in its epilogue (same convention as the int8 family).

U4

Unsigned 4-bit integer — packed-pair storage. Maps to the U4 wrapper type. Same kernel family as S4 with the alternate operand tag (.u4.u4.s32).
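Packed-pair storage can be sketched with two helpers that put a pair of signed 4-bit values into one byte and recover them with sign extension. The names pack_s4 and unpack_s4 are hypothetical, and the nibble order (first element in the low nibble) is an assumption, since the text above does not specify it.

```rust
/// Packs two signed 4-bit values (each in -8..=7) into one byte.
/// Assumption: the first element goes in the low nibble.
pub fn pack_s4(lo: i8, hi: i8) -> u8 {
    ((lo as u8) & 0x0F) | (((hi as u8) & 0x0F) << 4)
}

/// Unpacks a byte into two signed 4-bit values, sign-extending each nibble.
pub fn unpack_s4(b: u8) -> (i8, i8) {
    // Move the nibble to the top of the byte, then arithmetic-shift it back down.
    let lo = ((b << 4) as i8) >> 4;
    let hi = (b as i8) >> 4;
    (lo, hi)
}
```

For the unsigned U4 case the same packing applies, and unpacking reduces to masking with 0x0F and shifting right without sign extension.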

Bin

1-bit binary — packed-byte storage (8 bits per byte, LSB = lowest K index). Maps to the Bin wrapper type. Routed through Ampere+ binary tensor cores (mma.sync.aligned.m16n8k256.row.col.s32.b1.b1.s32.xor.popc). Distinct programming model: the output is the raw popcount accumulator (s32), not a re-quantized b1.
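The packing rule and the xor.popc accumulation can be sketched on the CPU as follows. Both function names are hypothetical. The bit order within a byte follows the description above (LSB = lowest K index); placing K indices 0..8 in byte 0, 8..16 in byte 1, and so on is an assumption.

```rust
/// Packs a slice of bits into bytes, with the lowest K index in the LSB.
/// Assumption: consecutive groups of 8 K indices go to consecutive bytes.
pub fn pack_bits(bits: &[bool]) -> Vec<u8> {
    let mut out = vec![0u8; (bits.len() + 7) / 8];
    for (k, &bit) in bits.iter().enumerate() {
        if bit {
            out[k / 8] |= 1u8 << (k % 8);
        }
    }
    out
}

/// Reference for the xor.popc reduction: counts the positions where the two
/// packed operands differ and returns the raw count as an s32-style accumulator.
pub fn xor_popc_dot(a: &[u8], b: &[u8]) -> i32 {
    a.iter().zip(b).map(|(&x, &y)| (x ^ y).count_ones() as i32).sum()
}
```

The result is a plain integer count rather than a re-quantized bit, which matches the note above that the output is the raw popcount accumulator.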

Complex32

Single-precision complex — interleaved real/imag pair of f32 (#[repr(C)]). Maps to the Complex32 wrapper type. Used by the FFT op family (Milestone 6.4) for spectrum-domain tensors. ABI-compatible with cuFFT’s cufftComplex, NumPy’s complex64, and PyTorch’s torch.complex64.

Complex64

Double-precision complex — interleaved real/imag pair of f64. Maps to the Complex64 wrapper type. ABI-compatible with cuFFT’s cufftDoubleComplex, NumPy’s complex128, and PyTorch’s torch.complex128.
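The interleaved layout can be sketched with a local #[repr(C)] stand-in for the single-precision case. The struct below only mirrors the description above: the field names re and im and the helper from_interleaved are hypothetical and may not match the baracuda wrapper type.

```rust
/// Local stand-in for an interleaved single-precision complex value.
/// With #[repr(C)] the real part is stored first, followed by the imaginary part.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex32 {
    pub re: f32,
    pub im: f32,
}

/// Builds complex values from a flat [re0, im0, re1, im1, ...] buffer.
/// A trailing unpaired element, if any, is ignored.
pub fn from_interleaved(data: &[f32]) -> Vec<Complex32> {
    data.chunks_exact(2)
        .map(|pair| Complex32 { re: pair[0], im: pair[1] })
        .collect()
}
```

A struct laid out this way occupies 8 bytes with 4-byte alignment, the same footprint as a pair of adjacent f32 values.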

Trait Implementations

impl Clone for ElementKind

fn clone(&self) -> ElementKind

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since 1.0.0 (const: unstable).

impl Copy for ElementKind

impl Debug for ElementKind

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Eq for ElementKind

impl Hash for ElementKind

fn hash<__H>(&self, state: &mut __H)
where __H: Hasher,

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher. Stable since 1.3.0.

impl PartialEq for ElementKind

fn eq(&self, other: &ElementKind) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason. Stable since 1.0.0 (const: unstable).

impl StructuralPartialEq for ElementKind
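Since the enum is Copy, Eq, and Hash, its values can serve directly as hash-map keys. The sketch below shows this with a trimmed local stand-in for the enum (three variants only); count_by_kind is a hypothetical helper, not a baracuda function.

```rust
use std::collections::HashMap;

// Local stand-in with the same derives the real enum is documented to implement.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum ElementKind {
    F32,
    S8,
    Bin,
}

/// Counts how many times each kind appears, using the tag itself as the map key.
pub fn count_by_kind(kinds: &[ElementKind]) -> HashMap<ElementKind, usize> {
    let mut counts = HashMap::new();
    for &kind in kinds {
        *counts.entry(kind).or_insert(0) += 1;
    }
    counts
}
```

The same property makes the tag usable in lookup tables, for example to map an element kind to a registered kernel.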

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.