#[repr(transparent)]
pub struct Bin(pub u8);
1-bit binary element marker — packed-byte storage.
Bin is a #[repr(transparent)] wrapper around u8. One Bin storage slot is one
byte and holds eight packed b1 elements: bit i of the byte
(LSB = bit 0) is the element at K offset 8 * byte_idx + i. Packing
is along the K axis for A/B operands.
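The bit layout above can be sketched on the host as follows. This is a minimal illustration, not the crate's code: the `Bin` struct here is a local stand-in that mirrors the documented definition, and `pack_row` and `get_bit` are hypothetical helper names.

```rust
/// Local stand-in mirroring the documented layout (not the crate's own type).
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Bin(pub u8);

/// Hypothetical helper: pack one row of K one-bit elements into K / 8 slots.
/// Element `8 * byte_idx + i` lands in bit `i` (LSB = bit 0) of slot `byte_idx`.
pub fn pack_row(bits: &[bool]) -> Vec<Bin> {
    assert!(bits.len() % 8 == 0, "K must be divisible by 8");
    bits.chunks_exact(8)
        .map(|chunk| {
            let mut byte = 0u8;
            for (i, &bit) in chunk.iter().enumerate() {
                byte |= (bit as u8) << i;
            }
            Bin(byte)
        })
        .collect()
}

/// Hypothetical helper: read the element at K offset `k` from a packed row.
pub fn get_bit(row: &[Bin], k: usize) -> bool {
    (row[k / 8].0 >> (k % 8)) & 1 == 1
}
```

For example, a 16-element row with elements 0, 9 and 15 set packs to the two slots `Bin(0b0000_0001)` and `Bin(0b1000_0010)`.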
A DeviceBuffer<u8> of (M*K)/8 bytes can be reinterpreted as a
DeviceBuffer<Bin> of (M*K)/8 storage slots via view_as without
copying.
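The reason a zero-copy view is possible is the `#[repr(transparent)]` guarantee: `Bin` has exactly the size, alignment, and valid bit patterns of `u8`. The sketch below shows the host-side analogue on plain slices. It does not use `DeviceBuffer` or `view_as`, whose signatures are not shown on this page; `bytes_as_bins` is a hypothetical name and `Bin` is a local stand-in.

```rust
/// Local stand-in mirroring the documented layout (not the crate's own type).
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Bin(pub u8);

/// Hypothetical host-side analogue of a zero-copy reinterpretation:
/// view a byte slice as a slice of `Bin` storage slots of the same length.
pub fn bytes_as_bins(bytes: &[u8]) -> &[Bin] {
    // SAFETY: `Bin` is `#[repr(transparent)]` over `u8`, so both types share
    // size and alignment, and every `u8` bit pattern is a valid `Bin`.
    unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<Bin>(), bytes.len()) }
}
```

The returned slice borrows the same memory as the input, so no bytes are moved or duplicated.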
Bin GEMM routes through the binary tensor cores on Ampere and newer GPUs
(mma.sync.aligned.m16n8k256.row.col.s32.b1.b1.s32.xor.popc) with
an S32 output accumulator. Unlike the int4 / int8 / FP8
families, bin GEMM does not quantize its output back to the
input element type — the result is the raw popcount accumulator
(popcount(xor(A_row, B_col)) summed over the K / 8 packed bytes), surfaced as
i32. No α / β / bias / activation chain (the popcount programming
model doesn’t have a meaningful place for them).
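A scalar CPU reference makes the output semantics concrete. This is a sketch under stated assumptions, not the crate's implementation: `bin_gemm_ref` is a hypothetical name, A is taken as row-major (M rows of K / 8 bytes), and B is taken as column-major (N columns of K / 8 bytes each), matching the `row.col` operand layout in the instruction name. Operands are shown as raw storage bytes.

```rust
/// Hypothetical CPU reference for the xor/popcount GEMM described above.
/// `a`: M rows of K / 8 packed bytes (row-major).
/// `b`: N columns of K / 8 packed bytes (column-major, an assumption).
/// Returns an M x N row-major matrix of raw i32 popcount accumulators.
pub fn bin_gemm_ref(a: &[u8], b: &[u8], m: usize, n: usize, k: usize) -> Vec<i32> {
    assert!(k % 8 == 0, "K must be divisible by 8");
    let kb = k / 8; // bytes per row of A / per column of B
    assert_eq!(a.len(), m * kb);
    assert_eq!(b.len(), n * kb);

    let mut c = vec![0i32; m * n];
    for i in 0..m {
        let a_row = &a[i * kb..(i + 1) * kb];
        for j in 0..n {
            let b_col = &b[j * kb..(j + 1) * kb];
            // Count differing bits byte by byte and accumulate in i32.
            c[i * n + j] = a_row
                .iter()
                .zip(b_col)
                .map(|(x, y)| (x ^ y).count_ones() as i32)
                .sum::<i32>();
        }
    }
    c
}
```

Each output entry is the Hamming distance between a row of A and a column of B, so it lies in the range 0 to K and is not rescaled or converted back to a 1-bit type.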
The plan layer (Bin is consumed by BinGemmPlan in
baracuda-kernels) takes M, N, K in element counts and
leading dimensions in storage-slot (= byte) counts —
MatrixRef<Bin>::ld therefore equals K / 8 for row-major A with
no padding. K must be divisible by 8 (packing is byte-aligned).
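The unit conversion between element counts and storage-slot counts can be sketched as below. The helper names `row_major_ld` and `storage_slots` are hypothetical and are not part of BinGemmPlan or MatrixRef; they only restate the arithmetic from the paragraph above.

```rust
/// Hypothetical helper: leading dimension, in storage slots (bytes), of an
/// unpadded row-major operand with K one-bit elements per row.
/// Returns `None` when K is not byte-aligned.
pub fn row_major_ld(k: usize) -> Option<usize> {
    if k % 8 == 0 {
        Some(k / 8)
    } else {
        None
    }
}

/// Hypothetical helper: total storage slots for an unpadded `rows` x K
/// operand, i.e. (rows * K) / 8 bytes.
pub fn storage_slots(rows: usize, k: usize) -> Option<usize> {
    row_major_ld(k).map(|ld| rows * ld)
}
```

With K = 256 the leading dimension is 32 slots, and a 16 x 256 operand occupies 512 bytes; K = 12 is rejected because it is not a multiple of 8.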
Tuple Fields
0: u8
Implementations
Trait Implementations§
impl BinElement for Bin
impl Copy for Bin
impl DeviceRepr for Bin
impl Eq for Bin
impl KernelDtype for Bin
const KIND: ElementKind = ElementKind::Bin
crate::KernelSku::element.