Module bfloat16

BFloat16 (bf16) floating-point support

Implements the Google Brain bfloat16 format used extensively in ML training. BF16 keeps the 8-bit exponent of f32, so it covers the same dynamic range, but shortens the stored mantissa from 23 bits to 7. This suits training workloads, where range matters more than precision.

Layout: 1 sign bit, 8 exponent bits, 7 mantissa bits (the same as the upper 16 bits of an f32). Range: same as f32 (about ±3.4×10³⁸). Precision: about 2 to 3 significant decimal digits.
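
Because a bf16 value has the same layout as the upper half of an f32, conversion can be done with plain bit operations. The following is a minimal sketch of that idea, not this module's implementation: the helper names `f32_to_bf16_bits` and `bf16_bits_to_f32` are hypothetical, values are held as raw `u16` bits rather than in the `BFloat16` struct, and round-to-nearest-even is an assumed rounding mode (the source does not say which one the module uses).

```rust
/// Hypothetical helper (not part of this module): narrow an f32 to raw
/// bf16 bits, rounding to nearest with ties to even (assumed rounding mode).
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Keep the sign and force a quiet-NaN mantissa bit so that dropping
        // the low 16 bits can never turn a NaN into an infinity.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    // Exact halves therefore round toward an even kept mantissa.
    let round = 0x7FFF + ((bits >> 16) & 1);
    ((bits + round) >> 16) as u16
}

/// Hypothetical helper: widen raw bf16 bits back to f32. This direction is
/// exact, since every bf16 value is also an f32 value.
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

Under these assumptions, `1.0` maps to the bit pattern `0x3F80`, and `3.14159` round-trips to `3.140625`, which illustrates the roughly 2 to 3 decimal digits of precision.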

Structs§

BFloat16
BFloat16 — Google Brain’s 16-bit floating-point format.

Functions§

bf16_dot
Dot product of two bf16 slices, accumulated in f32.
bf16_gemm
Mixed-precision GEMM: C = A * B with bf16 inputs and f32 accumulation. A is (m × k), B is (k × n), C is (m × n).
bf16_gemv
Matrix-vector multiply: y = A * x, with bf16 inputs and f32 accumulation. A is (rows × cols) row-major, x is (cols,), y is (rows,).
bf16_to_f32_slice
Convert a bf16 slice to f32.
f32_to_bf16_slice
Convert an f32 slice to bf16.
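
The arithmetic functions above share one pattern: inputs are stored as bf16, each element is widened to f32, and products are summed in an f32 accumulator. The sketch below illustrates that pattern only. The function names, the signatures, and the use of raw `u16` bits are all hypothetical and are not this module's API; the row-major layout for the GEMM operands is also an assumption, since the source states the layout only for `bf16_gemv`.

```rust
/// Widen raw bf16 bits to f32 (bf16 is the upper 16 bits of an f32).
fn widen(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}

/// Hypothetical sketch of a dot product: bf16 inputs, f32 accumulator.
fn dot_sketch(a: &[u16], b: &[u16]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(&x, &y)| widen(x) * widen(y)).sum()
}

/// Hypothetical sketch of y = A * x, with A stored row-major as (rows x cols).
fn gemv_sketch(a: &[u16], x: &[u16], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(a.len(), rows * cols);
    assert_eq!(x.len(), cols);
    (0..rows)
        .map(|r| dot_sketch(&a[r * cols..(r + 1) * cols], x))
        .collect()
}

/// Hypothetical sketch of C = A * B, with A (m x k) and B (k x n),
/// both assumed row-major, and C (m x n) returned as f32.
fn gemm_sketch(a: &[u16], b: &[u16], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = vec![0.0f32; m * n];
    for i in 0..m {
        for p in 0..k {
            // Widen A[i][p] once and reuse it across row p of B.
            let a_ip = widen(a[i * k + p]);
            for j in 0..n {
                c[i * n + j] += a_ip * widen(b[p * n + j]);
            }
        }
    }
    c
}
```

Keeping the accumulator in f32 matters because a bf16 running sum would lose small addends after only a few terms; with f32 accumulation, only the inputs carry bf16 rounding error.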