Module packing


Trit packing/unpacking and FP32→ternary quantization.

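Cold (5 trits/byte, base-3) is the disk-friendly format: 3^5 = 243 combinations fit in one byte, so storage costs 1.6 bits per trit. Hot (4 trits/byte, 2 bits per weight) is the SIMD-friendly format: every trit occupies its own 2-bit field.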
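
To make the two layouts concrete, here is a minimal sketch of both encodings. It is not this module's implementation. The `*_sketch` function names, the explicit `len` parameter, the digit order within a byte, and the mapping of a trit `t` to the code `t + 1` are all assumptions made for illustration. The real `pack_cold`, `pack_hot`, `unpack_cold`, and `unpack_hot` may use different signatures and bit assignments.

```rust
/// Hypothetical sketch: pack trits (-1, 0, +1) five per byte as base-3 digits.
/// Assumes the first trit of each group is the least significant digit.
fn pack_cold_sketch(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(5)
        .map(|chunk| {
            let mut byte = 0u8;
            let mut place = 1u8; // 1, 3, 9, 27, 81
            for &t in chunk {
                // Shift -1..=1 to the digit range 0..=2 (assumed mapping).
                byte += (t + 1) as u8 * place;
                place *= 3; // ends at 243, which still fits in a u8
            }
            byte
        })
        .collect()
}

/// Hypothetical sketch: recover `len` trits from cold-packed bytes.
fn unpack_cold_sketch(bytes: &[u8], len: usize) -> Vec<i8> {
    let mut out = Vec::with_capacity(len);
    for &b in bytes {
        let mut v = b;
        for _ in 0..5 {
            if out.len() == len {
                return out;
            }
            out.push((v % 3) as i8 - 1);
            v /= 3;
        }
    }
    out
}

/// Hypothetical sketch: pack trits four per byte, one 2-bit field each.
fn pack_hot_sketch(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(4)
        .map(|chunk| {
            let mut byte = 0u8;
            for (i, &t) in chunk.iter().enumerate() {
                // Field i occupies bits 2*i and 2*i + 1 (assumed layout).
                byte |= ((t + 1) as u8) << (2 * i);
            }
            byte
        })
        .collect()
}

/// Hypothetical sketch: recover `len` trits from hot-packed bytes.
fn unpack_hot_sketch(bytes: &[u8], len: usize) -> Vec<i8> {
    let mut out = Vec::with_capacity(len);
    for &b in bytes {
        for i in 0..4 {
            if out.len() == len {
                return out;
            }
            out.push(((b >> (2 * i)) & 0b11) as i8 - 1);
        }
    }
    out
}
```

Under these assumptions the trade-off is visible in the decoders: reading a cold byte needs division and remainder by 3, while reading a hot byte needs only shifts and masks, which map directly onto SIMD lanes. A cold-to-hot conversion can then be sketched as `pack_hot_sketch(&unpack_cold_sketch(bytes, len))`.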

Functions

cold_to_hot
Convert cold-packed trits to hot-packed trits.
pack_cold
Pack trits (i8 ∈ {-1, 0, +1}) into cold 5-trits-per-byte format.
pack_hot
Pack trits into hot 2-bpw format (4 trits per byte).
quantize
Quantize an FP32 vector to ternary trits using BitNet absmean scaling.
unpack_cold
Unpack cold 5-trits-per-byte format back to i8 trits.
unpack_hot
Unpack hot 2-bpw format back to i8 trits.
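
For readers unfamiliar with absmean scaling, the idea behind `quantize` can be sketched as follows: divide each weight by the mean absolute value of the vector, round to the nearest integer, and clamp to {-1, 0, +1}. This is an illustration only. The name `quantize_sketch`, the `(Vec<i8>, f32)` return type, the epsilon value, and the handling of an empty input are assumptions, not the documented signature or behaviour of `quantize`.

```rust
/// Hypothetical sketch of absmean ternary quantization.
/// Returns the trits together with the scale needed to dequantize them.
fn quantize_sketch(weights: &[f32]) -> (Vec<i8>, f32) {
    if weights.is_empty() {
        // Assumed behaviour for empty input: no trits, zero scale.
        return (Vec::new(), 0.0);
    }

    // Scale is the mean absolute value of the vector.
    let scale = weights.iter().map(|w| w.abs()).sum::<f32>() / weights.len() as f32;

    // Small epsilon (assumed value) guards against division by zero
    // when every weight is zero.
    let eps = 1e-8_f32;

    let trits = weights
        .iter()
        .map(|w| (w / (scale + eps)).round().clamp(-1.0, 1.0) as i8)
        .collect();

    (trits, scale)
}
```

In this sketch a weight is approximately recovered as `trit as f32 * scale`, so the scale has to be stored alongside the packed trits.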