Trit packing/unpacking and FP32→ternary quantization.
Cold (5 trits/byte, base-3) is the disk-friendly format. Hot (4 trits/byte, 2-bpw) is the SIMD-friendly format.
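As an illustration of the two layouts, here is a minimal sketch rather than this module's actual implementation. The `sketch_*` names are hypothetical. The digit mapping (`trit + 1`), the least-significant-first ordering, and the explicit trit count passed to the unpackers are all assumptions. The real encoding may differ.

```rust
/// Hypothetical cold packer: 5 trits per byte as base-3 digits.
/// Assumed mapping: digit = trit + 1, first trit is the least significant digit.
fn sketch_pack_cold(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(5)
        .map(|chunk| {
            // Horner's rule from the last trit down; the maximum value is 242, so it fits in a u8.
            chunk.iter().rev().fold(0u8, |acc, &t| {
                debug_assert!((-1i8..=1).contains(&t));
                acc * 3 + (t + 1) as u8
            })
        })
        .collect()
}

/// Hypothetical cold unpacker: `n` is the number of trits originally packed,
/// needed because the last byte may be only partially filled.
fn sketch_unpack_cold(bytes: &[u8], n: usize) -> Vec<i8> {
    let mut out = Vec::with_capacity(n);
    for &b in bytes {
        let mut v = b;
        for _ in 0..5 {
            if out.len() == n {
                return out;
            }
            out.push((v % 3) as i8 - 1);
            v /= 3;
        }
    }
    out
}

/// Hypothetical hot packer: 4 trits per byte, 2 bits each.
/// Assumed code: trit + 1 (0b00, 0b01, 0b10), first trit in the lowest two bits.
fn sketch_pack_hot(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(4)
        .map(|chunk| {
            chunk
                .iter()
                .enumerate()
                .fold(0u8, |acc, (i, &t)| acc | (((t + 1) as u8) << (2 * i)))
        })
        .collect()
}

/// Hypothetical hot unpacker: reads `n` 2-bit fields back into trits.
fn sketch_unpack_hot(bytes: &[u8], n: usize) -> Vec<i8> {
    (0..n.min(bytes.len() * 4))
        .map(|i| ((bytes[i / 4] >> (2 * (i % 4))) & 0b11) as i8 - 1)
        .collect()
}

/// Hypothetical cold-to-hot conversion via an intermediate trit buffer.
fn sketch_cold_to_hot(cold: &[u8], n: usize) -> Vec<u8> {
    sketch_pack_hot(&sketch_unpack_cold(cold, n))
}
```

Under these assumptions, seven trits take two bytes in either layout. Cold pulls ahead on longer inputs: 20 trits need 4 cold bytes but 5 hot bytes. Hot decoding needs only shifts and masks, whereas cold decoding needs division by 3.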
Functions
- cold_to_hot - Convert cold-packed trits to hot-packed trits.
- pack_cold - Pack trits (i8 ∈ {-1, 0, +1}) into cold 5-trits-per-byte format.
- pack_hot - Pack trits into hot 2-bpw format (4 trits per byte).
- quantize - Quantize an FP32 vector to ternary trits using BitNet absmean scaling.
- unpack_cold - Unpack cold 5-trits-per-byte format back to i8 trits.
- unpack_hot - Unpack hot 2-bpw format back to i8 trits.
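For readers unfamiliar with absmean scaling, the sketch below shows the general technique as it is usually described for BitNet b1.58: divide each weight by the mean absolute value of the vector, round to the nearest integer, and clip to [-1, +1]. The name `sketch_quantize`, the `(trits, scale)` return type, and the epsilon value are assumptions for illustration. The real `quantize` signature may differ.

```rust
/// Hypothetical absmean quantizer: returns ternary trits plus the scale
/// needed to approximately reconstruct the input (w ≈ trit * scale).
fn sketch_quantize(weights: &[f32]) -> (Vec<i8>, f32) {
    if weights.is_empty() {
        return (Vec::new(), 0.0);
    }

    // Absmean scale: the average magnitude of the weights.
    let scale = weights.iter().map(|w| w.abs()).sum::<f32>() / weights.len() as f32;

    // Assumed small epsilon so an all-zero vector does not divide by zero.
    let eps = 1e-8_f32;

    // RoundClip: round the scaled weight, then clamp to {-1, 0, +1}.
    let trits: Vec<i8> = weights
        .iter()
        .map(|&w| (w / (scale + eps)).round().clamp(-1.0, 1.0) as i8)
        .collect();

    (trits, scale)
}
```

For example, the input `[0.9, -0.1, 0.0, -1.2, 0.4]` has a mean absolute value of about 0.52, so weights that are small relative to that scale collapse to 0 and the rest saturate at ±1.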