BitNet b1.58 ternary quantization codec.
Encodes FP32 vectors into ternary trits {-1, 0, +1} using per-vector
mean-absolute-value (absmean) scaling. Two storage layouts are supported:
- Cold (QuantMode::TernaryPacked): 5 trits/byte via base-3 packing (1.6 bpw). Optimal for disk; decompressed on page-in.
- Hot (QuantMode::TernarySimd): 2 bpw, 4 trits/byte, suitable for direct SIMD load.

VectorCodec::encode produces hot format by default.
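The absmean quantization step and the cold base-3 layout can be sketched as follows. This is a minimal illustration, not the crate's implementation: the function names (`absmean_quantize_sketch`, `pack_base3_sketch`, `unpack_base3_sketch`), their signatures, the digit order within a byte, and the handling of a short final group are all assumptions. The 1.6 bpw figure follows from 3^5 = 243 fitting in one byte, so 8 bits carry 5 trits.

```rust
/// Hypothetical sketch of absmean quantization: scale by the mean absolute
/// value, then round and clamp each element to {-1, 0, +1}.
/// Returns the trits together with the per-vector scale.
pub fn absmean_quantize_sketch(v: &[f32]) -> (Vec<i8>, f32) {
    if v.is_empty() {
        return (Vec::new(), 0.0);
    }
    let scale = v.iter().map(|x| x.abs()).sum::<f32>() / v.len() as f32;
    if scale == 0.0 {
        // All-zero input: every trit is 0.
        return (vec![0; v.len()], 0.0);
    }
    let trits = v
        .iter()
        .map(|&x| (x / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (trits, scale)
}

/// Hypothetical cold layout: 5 trits per byte as base-3 digits.
/// Each trit t is stored as the digit t + 1 (0, 1 or 2); the first trit of a
/// group is the least-significant digit (an assumed ordering).
pub fn pack_base3_sketch(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(5)
        .map(|chunk| {
            // Fold from the last trit so the first one lands in the lowest
            // digit. The maximum value is 242, so u8 never overflows.
            chunk.iter().rev().fold(0u8, |acc, &t| {
                debug_assert!((-1..=1).contains(&t));
                acc * 3 + (t + 1) as u8
            })
        })
        .collect()
}

/// Inverse of `pack_base3_sketch`. `len` is the original trit count, which is
/// needed to drop the padding digits in a short final group.
pub fn unpack_base3_sketch(bytes: &[u8], len: usize) -> Vec<i8> {
    let mut out = Vec::with_capacity(len);
    for &b in bytes {
        let mut x = b;
        for _ in 0..5 {
            if out.len() == len {
                return out;
            }
            out.push((x % 3) as i8 - 1);
            x /= 3;
        }
    }
    out
}
```

Under these assumptions, quantizing `[0.9, -1.1, 0.05, 2.0]` gives a scale of about 1.0125 and the trits `[1, -1, 0, 1]`, which pack into a single byte and unpack back to the same trits when the original length is supplied.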
Re-exports
pub use codec::TernaryCodec;
pub use codec::TernaryQuantized;
pub use codec::TernaryQuery;
pub use packing::cold_to_hot;
pub use packing::pack_cold;
pub use packing::pack_hot;
pub use packing::quantize;
pub use packing::unpack_cold;
pub use packing::unpack_hot;
pub use simd::ternary_dot;
Modules
- codec
- TernaryCodec: BitNet b1.58 ternary VectorCodec implementation.
- packing
- Trit packing/unpacking and FP32→ternary quantization.
- simd
- SIMD ternary dot product with runtime CPU-feature dispatch.
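To illustrate how the hot layout and runtime CPU-feature dispatch fit together, here is a hedged sketch. It is not the crate's `ternary_dot`: the names (`pack_2bit_sketch`, `ternary_dot_sketch`), the signatures, and the 2-bit code assignment (`00` = 0, `01` = +1, `10` = -1, first trit in the low bits) are assumptions. The AVX2 path simply recompiles the scalar loop with the feature enabled instead of using hand-written intrinsics.

```rust
/// Hypothetical hot layout: 4 trits per byte, 2 bits each.
/// Assumed codes: 0b00 = 0, 0b01 = +1, 0b10 = -1; first trit in the low bits.
pub fn pack_2bit_sketch(trits: &[i8]) -> Vec<u8> {
    trits
        .chunks(4)
        .map(|chunk| {
            chunk.iter().enumerate().fold(0u8, |byte, (i, &t)| {
                let code: u8 = match t {
                    1 => 0b01,
                    -1 => 0b10,
                    _ => 0b00,
                };
                byte | (code << (i * 2))
            })
        })
        .collect()
}

/// Scalar reference kernel: adds the query element where the trit is +1,
/// subtracts it where the trit is -1, and skips zeros. No multiplications
/// are needed inside the loop.
#[inline(always)]
fn dot_scalar(packed: &[u8], query: &[f32]) -> f32 {
    let mut acc = 0.0f32;
    for (i, &q) in query.iter().enumerate() {
        let code = (packed[i / 4] >> ((i % 4) * 2)) & 0b11;
        match code {
            0b01 => acc += q,
            0b10 => acc -= q,
            _ => {}
        }
    }
    acc
}

/// Same loop compiled with AVX2 enabled, so the compiler may vectorize it.
/// Must only be called after AVX2 support has been detected at runtime.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn dot_avx2(packed: &[u8], query: &[f32]) -> f32 {
    dot_scalar(packed, query)
}

/// Dispatches at runtime to the AVX2 build when available, otherwise to the
/// scalar kernel, and applies the per-vector absmean scale once at the end.
pub fn ternary_dot_sketch(packed: &[u8], query: &[f32], scale: f32) -> f32 {
    assert!(packed.len() * 4 >= query.len(), "packed buffer too short");
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe because the feature check above succeeded.
            return scale * unsafe { dot_avx2(packed, query) };
        }
    }
    scale * dot_scalar(packed, query)
}
```

Applying the scale once after accumulation keeps the inner loop to additions and subtractions, which is the main reason a ternary layout suits SIMD. A production kernel would likely decode several bytes per instruction, which this sketch does not attempt.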