Quantization modules for BitNet.
This module provides:
- Weight quantization: AbsMean -> {-1, 0, +1}
- Activation quantization: Per-token AbsMax -> INT8
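As a rough illustration of the two schemes listed above, the following sketch applies them to plain `f32` slices. The function names `absmean_ternary` and `absmax_int8` are hypothetical and are not part of this module's API. The sketch also assumes a single weight group, a symmetric INT8 range of [-127, 127], and a small epsilon guard against all-zero input; the module itself may make different choices.

```rust
/// Hypothetical helper (not this crate's API): AbsMean ternary quantization
/// of one weight group. Returns codes in {-1, 0, +1} and a scale such that
/// `w ≈ code as f32 * scale`.
fn absmean_ternary(weights: &[f32]) -> (Vec<i8>, f32) {
    const EPS: f32 = 1e-5; // assumed guard against an all-zero group
    let sum_abs: f32 = weights.iter().map(|w| w.abs()).sum();
    let mean_abs = sum_abs / weights.len().max(1) as f32;
    let scale = mean_abs.max(EPS);
    let codes = weights
        .iter()
        // Scale by the mean magnitude, round, then clip to the ternary set.
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (codes, scale)
}

/// Hypothetical helper (not this crate's API): AbsMax INT8 quantization of
/// one token's activations. Returns codes and a scale such that
/// `x ≈ code as f32 * scale`.
fn absmax_int8(token: &[f32]) -> (Vec<i8>, f32) {
    const EPS: f32 = 1e-5; // assumed guard against an all-zero token
    let max_abs = token.iter().fold(0.0_f32, |m, x| m.max(x.abs()));
    let scale = max_abs.max(EPS) / 127.0;
    let codes = token
        .iter()
        // The largest magnitude maps to ±127; everything else scales linearly.
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (codes, scale)
}
```

Calling `absmax_int8` once per token (row) gives the per-token scales. Calling `absmean_ternary` once per weight group would give per-group scales.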
Structs
- QuantizedActivations - Quantized activations with per-token scales.
- TernaryWeight - Ternary weight representation with per-group scales.
Functions
- dequantize_activations - Dequantize INT8 activations back to float tensor.
- dequantize_weights - Dequantize ternary weights back to float tensor.
- quantize_activations - Quantize activations using per-token AbsMax scaling to INT8.
- quantize_ste - Apply quantization in a differentiable way using Straight-Through Estimator.
- quantize_weights - Quantize a weight tensor to ternary using AbsMean quantization.
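The Straight-Through Estimator idea can be sketched without a tensor library. The forward pass returns the quantize-then-dequantize value, and the backward pass treats the non-differentiable rounding step as the identity. The names `ste_forward` and `ste_backward` below are hypothetical, and the real `quantize_ste` presumably operates on tensors and may differ in its details. The sketch assumes AbsMean ternary quantization over a single group.

```rust
/// Hypothetical sketch (not this crate's API): forward pass of a
/// straight-through ternary quantizer. Each weight is quantized to
/// {-1, 0, +1} with an AbsMean scale and immediately dequantized, so the
/// output is a float slice that only takes the values -scale, 0, +scale.
fn ste_forward(weights: &[f32]) -> Vec<f32> {
    const EPS: f32 = 1e-5; // assumed guard against an all-zero input
    let sum_abs: f32 = weights.iter().map(|w| w.abs()).sum();
    let scale = (sum_abs / weights.len().max(1) as f32).max(EPS);
    weights
        .iter()
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) * scale)
        .collect()
}

/// Hypothetical sketch: backward pass. Rounding has zero gradient almost
/// everywhere, so the estimator passes the upstream gradient through
/// unchanged, as if the forward pass had been the identity function.
fn ste_backward(grad_output: &[f32]) -> Vec<f32> {
    grad_output.to_vec()
}
```

In an autograd framework the same effect is commonly obtained by computing `x + detach(quantize(x) - x)`, which evaluates to the quantized value in the forward pass while gradients flow only through `x`.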