Module quantization

Quantization modules for BitNet.

This module provides:

  • Weight quantization: AbsMean -> {-1, 0, +1}
  • Activation quantization: Per-token AbsMax -> INT8
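The two schemes listed above can be sketched in plain Rust as follows. This is a minimal illustration on `f32` slices, not this module's implementation: the helper names `absmean_ternary` and `absmax_int8`, the `EPS` guard, and the choice to treat one slice as one weight group or one token are all assumptions made for the example.

```rust
/// Hypothetical helper (not part of this module): AbsMean ternary
/// quantization of one weight group. Returns the ternary values in
/// {-1, 0, +1} and the scale, so that w ≈ q * scale.
fn absmean_ternary(weights: &[f32]) -> (Vec<i8>, f32) {
    const EPS: f32 = 1e-5; // assumed guard against division by zero
    // Scale is the mean absolute value of the group.
    let mean_abs =
        weights.iter().map(|w| w.abs()).sum::<f32>() / weights.len().max(1) as f32;
    let scale = mean_abs.max(EPS);
    // Divide by the scale, round to the nearest integer, clip to [-1, 1].
    let q: Vec<i8> = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (q, scale)
}

/// Hypothetical helper (not part of this module): AbsMax INT8
/// quantization of one token's activations. Returns values in
/// [-127, 127] and the scale, so that x ≈ q * scale.
fn absmax_int8(token: &[f32]) -> (Vec<i8>, f32) {
    const EPS: f32 = 1e-5; // assumed guard against an all-zero token
    // Scale maps the largest magnitude in the token to 127.
    let max_abs = token.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = max_abs.max(EPS) / 127.0;
    let q: Vec<i8> = token
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}
```

For example, the weights `[0.4, -0.1, 0.9, -0.6]` have a mean absolute value of 0.5, so under this sketch they quantize to `[1, 0, 1, -1]`. Calling `absmax_int8` once per token gives each token its own scale, which is what "per-token" refers to.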

Structs

QuantizedActivations
Quantized activations with per-token scales.
TernaryWeight
Ternary weight representation with per-group scales.

Functions

dequantize_activations
Dequantize INT8 activations back to float tensor.
dequantize_weights
Dequantize ternary weights back to float tensor.
quantize_activations
Quantize activations using per-token AbsMax scaling to INT8.
quantize_ste
Apply quantization in a differentiable way using Straight-Through Estimator.
quantize_weights
Quantize a weight tensor to ternary using AbsMean quantization.
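The dequantization and Straight-Through Estimator entries above can be illustrated with a small, framework-free sketch. It operates on plain slices and does not reflect the signatures of `dequantize_weights`, `dequantize_activations`, or `quantize_ste`; the names `dequantize`, `ste_forward`, and `ste_backward` are hypothetical, and a single shared scale is assumed for simplicity.

```rust
/// Hypothetical helper (not part of this module): map quantized integers
/// back to floats by multiplying with the scale they were quantized with.
fn dequantize(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}

/// Hypothetical forward pass of a straight-through ternary quantizer:
/// quantize and immediately dequantize, so the output stays in float
/// but only takes the values {-scale, 0, +scale}.
fn ste_forward(x: &[f32], scale: f32) -> Vec<f32> {
    x.iter()
        .map(|v| (v / scale).round().clamp(-1.0, 1.0) * scale)
        .collect()
}

/// Hypothetical backward pass: rounding has zero gradient almost
/// everywhere, so the estimator treats the quantizer as the identity
/// and passes the upstream gradient through unchanged.
fn ste_backward(grad_output: &[f32]) -> Vec<f32> {
    grad_output.to_vec()
}
```

Some Straight-Through Estimator variants also zero the gradient for inputs that fall outside the clipping range; this sketch uses the simplest identity form. In a tensor library with automatic differentiation, the same effect is commonly obtained by adding the detached quantization error to the input rather than by writing a backward function by hand.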