Function quantize_activations 

pub fn quantize_activations(
    activations: &Tensor,
    config: &BitNetConfig,
) -> Result<QuantizedActivations>

Quantize activations using per-token AbsMax scaling to INT8.

§Algorithm

For each token (one row of hidden_dim values):

  1. Compute scale = max(|X|) / 127
  2. Compute X_q = round(X / scale) clamped to [-127, 127]
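The two steps above can be sketched on plain slices, without the crate's Tensor type. This is an illustration under stated assumptions, not the crate's implementation: the helper names quantize_row_absmax and quantize_tokens are hypothetical, and the fallback scale of 1.0 for an all-zero row is an assumption, since the documentation does not say how that case is handled.

```rust
/// Hypothetical helper (not part of the documented API): quantizes one
/// token (row) of f32 activations to INT8 with AbsMax scaling.
/// Returns the quantized values and the scale used for that row.
fn quantize_row_absmax(row: &[f32]) -> (Vec<i8>, f32) {
    // Step 1: scale = max(|X|) / 127
    let abs_max = row.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    // Assumption: an all-zero row uses scale 1.0 to avoid dividing by zero.
    let scale = if abs_max > 0.0 { abs_max / 127.0 } else { 1.0 };

    // Step 2: X_q = round(X / scale), clamped to [-127, 127]
    let quantized = row
        .iter()
        .map(|&x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();

    (quantized, scale)
}

/// Hypothetical helper: applies the row quantizer to a row-major buffer
/// holding tokens of `hidden_dim` values each. Returns all quantized
/// values plus one scale per token.
fn quantize_tokens(data: &[f32], hidden_dim: usize) -> (Vec<i8>, Vec<f32>) {
    assert!(hidden_dim > 0, "hidden_dim must be non-zero");
    assert!(
        data.len() % hidden_dim == 0,
        "buffer length must be a multiple of hidden_dim"
    );

    let mut values = Vec::with_capacity(data.len());
    let mut scales = Vec::with_capacity(data.len() / hidden_dim);
    for row in data.chunks_exact(hidden_dim) {
        let (q, scale) = quantize_row_absmax(row);
        values.extend(q);
        scales.push(scale);
    }
    (values, scales)
}
```

Because each token gets its own scale, a token with large activations does not reduce the resolution available to the others. Multiplying a quantized value by its row's scale recovers an approximation of the original activation.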

§Arguments

  • activations - Input tensor [batch, seq_len, hidden_dim] or [batch, hidden_dim]
  • config - BitNet configuration
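Both accepted shapes can be read as a stack of tokens, each hidden_dim wide. The sketch below shows one way to map a shape to a token count and a row width. The helper token_layout is hypothetical, and rejecting other ranks is an assumption; the documentation does not state what the real function does with them.

```rust
/// Hypothetical helper (not part of the documented API): interprets a
/// tensor shape as (num_tokens, hidden_dim), where each token is one row
/// that would receive its own AbsMax scale.
fn token_layout(shape: &[usize]) -> Result<(usize, usize), String> {
    match shape {
        // [batch, hidden_dim]: one token per batch entry.
        &[batch, hidden_dim] => Ok((batch, hidden_dim)),
        // [batch, seq_len, hidden_dim]: batch * seq_len tokens.
        &[batch, seq_len, hidden_dim] => Ok((batch * seq_len, hidden_dim)),
        // Assumption: any other rank is treated as an error in this sketch.
        _ => Err(format!(
            "expected a 2-D or 3-D shape, got {} dimension(s)",
            shape.len()
        )),
    }
}
```

Under this reading, a [2, 3, 8] input holds 6 tokens of 8 values each, so per-token scaling would produce 6 scales.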

§Errors

Returns an error if quantization fails.