Module quantization

Core quantization logic for INT8 and INT4.

Provides tensor-level quantization (per-tensor and per-channel), INT4 bit-packing, and the high-level Quantizer that combines a QuantConfig with optional calibration statistics.

Structs

Int4Range: Marker for INT4 quantization (-8 … 7).
Int8Range: Marker for INT8 quantization (-128 … 127).
QuantConfig: Configuration for a quantization pass.
QuantParamsGeneric: Affine quantization parameters (scale and zero-point), generic over bit-width.
QuantizedTensorGeneric: Generic quantized tensor, parameterized by bit-width marker.
Quantizer: High-level quantizer that combines configuration with optional calibration.
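
As a rough illustration of what "affine quantization parameters (scale and zero-point)" means, here is a minimal, self-contained sketch. It does not use this module's types: `AffineParams`, `params_from_range`, `quantize`, and `dequantize` are hypothetical names, and the min/max-based derivation of the parameters is an assumption about one common approach, not a description of what `Quantizer` does internally.

```rust
/// Hypothetical stand-in for affine parameters; the field names are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AffineParams {
    scale: f32,
    zero_point: i32,
}

/// Derive parameters that map the real interval [min, max] onto [qmin, qmax].
fn params_from_range(min: f32, max: f32, qmin: i32, qmax: i32) -> AffineParams {
    // Widen the interval to include 0.0 so that zero is exactly representable.
    let min = min.min(0.0);
    let max = max.max(0.0);
    let span = max - min;
    // Fall back to a scale of 1.0 for a degenerate (all-zero) range.
    let scale = if span > 0.0 { span / (qmax - qmin) as f32 } else { 1.0 };
    let zero_point = (qmin as f32 - min / scale).round() as i32;
    AffineParams { scale, zero_point: zero_point.clamp(qmin, qmax) }
}

/// q = clamp(round(x / scale) + zero_point, qmin, qmax)
fn quantize(x: f32, p: AffineParams, qmin: i32, qmax: i32) -> i32 {
    let q = (x / p.scale).round() + p.zero_point as f32;
    q.clamp(qmin as f32, qmax as f32) as i32
}

/// x ≈ (q - zero_point) * scale
fn dequantize(q: i32, p: AffineParams) -> f32 {
    (q - p.zero_point) as f32 * p.scale
}
```

Under this sketch, per-tensor quantization would use one `AffineParams` for the whole tensor, while per-channel quantization would keep one per channel. Passing `-128, 127` or `-8, 7` as `qmin, qmax` corresponds to the INT8 and INT4 ranges listed above.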

Enums

QuantizedTensorType: Type-erased wrapper over QuantizedTensor (INT8) and QuantizedTensorInt4 (INT4).

Traits

QuantRange: Marker trait that supplies the clamp constants for a quantization bit-width.
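
The marker-trait pattern described here can be sketched as follows. This is an illustration under assumptions, not the crate's definitions: `SketchRange`, `SketchInt8`, `SketchInt4`, `SketchParams`, and the associated constants `QMIN`/`QMAX` are hypothetical names chosen so they cannot be mistaken for the real `QuantRange`, `Int8Range`, `Int4Range`, or `QuantParamsGeneric` items.

```rust
use std::marker::PhantomData;

/// Hypothetical marker trait: each bit-width supplies its clamp bounds.
trait SketchRange {
    const QMIN: i32;
    const QMAX: i32;
}

/// Zero-sized markers, one per bit-width.
struct SketchInt8;
struct SketchInt4;

impl SketchRange for SketchInt8 {
    const QMIN: i32 = -128;
    const QMAX: i32 = 127;
}

impl SketchRange for SketchInt4 {
    const QMIN: i32 = -8;
    const QMAX: i32 = 7;
}

/// Parameters generic over the marker; the marker occupies no space at runtime.
struct SketchParams<R: SketchRange> {
    scale: f32,
    zero_point: i32,
    _range: PhantomData<R>,
}

impl<R: SketchRange> SketchParams<R> {
    fn new(scale: f32, zero_point: i32) -> Self {
        Self { scale, zero_point, _range: PhantomData }
    }

    /// Quantize one value, clamping to the bounds supplied by the marker.
    fn quantize(&self, x: f32) -> i8 {
        let q = (x / self.scale).round() + self.zero_point as f32;
        q.clamp(R::QMIN as f32, R::QMAX as f32) as i8
    }
}

/// Aliases pin the generic type to one bit-width.
type SketchParamsInt8 = SketchParams<SketchInt8>;
type SketchParamsInt4 = SketchParams<SketchInt4>;
```

With this shape, the clamp bounds are resolved at compile time, so the INT8 and INT4 code paths share one implementation and a value quantized with the wrong range is a type error rather than a runtime bug.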

Functions

pack_int4: Pack a slice of INT4 values (two per byte, high nibble first).
unpack_int4: Unpack INT4 values from packed bytes, returning exactly num_values i8s.
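
The packing layout described above (two values per byte, high nibble first) can be illustrated with the sketch below. The functions `pack_int4_sketch` and `unpack_int4_sketch` are hypothetical and their signatures are assumptions; the real `pack_int4` and `unpack_int4` may differ in argument types, error handling, and how they treat an odd number of values.

```rust
/// Pack 4-bit values two per byte, first value in the high nibble.
/// Inputs outside -8..=7 are silently truncated to their low 4 bits here.
fn pack_int4_sketch(values: &[i8]) -> Vec<u8> {
    values
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as u8) & 0x0F;
            // Assumption: an odd-length input pads the final low nibble with zero.
            let lo = pair.get(1).map_or(0, |&v| (v as u8) & 0x0F);
            (hi << 4) | lo
        })
        .collect()
}

/// Unpack up to `num_values` 4-bit values, high nibble first.
/// The caller is assumed to supply at least ceil(num_values / 2) bytes.
fn unpack_int4_sketch(packed: &[u8], num_values: usize) -> Vec<i8> {
    // Sign-extend a 4-bit two's-complement nibble to i8:
    // move it to the top of the byte, then arithmetic-shift it back down.
    fn sign_extend(nibble: u8) -> i8 {
        ((nibble << 4) as i8) >> 4
    }
    packed
        .iter()
        .flat_map(|&b| [sign_extend(b >> 4), sign_extend(b & 0x0F)])
        .take(num_values)
        .collect()
}
```

The `num_values` argument matters because an odd number of values leaves a padding nibble in the last byte; without the count, unpacking would return one spurious trailing value.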

Type Aliases

QuantParams: INT8 affine quantization parameters (values clamped to -128 … 127).
QuantParamsInt4: INT4 affine quantization parameters (values clamped to -8 … 7).
QuantizedTensor: An INT8 quantized tensor with optional per-channel parameters.
QuantizedTensorInt4: An INT4 quantized tensor with optional per-channel parameters and bit packing.
