Module quantization

Advanced quantization support for model compression and acceleration.

This module provides the following quantization capabilities:

  • Multiple quantization schemes (INT8, INT4, FP8, binary)
  • Quantization-Aware Training (QAT) support
  • Post-Training Quantization (PTQ) with calibration
  • Per-channel and per-tensor quantization
  • Symmetric and asymmetric quantization
  • Dynamic and static quantization modes
  • Quantization simulation for accuracy validation
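
To make the symmetric/asymmetric distinction above concrete, here is a minimal sketch of the usual INT8 affine mapping, where a real value `x` is stored as `round(x / scale) + zero_point`. The `AffineParams` type and all function names are hypothetical and written for illustration; they are not this module's `QuantizationParams` or `Quantizer` API, whose fields and methods are not shown on this page.

```rust
/// Hypothetical stand-in for the scale/zero-point pair of one tensor.
/// The field names are assumptions, not the crate's `QuantizationParams`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AffineParams {
    pub scale: f32,
    pub zero_point: i32,
}

/// Symmetric INT8: real 0.0 maps to integer 0 and the range is [-127, 127].
pub fn symmetric_int8(min: f32, max: f32) -> AffineParams {
    let abs_max = min.abs().max(max.abs());
    // Guard against an all-zero tensor, which would give a zero scale.
    let scale = if abs_max > 0.0 { abs_max / 127.0 } else { 1.0 };
    AffineParams { scale, zero_point: 0 }
}

/// Asymmetric INT8: [min, max] maps onto the full [-128, 127] range.
pub fn asymmetric_int8(min: f32, max: f32) -> AffineParams {
    // Widen the range so that real 0.0 is exactly representable.
    let (min, max) = (min.min(0.0), max.max(0.0));
    let range = max - min;
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let zero_point = (-128.0 - min / scale).round() as i32;
    AffineParams { scale, zero_point: zero_point.clamp(-128, 127) }
}

/// Map a real value to its INT8 code, saturating at the range limits.
pub fn quantize(x: f32, p: AffineParams) -> i8 {
    let q = (x / p.scale).round() as i32 + p.zero_point;
    q.clamp(-128, 127) as i8
}

/// Map an INT8 code back to the real value it represents.
pub fn dequantize(q: i8, p: AffineParams) -> f32 {
    (q as i32 - p.zero_point) as f32 * p.scale
}
```

The symmetric variant wastes part of the integer range when the data is one-sided (for example, post-ReLU activations), which is the usual reason to prefer the asymmetric variant there.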

Structs

CalibrationStats
Statistics collected during calibration.
FakeQuantize
Fake quantization for QAT (simulates quantization during training).
NodeId
Node identifier (0-based index into graph.nodes).
QuantizationConfig
Quantization configuration for a graph or model.
QuantizationParams
Quantization parameters for a tensor.
QuantizationSummary
Summary of quantization results.
Quantizer
Quantizer for converting graphs to quantized representations.
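
Fake quantization, as used in QAT, is commonly a quantize-then-dequantize round trip that keeps values in floating point while exposing the model to rounding and clipping error. The sketch below illustrates that idea for INT8 under stated assumptions: `fake_quantize` and `quantization_mse` are hypothetical free functions, not methods of the `FakeQuantize` struct listed above, and the sketch omits gradient handling (such as a straight-through estimator) that a training setup would need.

```rust
/// Hypothetical fake-quantize step: snap `x` to the INT8 grid defined by
/// `scale` and `zero_point`, then map it back to f32.
pub fn fake_quantize(x: f32, scale: f32, zero_point: i32) -> f32 {
    // Quantize with saturation to the INT8 range.
    let q = ((x / scale).round() as i32 + zero_point).clamp(-128, 127);
    // Dequantize: the result carries the rounding and clipping error.
    (q - zero_point) as f32 * scale
}

/// Mean squared error introduced by fake quantization over a slice.
/// Returns 0.0 for an empty slice.
pub fn quantization_mse(values: &[f32], scale: f32, zero_point: i32) -> f32 {
    if values.is_empty() {
        return 0.0;
    }
    let sum: f32 = values
        .iter()
        .map(|&v| {
            let err = v - fake_quantize(v, scale, zero_point);
            err * err
        })
        .sum();
    sum / values.len() as f32
}
```

Comparing `quantization_mse` across candidate scales is one simple way to check how much accuracy a given configuration is likely to cost before committing to it.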

Enums

CalibrationStrategy
Calibration strategy for post-training quantization.
QuantizationError
Quantization-related errors.
QuantizationGranularity
Quantization granularity (per-tensor or per-channel).
QuantizationMode
Quantization mode.
QuantizationSymmetry
Quantization symmetry mode (symmetric or asymmetric).
QuantizationType
Quantization data types.
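
As an illustration of what the granularity choice means in practice, the sketch below computes symmetric INT8 scales for a row-major weight matrix both ways: one scale for the whole tensor, or one scale per output channel (row). All names here are hypothetical helpers, and the row-major layout is an assumption; none of this is taken from the `QuantizationGranularity` enum's actual variants or the crate's tensor types.

```rust
/// Symmetric INT8 scale for one group of values (hypothetical helper).
fn symmetric_scale(values: &[f32]) -> f32 {
    let abs_max = values.iter().fold(0.0_f32, |m, v| m.max(v.abs()));
    // Fall back to 1.0 for an all-zero group to avoid a zero scale.
    if abs_max > 0.0 { abs_max / 127.0 } else { 1.0 }
}

/// Per-tensor: a single scale shared by every element.
pub fn per_tensor_scale(weights: &[f32]) -> f32 {
    symmetric_scale(weights)
}

/// Per-channel: one scale per row of a row-major matrix whose rows
/// each hold `channel_len` elements.
pub fn per_channel_scales(weights: &[f32], channel_len: usize) -> Vec<f32> {
    assert!(channel_len > 0 && weights.len() % channel_len == 0);
    weights.chunks(channel_len).map(symmetric_scale).collect()
}
```

With per-tensor scaling, a single large-magnitude channel forces a coarse step size on every other channel; per-channel scaling avoids that at the cost of storing one scale per channel.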