Module quantize

INT8 Quantization Module (ADR-091)

This module provides INT8 quantization support for CNN models, organized in four phases:

  • Phase 1 (params, tensor): Core quantization infrastructure (NEW)
  • Phase 2 (calibration): Histogram-based range estimation
  • Phase 3 (graph_rewrite): BatchNorm fusion, zero-point optimization, Q/DQ insertion
  • Phase 4: Kernel Dispatch - Runtime selection of optimized INT8 kernels
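
As an illustration of the BatchNorm fusion named in Phase 3, the sketch below folds per-channel BatchNorm statistics into the weights and bias of the preceding convolution. It is a minimal sketch under stated assumptions, not the crate's `fuse_batchnorm_to_conv`: the `BnStats` struct, the `fold_batchnorm` function, and the `[out_channel][kernel_elements]` weight layout are all hypothetical names and choices made for this example.

```rust
/// Per-output-channel BatchNorm statistics.
/// Hypothetical layout for illustration; the crate's own types may differ.
pub struct BnStats {
    pub gamma: Vec<f32>,
    pub beta: Vec<f32>,
    pub mean: Vec<f32>,
    pub var: Vec<f32>,
    pub eps: f32,
}

/// Folds BatchNorm into the preceding convolution, in place.
///
/// For each output channel `c`, with `s = gamma[c] / sqrt(var[c] + eps)`:
///   w'[c] = w[c] * s
///   b'[c] = (b[c] - mean[c]) * s + beta[c]
///
/// `weights[c]` holds the flattened kernel of output channel `c`
/// (an assumed layout, chosen to keep the example short).
pub fn fold_batchnorm(weights: &mut [Vec<f32>], bias: &mut [f32], bn: &BnStats) {
    let channels = weights.len();
    assert_eq!(bias.len(), channels);
    assert_eq!(bn.gamma.len(), channels);
    assert_eq!(bn.beta.len(), channels);
    assert_eq!(bn.mean.len(), channels);
    assert_eq!(bn.var.len(), channels);

    for (c, (w_c, b_c)) in weights.iter_mut().zip(bias.iter_mut()).enumerate() {
        // Per-channel multiplier that BatchNorm would apply at inference time.
        let s = bn.gamma[c] / (bn.var[c] + bn.eps).sqrt();
        for w in w_c.iter_mut() {
            *w *= s;
        }
        *b_c = (*b_c - bn.mean[c]) * s + bn.beta[c];
    }
}
```

After folding, the BatchNorm node can be dropped from the graph, so only the convolution remains to be quantized.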

ADR-091 Phase 1 Components (New)

  • params: Quantization parameters (scale, zero_point, qmin, qmax)
  • tensor: Quantized tensor types with metadata
  • Enhanced calibration: CalibrationCollector with MinMax, Percentile, MSE, and Entropy methods
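
To show how the four parameters listed above (scale, zero_point, qmin, qmax) fit together, here is a minimal sketch of asymmetric INT8 quantization with parameters derived from an observed min/max range, in the spirit of MinMax calibration. `SketchParams` and its methods are hypothetical stand-ins for this example only; the crate's `QuantizationParams` may use different fields, rounding, or ranges.

```rust
/// Hypothetical stand-in for quantization parameters; the field names follow
/// the component list, but the crate's real struct may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SketchParams {
    pub scale: f32,
    pub zero_point: i32,
    pub qmin: i32,
    pub qmax: i32,
}

impl SketchParams {
    /// Derives asymmetric (affine) INT8 parameters from an observed range.
    pub fn from_min_max(min: f32, max: f32) -> Self {
        let (qmin, qmax) = (-128i32, 127i32);
        // Widen the range to include zero so that 0.0 maps to an exact integer.
        let min = min.min(0.0);
        let max = max.max(0.0);
        let range = max - min;
        // Fall back to a unit scale for a degenerate (all-zero) range.
        let scale = if range > 0.0 { range / (qmax - qmin) as f32 } else { 1.0 };
        let zero_point = ((qmin as f32 - min / scale).round() as i32).clamp(qmin, qmax);
        Self { scale, zero_point, qmin, qmax }
    }

    /// q = clamp(round(x / scale) + zero_point, qmin, qmax)
    pub fn quantize(&self, x: f32) -> i8 {
        let q = ((x / self.scale).round() as i32).saturating_add(self.zero_point);
        q.clamp(self.qmin, self.qmax) as i8
    }

    /// x ≈ (q - zero_point) * scale
    pub fn dequantize(&self, q: i8) -> f32 {
        (q as i32 - self.zero_point) as f32 * self.scale
    }
}
```

With a range of 0.0 to 2.55, for example, the scale is roughly 0.01 and the zero point lands on qmin, so values above the range saturate at 127. Percentile, MSE, and Entropy methods would differ only in how the `min`/`max` range is chosen before this step.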

Re-exports

pub use params::QuantizationParams as QuantParams;
pub use params::QuantizationScheme;
pub use params::QuantizationMode;
pub use tensor::QuantizedTensor;
pub use tensor::QuantizationMetadata;
pub use calibration::CalibrationHistogram;
pub use calibration::QuantizationParams;
pub use calibration::Quantizer;
pub use graph_rewrite::ComputationGraph;
pub use graph_rewrite::GraphNode;
pub use graph_rewrite::NodeParams;
pub use graph_rewrite::NodeType;
pub use graph_rewrite::fuse_batchnorm_to_conv;
pub use graph_rewrite::fuse_relu;
pub use graph_rewrite::fuse_hardswish;
pub use graph_rewrite::fuse_zp_to_bias;
pub use graph_rewrite::generate_hardswish_lut;
pub use graph_rewrite::insert_qdq_nodes;

Modules

calibration
Calibration and Quantization Parameters (ADR-091 Phase 2)
graph_rewrite
Graph Rewrite Passes for INT8 Quantization (ADR-091 Phase 3)
params
Quantization parameters for INT8 quantization.
tensor
Quantized tensor types with metadata.