INT8 Quantization Module (ADR-091)
This module provides INT8 quantization support for CNN models, organized into four phases:
- Phase 1 (params, tensor): Core quantization infrastructure (NEW)
- Phase 2 (calibration): Histogram-based range estimation
- Phase 3 (graph_rewrite): BatchNorm fusion, zero-point optimization, Q/DQ insertion
- Phase 4: Kernel Dispatch - Runtime selection of optimized INT8 kernels
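Two of the Phase 3 rewrites can be illustrated with a minimal sketch. It assumes the standard BatchNorm folding identity (scale each output channel's weights by `gamma / sqrt(var + eps)` and adjust the bias accordingly) and one common reading of zero-point optimization (moving the constant `zero_point * sum(weights)` term into the bias). The names `BatchNormSketch`, `fold_batchnorm`, and `fold_zero_point_into_bias` are hypothetical; they are not the signatures of `fuse_batchnorm_to_conv` or `fuse_zp_to_bias`, which this page does not show.

```rust
/// Hypothetical per-channel BatchNorm statistics; the crate's own
/// `graph_rewrite` types may be laid out differently.
pub struct BatchNormSketch {
    pub gamma: Vec<f32>,
    pub beta: Vec<f32>,
    pub mean: Vec<f32>,
    pub var: Vec<f32>,
    pub eps: f32,
}

/// Fold BatchNorm into the preceding convolution, in place.
/// `weights[c]` holds the flattened kernel of output channel `c`.
pub fn fold_batchnorm(weights: &mut [Vec<f32>], bias: &mut [f32], bn: &BatchNormSketch) {
    for (c, (w_c, b_c)) in weights.iter_mut().zip(bias.iter_mut()).enumerate() {
        // Per-channel multiplier applied by BatchNorm at inference time.
        let k = bn.gamma[c] / (bn.var[c] + bn.eps).sqrt();
        for w in w_c.iter_mut() {
            *w *= k;
        }
        *b_c = (*b_c - bn.mean[c]) * k + bn.beta[c];
    }
}

/// Fold the input zero point into an INT32 bias, using
/// sum(w * (x - zp)) + bias == sum(w * x) + (bias - zp * sum(w)).
pub fn fold_zero_point_into_bias(weights: &[i8], bias: i32, input_zero_point: i32) -> i32 {
    let w_sum: i32 = weights.iter().map(|&w| w as i32).sum();
    bias - input_zero_point * w_sum
}
```

After folding, the convolution alone reproduces the output of convolution followed by BatchNorm, and the integer kernel can accumulate raw `w * x` products without subtracting the zero point per element.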
§ADR-091 Phase 1 Components (New)
- params: Quantization parameters (scale, zero_point, qmin, qmax)
- tensor: Quantized tensor types with metadata
- Enhanced calibration: CalibrationCollector with MinMax, Percentile, MSE, Entropy methods
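As a rough illustration of how the four parameters interact, the sketch below derives them from an observed min/max range and applies the usual affine mapping `q = round(x / scale) + zero_point`, clamped to `[qmin, qmax]`. `SketchParams` and its methods are hypothetical stand-ins; the actual fields and methods of `params::QuantizationParams` and `CalibrationCollector` are not shown on this page and may differ.

```rust
/// Hypothetical stand-in for the crate's quantization parameters.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SketchParams {
    pub scale: f32,
    pub zero_point: i32,
    pub qmin: i32,
    pub qmax: i32,
}

impl SketchParams {
    /// Derive asymmetric INT8 parameters from an observed [min, max] range
    /// (the simplest, MinMax-style calibration).
    pub fn from_min_max(min: f32, max: f32) -> Self {
        let (qmin, qmax) = (-128i32, 127i32);
        // Widen the range to include 0.0 so that zero is exactly representable.
        let min = min.min(0.0);
        let max = max.max(0.0);
        let range = max - min;
        // Guard against a degenerate (all-zero) range.
        let scale = if range > 0.0 { range / (qmax - qmin) as f32 } else { 1.0 };
        let zero_point = ((qmin as f32 - min / scale).round() as i32).clamp(qmin, qmax);
        Self { scale, zero_point, qmin, qmax }
    }

    /// Map a real value to INT8, saturating at the range limits.
    pub fn quantize(&self, x: f32) -> i8 {
        let q = (x / self.scale).round() as i32 + self.zero_point;
        q.clamp(self.qmin, self.qmax) as i8
    }

    /// Map an INT8 value back to its real approximation.
    pub fn dequantize(&self, q: i8) -> f32 {
        self.scale * (q as i32 - self.zero_point) as f32
    }
}
```

For values inside the calibrated range, a quantize/dequantize round trip should differ from the input by roughly half a `scale` step at most; values outside the range saturate at `qmin` or `qmax`.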
Re-exports§
pub use params::QuantizationParams as QuantParams;
pub use params::QuantizationScheme;
pub use params::QuantizationMode;
pub use tensor::QuantizedTensor;
pub use tensor::QuantizationMetadata;
pub use calibration::CalibrationHistogram;
pub use calibration::QuantizationParams;
pub use calibration::Quantizer;
pub use graph_rewrite::ComputationGraph;
pub use graph_rewrite::GraphNode;
pub use graph_rewrite::NodeParams;
pub use graph_rewrite::NodeType;
pub use graph_rewrite::fuse_batchnorm_to_conv;
pub use graph_rewrite::fuse_relu;
pub use graph_rewrite::fuse_hardswish;
pub use graph_rewrite::fuse_zp_to_bias;
pub use graph_rewrite::generate_hardswish_lut;
pub use graph_rewrite::insert_qdq_nodes;
Modules§
- calibration
- Calibration and Quantization Parameters (ADR-091 Phase 2)
- graph_rewrite
- Graph Rewrite Passes for INT8 Quantization (ADR-091 Phase 3)
- params
- Quantization parameters for INT8 quantization.
- tensor
- Quantized tensor types with metadata.