Module compression

Model compression utilities for neural networks.

This module provides tools for model compression, including:

  • Quantization (post-training and quantization-aware training)
  • Pruning (magnitude-based, structured, and unstructured)
  • Knowledge distillation
  • Model compression analysis and optimization
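
As a rough illustration of what post-training quantization involves, the sketch below maps `f32` values to signed 8-bit integers with an affine (scale and zero-point) scheme. It is a standalone example, not this module's API: `AffineParams`, `params_from_range`, `quantize`, and `dequantize` are hypothetical names, and the choice of asymmetric int8 is an assumption.

```rust
/// Hypothetical stand-in for per-tensor quantization parameters.
/// Not the `QuantizationParams` type exported by this module.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AffineParams {
    pub scale: f32,
    pub zero_point: i32,
}

/// Derive asymmetric int8 parameters from an observed value range.
pub fn params_from_range(min: f32, max: f32) -> AffineParams {
    // Widen the range to include zero so that 0.0 is exactly representable.
    let min = min.min(0.0);
    let max = max.max(0.0);
    let (qmin, qmax) = (-128.0_f32, 127.0_f32);
    let span = max - min;
    // Fall back to a unit scale for a degenerate (all-zero) range.
    let scale = if span > 0.0 { span / (qmax - qmin) } else { 1.0 };
    let zero_point = (qmin - min / scale).round().clamp(qmin, qmax) as i32;
    AffineParams { scale, zero_point }
}

/// Map a real value to int8, saturating at the ends of the integer range.
pub fn quantize(x: f32, p: AffineParams) -> i8 {
    let q = (x / p.scale).round() as i32 + p.zero_point;
    q.clamp(-128, 127) as i8
}

/// Map an int8 value back to an approximate real value.
pub fn dequantize(q: i8, p: AffineParams) -> f32 {
    (q as i32 - p.zero_point) as f32 * p.scale
}
```

In this sketch the range passed to `params_from_range` would typically come from a calibration pass over representative inputs. Values outside that range saturate rather than wrap.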

Structs

AccuracyMetrics
Accuracy measurement metrics
CalibrationStatistics
Statistics collected during calibration
CompressionAnalyzer
Model compression analyzer
CompressionReport
Comprehensive compression analysis report
ModelPruner
Neural network pruner
PostTrainingQuantizer
Post-training quantization manager
QuantizationParams
Quantization parameters for a tensor
SparsityStatistics
Sparsity statistics for a layer
SpeedMetrics
Speed measurement metrics
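
To make the pruning and sparsity items above more concrete, here is a minimal sketch of unstructured magnitude-based pruning on a flat weight slice, together with the sparsity ratio it produces. The functions `prune_by_magnitude` and `sparsity_of` are hypothetical helpers written for this example. They are not methods of `ModelPruner` or fields of `SparsityStatistics`.

```rust
/// Zero out the `sparsity` fraction of weights with the smallest magnitude.
/// Returns how many entries were selected for pruning.
/// Hypothetical helper, not part of this module's API.
pub fn prune_by_magnitude(weights: &mut [f32], sparsity: f32) -> usize {
    let sparsity = sparsity.clamp(0.0, 1.0);
    let k = (weights.len() as f32 * sparsity).floor() as usize;
    if k == 0 {
        return 0;
    }
    // Order the indices by absolute value, smallest first.
    let mut order: Vec<usize> = (0..weights.len()).collect();
    order.sort_by(|&a, &b| weights[a].abs().total_cmp(&weights[b].abs()));
    // Zero the k smallest-magnitude entries in place.
    for &i in &order[..k] {
        weights[i] = 0.0;
    }
    k
}

/// Fraction of entries that are exactly zero (0.0 for an empty slice).
pub fn sparsity_of(weights: &[f32]) -> f32 {
    if weights.is_empty() {
        return 0.0;
    }
    let zeros = weights.iter().filter(|w| **w == 0.0).count();
    zeros as f32 / weights.len() as f32
}
```

Sorting all indices keeps the example short. A selection algorithm would avoid the full sort for large tensors.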

Enums

CalibrationMethod
Quantization calibration method
PruningMethod
Pruning method
QuantizationBits
Quantization precision levels
QuantizationScheme
Quantization scheme
StructuredGranularity
Structured pruning granularity
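
Structured pruning removes whole groups of weights rather than individual entries. The sketch below assumes one possible granularity, entire rows of a row-major weight matrix ranked by L2 norm. The variants that `StructuredGranularity` actually offers are not listed on this page, and `prune_rows_by_l2` is a hypothetical function written only for illustration.

```rust
/// Zero the `rows_to_prune` rows with the smallest L2 norm in a row-major
/// `rows x cols` matrix. Returns the pruned row indices in ascending order.
/// Hypothetical helper, not part of this module's API.
pub fn prune_rows_by_l2(
    weights: &mut [f32],
    rows: usize,
    cols: usize,
    rows_to_prune: usize,
) -> Vec<usize> {
    assert_eq!(weights.len(), rows * cols, "shape mismatch");
    // Compute the L2 norm of every row.
    let mut norms: Vec<(usize, f32)> = (0..rows)
        .map(|r| {
            let row = &weights[r * cols..(r + 1) * cols];
            (r, row.iter().map(|w| w * w).sum::<f32>().sqrt())
        })
        .collect();
    // Weakest rows first.
    norms.sort_by(|a, b| a.1.total_cmp(&b.1));
    let mut pruned: Vec<usize> = norms
        .iter()
        .take(rows_to_prune)
        .map(|&(r, _)| r)
        .collect();
    pruned.sort_unstable();
    // Zero each selected row as a unit.
    for &r in &pruned {
        weights[r * cols..(r + 1) * cols].fill(0.0);
    }
    pruned
}
```

Because whole rows become zero, the matrix can later be shrunk to a smaller dense shape, which is the usual motivation for structured over unstructured pruning.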