Quantization support for model compression and acceleration in RusTorch (Phase 11 implementation).
This module provides quantization support for deep learning models, enabling efficient inference and training with reduced-precision arithmetic.
§Key Features
§Dynamic Quantization
- Runtime quantization of weights and activations
- Automatic calibration using statistical observers
- Per-tensor and per-channel quantization schemes
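The difference between per-tensor and per-channel schemes is easiest to see with a small observer. The following is a minimal sketch in plain Rust, assuming symmetric int8 quantization; `MinMax`, `per_tensor_scale`, and `per_channel_scales` are hypothetical names for illustration and are not RusTorch's `MinMaxObserver` or any other crate API.

```rust
/// Running min/max observer; a hypothetical stand-in, not RusTorch's `MinMaxObserver`.
#[derive(Debug, Clone, Copy)]
pub struct MinMax {
    pub min: f32,
    pub max: f32,
}

impl MinMax {
    pub fn new() -> Self {
        MinMax { min: f32::INFINITY, max: f32::NEG_INFINITY }
    }

    /// Fold a batch of values into the running range.
    pub fn observe(&mut self, values: &[f32]) {
        for &v in values {
            self.min = self.min.min(v);
            self.max = self.max.max(v);
        }
    }

    /// Symmetric int8 scale derived from the observed range.
    /// Falls back to 1.0 when nothing (or only zeros) has been observed.
    pub fn symmetric_scale(&self) -> f32 {
        let max_abs = self.min.abs().max(self.max.abs());
        if max_abs.is_finite() && max_abs > 0.0 { max_abs / 127.0 } else { 1.0 }
    }
}

/// Per-tensor: one scale shared by every element.
pub fn per_tensor_scale(data: &[f32]) -> f32 {
    let mut obs = MinMax::new();
    obs.observe(data);
    obs.symmetric_scale()
}

/// Per-channel: one scale per row of a row-major [channels, len] matrix.
pub fn per_channel_scales(data: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0 && !data.is_empty() && data.len() % channels == 0);
    let len = data.len() / channels;
    data.chunks(len).map(per_tensor_scale).collect()
}
```

With rows of very different magnitude, the per-channel scales keep the small row from being crushed by the large row's range, at the cost of storing one scale per channel.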
§Static Quantization
- Pre-calibrated quantization parameters
- Optimal for deployment scenarios
- Hardware-accelerated operations
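What distinguishes the static path is that parameters are computed once from representative data and then frozen. A rough sketch of that idea follows, assuming asymmetric int8 quantization; `FrozenParams`, `calibrate`, and `quantize_static` are hypothetical helpers and do not reflect the signature of RusTorch's `StaticQuantizer`.

```rust
/// Frozen int8 parameters produced by calibration; hypothetical type for this sketch.
#[derive(Debug, Clone, Copy)]
pub struct FrozenParams {
    pub scale: f32,
    pub zero_point: i32,
}

/// Calibrate once over representative batches (asymmetric, range mapped to [-128, 127]).
pub fn calibrate(batches: &[Vec<f32>]) -> FrozenParams {
    // Start from 0.0 so that zero is always inside the calibrated range.
    let (mut min, mut max) = (0.0f32, 0.0f32);
    for &v in batches.iter().flatten() {
        min = min.min(v);
        max = max.max(v);
    }
    let range = max - min;
    // Guard against a degenerate range, which would give scale = 0.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let zero_point = (-128.0 - min / scale).round() as i32;
    FrozenParams { scale, zero_point }
}

/// Inference-time quantization: no statistics are gathered here, and values
/// outside the calibrated range saturate at the int8 limits.
pub fn quantize_static(input: &[f32], p: FrozenParams) -> Vec<i8> {
    input
        .iter()
        .map(|&v| ((v / p.scale).round() as i32 + p.zero_point).clamp(-128, 127) as i8)
        .collect()
}
```

Because the parameters never change after calibration, an input larger than anything seen during calibration is clipped rather than rescaled, which is why the calibration set should be representative of deployment data.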
§Quantization-Aware Training (QAT)
- Training with quantization simulation
- Straight-through estimators for gradients
- Fine-tuning of quantized models
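Quantization simulation and the straight-through estimator can be sketched for a single scalar. This is an illustrative sketch only, assuming int8 levels and the common variant of the estimator that zeroes the gradient where the forward pass clamped; `fake_quantize` and `fake_quantize_grad` are hypothetical free functions, not the API of the `FakeQuantize` type re-exported by this module.

```rust
/// Forward pass: quantize to int8 levels, then immediately dequantize,
/// so downstream layers see the rounding error while staying in f32.
pub fn fake_quantize(x: f32, scale: f32, zero_point: i32) -> f32 {
    let q = ((x / scale).round() as i32 + zero_point).clamp(-128, 127);
    (q - zero_point) as f32 * scale
}

/// Backward pass (straight-through estimator): treat round() as the identity,
/// so the upstream gradient passes through unchanged inside the representable
/// range and is zeroed where the forward pass saturated.
pub fn fake_quantize_grad(x: f32, scale: f32, zero_point: i32, upstream: f32) -> f32 {
    let lo = (-128 - zero_point) as f32 * scale;
    let hi = (127 - zero_point) as f32 * scale;
    if x >= lo && x <= hi { upstream } else { 0.0 }
}
```

The true derivative of `round` is zero almost everywhere, which would stop learning entirely; the estimator replaces it with 1 so that weights can still be updated during fine-tuning.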
§Hardware Optimization
- CPU SIMD optimizations for quantized operations
- CUDA kernels for GPU acceleration
- Metal Performance Shaders for Apple Silicon
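For orientation, the kind of operation these backends accelerate is an integer multiply-accumulate followed by a single rescale. Below is a scalar reference sketch, assuming both inputs are symmetric int8 (zero point 0); `quantized_dot` is a hypothetical function and says nothing about how the crate's SIMD, CUDA, or Metal kernels are written.

```rust
/// Dot product of two symmetric int8 vectors (zero_point = 0 for both),
/// accumulated in i32 and rescaled to f32 once at the end.
/// Very long vectors would need a wider accumulator or blocking to avoid overflow.
pub fn quantized_dot(a: &[i8], b: &[i8], scale_a: f32, scale_b: f32) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    let acc: i32 = a
        .iter()
        .zip(b)
        .map(|(&x, &y)| x as i32 * y as i32)
        .sum();
    acc as f32 * (scale_a * scale_b)
}
```

Keeping the inner loop in integers and applying the combined scale only once is what makes the operation a good fit for wide integer vector instructions.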
§Quantization Schemes
§Symmetric Quantization
quantized = round(fp32_value / scale)
dequantized = quantized * scale
§Asymmetric Quantization
quantized = round(fp32_value / scale) + zero_point
dequantized = (quantized - zero_point) * scale
§Usage Examples
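Before the crate-level example, the scheme formulas can be exercised on their own in plain Rust. The sketch below assumes an int8 target and the usual convention that symmetric quantization fixes `zero_point` at 0 while asymmetric quantization derives it from the observed range; `QParams` and the four functions are hypothetical and are not part of the RusTorch API.

```rust
/// Quantization parameters; a hypothetical struct for this sketch only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct QParams {
    pub scale: f32,
    pub zero_point: i32,
}

/// Symmetric int8 parameters: zero_point is fixed at 0, range is [-127, 127].
pub fn symmetric_params(min: f32, max: f32) -> QParams {
    let max_abs = min.abs().max(max.abs());
    // Guard against an all-zero tensor, which would give scale = 0.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    QParams { scale, zero_point: 0 }
}

/// Asymmetric int8 parameters: [min, max] is mapped onto [-128, 127].
pub fn asymmetric_params(min: f32, max: f32) -> QParams {
    // Widen the range to include 0.0 so that zero is exactly representable.
    let (min, max) = (min.min(0.0), max.max(0.0));
    let range = max - min;
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let zero_point = (-128.0 - min / scale).round() as i32;
    QParams { scale, zero_point }
}

/// quantized = round(value / scale) + zero_point, clamped to the i8 range.
pub fn quantize(value: f32, p: QParams) -> i8 {
    let q = (value / p.scale).round() as i32 + p.zero_point;
    q.clamp(-128, 127) as i8
}

/// dequantized = (quantized - zero_point) * scale
pub fn dequantize(q: i8, p: QParams) -> f32 {
    (q as i32 - p.zero_point) as f32 * p.scale
}
```

With `zero_point = 0` the general formulas reduce to the symmetric ones, so a single `quantize`/`dequantize` pair covers both schemes. For in-range inputs the round-trip error is at most half a quantization step.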
use rustorch::quantization::{QuantizedTensor, QuantizationScheme, StaticQuantizer, TensorQuantization};
use rustorch::tensor::Tensor;

// The `?` operators below assume an enclosing function that returns a compatible `Result`.

// Dynamic quantization: parameters are derived from the tensor at call time
let tensor: Tensor<f32> = Tensor::randn(&[128, 256]);
let quantized = tensor.quantize_dynamic(QuantizationScheme::Symmetric)?;

// Static quantization with calibration
let mut quantizer = StaticQuantizer::<f32>::new();
quantizer.calibrate(QuantizationScheme::Symmetric)?;
Re-exports§
pub use calibration::HistogramObserver;
pub use calibration::MinMaxObserver;
pub use calibration::Observer;
pub use calibration::StaticQuantizer;
pub use hardware::optimized_ops;
pub use operations::DequantizeOps;
pub use operations::QuantizedOps;
pub use qat::FakeQuantize;
pub use qat::QATConv2d;
pub use qat::QATLinear;
pub use qat::QATModule;
pub use schemes::AsymmetricQuantization;
pub use schemes::QuantizationScheme;
pub use schemes::SymmetricQuantization;
pub use types::QuantizationType;
pub use types::QuantizedTensor;
Modules§
- calibration: Calibration and statistical observation for quantization
- hardware: Hardware-specific optimizations for quantized operations
- observers: Statistical observers for quantization calibration
- operations: Quantized tensor operations and arithmetic
- qat: Quantization-Aware Training (QAT) support
- schemes: Quantization schemes and algorithms
- types: Quantized tensor data types and core structures
Structs§
- QuantParamCalculator: Unified quantization parameter calculator
- QuantizationConfig: Global quantization configuration
Traits§
- Quantizable: Trait for quantizable data types
- TensorQuantization: Main quantization API for tensors