§ToRSh Quantization Library
A comprehensive quantization library for deep learning tensor operations, providing state-of-the-art quantization algorithms, configuration management, performance metrics, and utility functions.
§Key Features
- Multiple Quantization Schemes: INT8, INT4, binary, ternary, group-wise quantization (see the note after this list)
- Advanced Observers: MinMax, Histogram, Percentile, MovingAverage calibration
- Backend Support: Native, FBGEMM, QNNPACK for optimized execution
- Comprehensive Metrics: PSNR, SNR, compression ratio analysis
- Configuration Tools: Builder patterns, validation, JSON serialization
- Utility Functions: Batch processing, error diagnostics, auto-calibration
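For reference, the specialized schemes differ mainly in how far they restrict the representable values: binary quantization typically maps weights to {-1, +1}, ternary quantization to {-1, 0, +1}, and group-wise quantization assigns a separate scale to each group of elements rather than one per tensor or per channel.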
§Architecture
The library is organized into specialized modules:
- config: Configuration types and builder patterns
- algorithms: Core quantization and dequantization algorithms
- observers: Calibration system for parameter estimation
- specialized: Advanced algorithms (INT4, binary, ternary, group-wise)
- metrics: Performance analysis and benchmarking tools
- utils: Utility functions for validation, batch processing, and reporting
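The public items from these modules are re-exported at the crate root (see the Re-exports list below), so the examples on this page import types and functions directly from torsh_quantization rather than from the individual submodules.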
§Quick Start
use torsh_quantization::{QuantConfig, quantize_with_config};
use torsh_tensor::creation::tensor_1d;
// Create a simple quantization configuration
let config = QuantConfig::int8();
// Create a tensor to quantize
let data = vec![0.0, 1.0, 2.0, 3.0];
let tensor = tensor_1d(&data).unwrap();
// Quantize the tensor
let (quantized, scale, zero_point) = quantize_with_config(&tensor, &config).unwrap();
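The returned scale and zero_point define the affine mapping between floating-point values and their 8-bit codes, i.e. x ≈ scale * (q - zero_point). As a rough, self-contained illustration of that math (plain arithmetic only, not a crate API; the exact range convention the library uses may differ):
// Hypothetical by-hand computation for the data above, assuming asymmetric
// quantization of the observed range [0.0, 3.0] onto the signed 8-bit range.
let (min, max) = (0.0f32, 3.0f32);
let (qmin, qmax) = (-128i32, 127i32);
let scale = (max - min) / (qmax - qmin) as f32;        // step size per integer code
let zero_point = qmin - (min / scale).round() as i32;  // code that maps back to 0.0
let quantize = |x: f32| ((x / scale).round() as i32 + zero_point).clamp(qmin, qmax);
let dequantize = |q: i32| (q - zero_point) as f32 * scale;
assert!((dequantize(quantize(2.0)) - 2.0).abs() <= scale / 2.0); // round-trip error within half a step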
§Advanced Usage
§Custom Configuration
use torsh_quantization::{QuantConfig, ObserverType, QuantBackend};
let config = QuantConfig::int8()
.with_observer(ObserverType::Histogram)
.with_backend(QuantBackend::Fbgemm);
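Here the observer determines how calibration statistics are collected before quantization parameters are chosen (a histogram of observed values rather than a simple min/max), and the backend selects the execution engine used for the quantized kernels (FBGEMM in this example).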
§Batch Processing
use torsh_quantization::{quantize_batch_consistent, QuantConfig};
use torsh_tensor::creation::tensor_1d;
let tensor1 = tensor_1d(&[0.0, 1.0, 2.0]).unwrap();
let tensor2 = tensor_1d(&[1.0, 2.0, 3.0]).unwrap();
let tensor3 = tensor_1d(&[2.0, 3.0, 4.0]).unwrap();
let tensors = vec![&tensor1, &tensor2, &tensor3];
let config = QuantConfig::int8();
let results = quantize_batch_consistent(&tensors, &config).unwrap();
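As the name suggests, quantize_batch_consistent is intended to quantize a group of related tensors with one shared set of quantization parameters, so that their quantized values remain directly comparable; the parameter-selection strategy comes from the supplied QuantConfig.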
§Performance Analysis
use torsh_quantization::{compare_quantization_configs, QuantConfig};
use torsh_tensor::creation::tensor_1d;
let tensor = tensor_1d(&[0.0, 1.0, 2.0, 3.0]).unwrap();
let configs = vec![
QuantConfig::int8(),
QuantConfig::int4(),
QuantConfig::per_channel(0),
];
let comparison = compare_quantization_configs(&tensor, &configs).unwrap();
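The comparison builds on the metrics listed under Key Features (PSNR, SNR, compression ratio), which follow their standard definitions. A minimal, self-contained sketch of those definitions in plain Rust (illustrative only, not the crate's metrics API):
// Peak signal-to-noise ratio between the original tensor values and their
// dequantized reconstruction; higher is better.
fn psnr(original: &[f32], reconstructed: &[f32]) -> f32 {
    let mse: f32 = original
        .iter()
        .zip(reconstructed)
        .map(|(a, b)| (a - b).powi(2))
        .sum::<f32>()
        / original.len() as f32;
    let peak = original.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    10.0 * (peak * peak / mse).log10()
}
// Compression ratio is simply original bits over quantized bits,
// e.g. 32 / 8 = 4x for f32 weights stored as INT8.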
§Export Support
The library supports exporting quantized models to various formats:
- ONNX: Industry-standard format for cross-platform deployment
- TensorRT: NVIDIA’s high-performance inference engine
- TensorFlow Lite: Mobile and edge deployment
- Core ML: Apple’s machine learning framework
- Custom formats: Extensible architecture for new backends
§Re-exports
pub use config::*;
pub use algorithms::*;
pub use observers::*;
pub use specialized::*;
pub use metrics::*;
pub use analysis::*;
pub use utils::*;
pub use memory_pool::*;
pub use simd_ops::*;
pub use quantum::*;
pub use quantum_enhanced::*;
pub use benchmarks::*;
§Modules
- algorithms: Core quantization algorithms and tensor operations
- analysis: Analysis tools for quantization
- benchmarks: Comprehensive Benchmark Suite for Quantization
- config: Quantization configuration types and builders
- memory_pool: Memory Pool Management for Quantization
- metrics: Quantization quality metrics and analysis tools
- observers: Observer implementations for quantization parameter calibration
- quantum: Quantum-Inspired Quantization Techniques
- quantum_enhanced: Enhanced Quantum-Inspired Quantization with Advanced Algorithms
- simd_ops: SIMD-accelerated quantization operations
- specialized: Specialized quantization algorithms for advanced use cases
- utils: Quantization utilities and helper functions
§Structs
- Tensor: The main Tensor type for ToRSh
§Enums
- DType: Supported data types for tensors
- TorshError: Main ToRSh error enum - unified interface to all error types
§Type Aliases
- TorshResult: Result type alias for ToRSh operations