Model Quantization
Techniques for reducing model size and improving inference speed by quantizing weights and activations to lower-precision formats.
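To make the idea concrete, here is a minimal sketch of 8-bit affine (asymmetric) quantization, one common lower-precision format. It is an illustration only: `AffineParams`, `affine_params`, `quantize`, and `dequantize` are hypothetical names invented for this sketch, not items exported by this module, and the module's actual types may work differently.

```rust
// Hypothetical sketch: none of these names are part of this crate's API.

/// Scale and zero point for asymmetric (affine) 8-bit quantization.
#[derive(Debug, Clone, Copy)]
struct AffineParams {
    scale: f32,
    zero_point: i32,
}

/// Derive parameters so that the value range of `values` maps onto 0..=255.
fn affine_params(values: &[f32]) -> AffineParams {
    // Start both folds at 0.0 so the range always contains zero,
    // which keeps 0.0 exactly representable after quantization.
    let min = values.iter().copied().fold(0.0_f32, f32::min);
    let max = values.iter().copied().fold(0.0_f32, f32::max);
    let range = max - min;
    // Guard against an all-zero input, which would otherwise give a zero scale.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let zero_point = (-min / scale).round() as i32;
    AffineParams { scale, zero_point }
}

/// Map each f32 to a u8 code: q = round(v / scale) + zero_point, clamped to 0..=255.
fn quantize(values: &[f32], p: AffineParams) -> Vec<u8> {
    values
        .iter()
        .map(|&v| {
            let q = (v / p.scale).round() as i32 + p.zero_point;
            q.clamp(0, 255) as u8
        })
        .collect()
}

/// Recover approximate f32 values: v ≈ (q - zero_point) * scale.
fn dequantize(codes: &[u8], p: AffineParams) -> Vec<f32> {
    codes
        .iter()
        .map(|&q| (q as i32 - p.zero_point) as f32 * p.scale)
        .collect()
}
```

Each value is stored in one byte instead of four, at the cost of a rounding error of roughly half of `scale` per element for values inside the calibrated range.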
Structs
- DynamicQuantization - Dynamic quantization
- QuantizationAwareTraining - Quantization-aware training (QAT)
- QuantizationConfig - Quantization configuration
- QuantizedTensor - Quantized tensor representation
Enums
- QuantizationScheme - Quantization scheme