Module quantization

Model Quantization

Techniques for reducing model size and improving inference speed by quantizing weights and activations to lower-precision formats.
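As a rough illustration of what quantizing to a lower-precision format involves, the sketch below applies symmetric per-tensor 8-bit quantization to a slice of `f32` values. It is not this module's API: `Int8Tensor`, `quantize_symmetric`, and `dequantize` are hypothetical names introduced only for this example, and the items listed on this page may use a different scheme and layout.

```rust
/// Hypothetical container for illustration only; not the module's `QuantizedTensor`.
#[derive(Debug, Clone, PartialEq)]
pub struct Int8Tensor {
    /// Quantized values, one byte each instead of four.
    pub values: Vec<i8>,
    /// Step size: real_value is approximately quantized_value * scale.
    pub scale: f32,
}

/// Symmetric per-tensor quantization (an assumed scheme):
/// maps the range [-max_abs, max_abs] onto the integers [-127, 127].
pub fn quantize_symmetric(data: &[f32]) -> Int8Tensor {
    // Largest magnitude in the tensor determines the step size.
    let max_abs = data.iter().fold(0.0_f32, |m, x| m.max(x.abs()));
    // Fall back to a scale of 1.0 for empty or all-zero input to avoid dividing by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let values = data
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    Int8Tensor { values, scale }
}

/// Recovers approximate `f32` values from the quantized representation.
pub fn dequantize(t: &Int8Tensor) -> Vec<f32> {
    t.values.iter().map(|&q| q as f32 * t.scale).collect()
}
```

Calling `quantize_symmetric` and then `dequantize` returns values within about half a quantization step (`scale / 2`) of the originals, while the stored values take one byte each rather than four. That trade between precision and size is what the techniques in this module manage.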

Structs

DynamicQuantization
Dynamic quantization
QuantizationAwareTraining
Quantization-aware training (QAT)
QuantizationConfig
Quantization configuration
QuantizedTensor
Quantized tensor representation

Enums

QuantizationScheme
Quantization scheme