Module weight_quantization

Quantization utilities for model weights

Functions

quantize_embeddings
Quantize embedding weights
quantize_linear_weights
Quantize linear layer weights (output channel quantization)
quantize_weights
Quantize model weights to INT8
quantize_weights_per_channel
Quantize model weights with per-channel quantization
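
The listing above does not show signatures, so the following is only a minimal sketch of what per-tensor and per-output-channel INT8 quantization typically look like, assuming symmetric quantization with a scale of max(|w|) / 127. The helper names quantize_int8_per_tensor and quantize_int8_per_channel, the flat row-major layout, and the return types are all hypothetical; they are not this module's API.

```rust
/// Hypothetical helper (not this module's API): symmetric per-tensor INT8
/// quantization. Returns the quantized values and a single scale such that
/// each weight is approximately `q as f32 * scale`.
fn quantize_int8_per_tensor(weights: &[f32]) -> (Vec<i8>, f32) {
    // Largest absolute value determines the quantization range.
    let max_abs = weights.iter().fold(0.0f32, |m, w| m.max(w.abs()));
    // Fall back to a scale of 1.0 for an all-zero (or empty) tensor to avoid dividing by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let q = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Hypothetical helper (not this module's API): per-output-channel
/// quantization of a row-major matrix with `cols` values per row. Each row
/// (assumed here to be one output channel) gets its own scale.
fn quantize_int8_per_channel(weights: &[f32], cols: usize) -> (Vec<i8>, Vec<f32>) {
    assert!(cols > 0 && weights.len() % cols == 0, "length must be a multiple of cols");
    let mut q = Vec::with_capacity(weights.len());
    let mut scales = Vec::with_capacity(weights.len() / cols);
    for row in weights.chunks(cols) {
        let (row_q, scale) = quantize_int8_per_tensor(row);
        q.extend(row_q);
        scales.push(scale);
    }
    (q, scales)
}

fn main() {
    // Two output channels with very different magnitudes.
    let w = [1.0f32, -2.0, 0.0, 0.5, 0.01, -0.02, 0.0, 0.005];

    let (q, scale) = quantize_int8_per_tensor(&w);
    println!("per-tensor:  scale = {scale}, q = {q:?}");

    let (qc, scales) = quantize_int8_per_channel(&w, 4);
    println!("per-channel: scales = {scales:?}, q = {qc:?}");
}
```

In this sketch the second row is much smaller in magnitude than the first, so a single per-tensor scale rounds most of it to zero, while a per-row scale lets it use the full INT8 range. That is the usual motivation for a per-channel variant alongside a per-tensor one.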