Quantization utilities for model weights
Functions
- quantize_embeddings - Quantize embedding weights
- quantize_linear_weights - Quantize linear layer weights (output channel quantization)
- quantize_weights - Quantize model weights to INT8
- quantize_weights_per_channel - Quantize model weights with per-channel quantization
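
The signatures of these functions are not shown on this page, so the following is only a minimal sketch of what INT8 and per-channel quantization commonly look like, not this module's implementation. The names `quantize_int8_sketch` and `quantize_int8_per_channel_sketch`, the symmetric scheme (scale = max |w| / 127), and the row-major `[out_channels, in_features]` layout are all assumptions made for illustration.

```rust
/// Hypothetical helper (not this crate's API): symmetric per-tensor INT8
/// quantization. Returns the quantized values and a scale such that each
/// original weight is approximately `q as f32 * scale`.
fn quantize_int8_sketch(weights: &[f32]) -> (Vec<i8>, f32) {
    // Largest magnitude in the tensor determines the scale.
    let max_abs = weights.iter().fold(0.0_f32, |m, w| m.max(w.abs()));
    // Fall back to a scale of 1.0 for an all-zero (or empty) tensor to avoid dividing by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let q: Vec<i8> = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Hypothetical helper (not this crate's API): per-channel variant.
/// Assumes `weights` is row-major with shape [out_channels, in_features]
/// and computes one scale per output channel (row).
fn quantize_int8_per_channel_sketch(
    weights: &[f32],
    out_channels: usize,
) -> (Vec<i8>, Vec<f32>) {
    assert!(!weights.is_empty(), "weights must not be empty");
    assert!(
        out_channels > 0 && weights.len() % out_channels == 0,
        "weights length must be a multiple of out_channels"
    );
    let in_features = weights.len() / out_channels;
    let mut q = Vec::with_capacity(weights.len());
    let mut scales = Vec::with_capacity(out_channels);
    for row in weights.chunks(in_features) {
        // Each row gets its own scale, so small-magnitude channels keep precision.
        let (row_q, scale) = quantize_int8_sketch(row);
        q.extend(row_q);
        scales.push(scale);
    }
    (q, scales)
}
```

Under these assumptions, the per-channel form differs from the per-tensor form only in how many scales are stored: a row whose largest weight is 0.127 still uses the full INT8 range instead of being squeezed by a larger row elsewhere in the matrix.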