Module quantization

Quantization Support for Model Compression

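This module provides quantization techniques that compress knowledge graph embeddings by reducing precision from float32 to int8 or int4. Storing each value in 8 or 4 bits instead of 32 cuts embedding storage by roughly 4x or 8x respectively, before the small overhead of the stored quantization parameters, and can improve inference speed.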
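
To make the float32-to-int8 reduction concrete, here is a minimal sketch of symmetric per-tensor int8 quantization. It does not use this module's API: `SketchParams`, `compute_params`, `quantize`, and `dequantize` are hypothetical names introduced only for illustration, and the actual `ModelQuantizer` may use a different scheme (for example asymmetric ranges or per-channel scales).

```rust
/// Hypothetical parameters for symmetric int8 quantization.
/// This is an illustrative type, not this module's `QuantizationParams`.
#[derive(Debug, Clone, Copy)]
struct SketchParams {
    /// Size of one integer step in float units.
    scale: f32,
}

/// Choose a scale so that the largest magnitude in `values` maps to 127.
fn compute_params(values: &[f32]) -> SketchParams {
    let max_abs = values.iter().fold(0.0_f32, |m, v| m.max(v.abs()));
    // Fall back to 1.0 so an all-zero tensor does not produce a zero scale.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    SketchParams { scale }
}

/// Map each float to the nearest integer step, clamped to [-127, 127].
fn quantize(values: &[f32], p: SketchParams) -> Vec<i8> {
    values
        .iter()
        .map(|v| (v / p.scale).round().clamp(-127.0, 127.0) as i8)
        .collect()
}

/// Recover approximate floats from the stored integers.
fn dequantize(q: &[i8], p: SketchParams) -> Vec<f32> {
    q.iter().map(|&x| x as f32 * p.scale).collect()
}

fn main() {
    let embedding = [0.5_f32, -1.0, 0.25, 0.0];
    let p = compute_params(&embedding);
    let q = quantize(&embedding, p);
    let restored = dequantize(&q, p);
    println!("{:?} -> {:?} -> {:?}", embedding, q, restored);
}
```

Each restored value differs from the original by at most about half a step (`scale / 2`), which is the precision traded for storing one byte per value instead of four.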

Structs§

ModelQuantizer
Model quantizer
QuantizationConfig
Quantization configuration
QuantizationParams
Quantization parameters
QuantizationStats
Quantization statistics
QuantizedTensor
Quantized tensor representation

Enums§

BitWidth
Quantization bit width
QuantizationScheme
Quantization scheme