pub struct FakeQuantize {
    pub bits: u32,
    pub symmetric: bool,
    pub scale: f32,
    pub zero_point: i32,
    pub enabled: bool,
}
Fake quantization operator for quantization-aware training (QAT).
Holds the current scale and zero-point, which an associated observer updates during calibration or training.
Fields
bits: u32
Quantization bit-width.
symmetric: bool
Whether to use symmetric quantization (zp = 0).
scale: f32
Current quantization scale (must be > 0).
zero_point: i32
Current zero-point.
enabled: bool
Whether fake quantization is enabled. When disabled, forward returns the input unchanged.
Implementations
impl FakeQuantize
pub fn new(bits: u32, symmetric: bool, scale: f32, zero_point: i32) -> QuantResult<Self>
Create a new fake quantizer with the given scale and zero-point.
Errors
QuantError::InvalidBitWidth: bits is 0 or > 16.
QuantError::InvalidScale: scale is ≤ 0 or non-finite.
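The documented constraints can be illustrated with a small standalone check. This is only a sketch: the `QuantError` enum below is a hypothetical stand-in that models just the two variants named above (the crate's real variants may carry payloads), `validate_params` is an illustrative helper rather than part of this API, and the order in which the two checks run is an assumption.

```rust
/// Hypothetical stand-in for the crate's error type; only the two
/// variants named in the documentation are modelled here.
#[derive(Debug, PartialEq)]
pub enum QuantError {
    InvalidBitWidth,
    InvalidScale,
}

/// Illustrative helper (not part of the documented API): checks that
/// `bits` is in 1..=16 and that `scale` is finite and strictly positive.
pub fn validate_params(bits: u32, scale: f32) -> Result<(), QuantError> {
    if bits == 0 || bits > 16 {
        return Err(QuantError::InvalidBitWidth);
    }
    // `is_finite` rejects NaN and both infinities; `<= 0.0` rejects
    // zero and negative scales.
    if !scale.is_finite() || scale <= 0.0 {
        return Err(QuantError::InvalidScale);
    }
    Ok(())
}
```

Under these assumptions, `validate_params(8, 0.1)` succeeds, while `validate_params(17, 0.1)` and `validate_params(8, f32::NAN)` fail with the bit-width and scale errors respectively.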
pub fn with_defaults(bits: u32, symmetric: bool) -> QuantResult<Self>
Create a fake quantizer for the given bit-width with the default scale (1.0) and zero-point (0).
Errors
QuantError::InvalidBitWidth: bits is 0 or > 16.
pub fn update_params(&mut self, scale: f32, zero_point: i32) -> QuantResult<()>
Update scale and zero-point (e.g., from an observer).
Errors
QuantError::InvalidScale: scale is ≤ 0 or non-finite.
pub fn quant_range(&self) -> (i32, i32)
Integer quantization bounds [q_min, q_max].
pub fn float_range(&self) -> (f32, f32)
Float clipping bounds [x_min, x_max] corresponding to the integer range.
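The relationship between the integer and float bounds can be sketched as follows. The page does not state how q_min and q_max are derived, so the convention used here is an assumption: a signed range for symmetric mode and an unsigned range for asymmetric mode. The free functions below are illustrative stand-ins, not the methods documented above; the float bounds follow from the usual affine mapping x = (q - zero_point) * scale.

```rust
/// Integer bounds under an ASSUMED convention (not confirmed by the docs):
/// symmetric  -> signed range   [-2^(bits-1), 2^(bits-1) - 1]
/// asymmetric -> unsigned range [0, 2^bits - 1]
/// `bits` is expected to be in 1..=16, so the shifts cannot overflow i32.
pub fn quant_range(bits: u32, symmetric: bool) -> (i32, i32) {
    if symmetric {
        let half = 1i32 << (bits - 1);
        (-half, half - 1)
    } else {
        (0, (1i32 << bits) - 1)
    }
}

/// Float bounds obtained by dequantizing the integer bounds:
/// x = (q - zero_point) * scale.
pub fn float_range(bits: u32, symmetric: bool, scale: f32, zero_point: i32) -> (f32, f32) {
    let (q_min, q_max) = quant_range(bits, symmetric);
    (
        (q_min - zero_point) as f32 * scale,
        (q_max - zero_point) as f32 * scale,
    )
}
```

With these assumptions, an 8-bit asymmetric quantizer with scale 0.5 and zero-point 0 covers the integers 0 to 255 and the floats 0.0 to 127.5. Some libraries instead use a restricted symmetric range such as -127 to 127; the real bounds should be read from quant_range itself.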
pub fn forward(&self, x: &[f32]) -> Vec<f32>
Forward pass: quantize-then-dequantize.
If enabled = false, returns the input unchanged.
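A quantize-then-dequantize pass can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the crate's implementation: `fake_quantize` is a hypothetical name, the integer bounds are passed in explicitly, and rounding uses `f32::round` (half away from zero), which may differ from the rounding mode the crate uses.

```rust
/// Illustrative quantize-then-dequantize (hypothetical helper):
///   q     = clamp(round(x / scale) + zero_point, q_min, q_max)
///   x_hat = (q - zero_point) * scale
pub fn fake_quantize(
    x: &[f32],
    scale: f32,
    zero_point: i32,
    q_min: i32,
    q_max: i32,
) -> Vec<f32> {
    let zp = zero_point as f32;
    x.iter()
        .map(|&v| {
            // Quantize: scale, round to the nearest integer level, shift.
            let q = (v / scale).round() + zp;
            // Clip to the representable integer range.
            let q = q.clamp(q_min as f32, q_max as f32);
            // Dequantize back to a float on the quantization grid.
            (q - zp) * scale
        })
        .collect()
}
```

For example, with scale 0.5, zero-point 0, and bounds 0 to 255, the input 0.26 snaps to 0.5, -1.0 is clipped to 0.0, and 1000.0 is clipped to 127.5.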
pub fn backward(&self, grad_output: &[f32], x: &[f32]) -> QuantResult<Vec<f32>>
Backward pass (Straight-Through Estimator).
Passes grad_output through where x is inside the representable
float range; zeros the gradient where x is clipped.
Errors
QuantError::DimensionMismatch: grad_output and x lengths differ.
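The straight-through behaviour described above can be sketched as a gradient mask. The function below is a hypothetical standalone helper: it takes the float clipping bounds as arguments, treats both bounds as inclusive (an assumption), and reports a length mismatch as a plain `String` in place of the crate's QuantError::DimensionMismatch.

```rust
/// Illustrative straight-through estimator (hypothetical helper):
/// the gradient passes through unchanged where x lies in [x_min, x_max]
/// and is zeroed where x falls outside that range.
pub fn ste_backward(
    grad_output: &[f32],
    x: &[f32],
    x_min: f32,
    x_max: f32,
) -> Result<Vec<f32>, String> {
    // Stand-in for the documented DimensionMismatch error.
    if grad_output.len() != x.len() {
        return Err(format!(
            "length mismatch: grad_output has {}, x has {}",
            grad_output.len(),
            x.len()
        ));
    }
    Ok(grad_output
        .iter()
        .zip(x)
        .map(|(&g, &v)| if v >= x_min && v <= x_max { g } else { 0.0 })
        .collect())
}
```

With bounds 0.0 to 127.5, gradients [1.0, 2.0, 3.0] at inputs [-1.0, 0.5, 200.0] become [0.0, 2.0, 0.0]: only the in-range element keeps its gradient.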
pub fn quantization_noise(&self, x: &[f32]) -> f32
Estimate quantization noise (MSE between input and fake-quantized output).
Useful for measuring quantization error at the current scale/zp.
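The noise estimate is a mean squared error between each input and its fake-quantized value, which can be sketched as below. This is an illustration only: `quantization_mse` is a hypothetical helper, the quantization step is inlined with `f32::round` (half away from zero), and returning 0.0 for an empty slice is an assumption rather than documented behaviour.

```rust
/// Illustrative MSE between x and its fake-quantized version
/// (hypothetical helper; parameters are passed explicitly).
pub fn quantization_mse(
    x: &[f32],
    scale: f32,
    zero_point: i32,
    q_min: i32,
    q_max: i32,
) -> f32 {
    // Assumption: an empty input has zero noise (avoids dividing by zero).
    if x.is_empty() {
        return 0.0;
    }
    let zp = zero_point as f32;
    let sum_sq: f32 = x
        .iter()
        .map(|&v| {
            // Quantize, clip, and dequantize a single element.
            let q = ((v / scale).round() + zp).clamp(q_min as f32, q_max as f32);
            let v_hat = (q - zp) * scale;
            (v - v_hat) * (v - v_hat)
        })
        .sum();
    sum_sq / x.len() as f32
}
```

Comparing this value across candidate scales is one way to see how the error grows as the grid becomes coarser or as more inputs are clipped.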
Trait Implementations
impl Clone for FakeQuantize
fn clone(&self) -> FakeQuantize
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.