Lossy conversion from floating point to a smaller integer type with a fixed range.
Quantize an f64 to a byte and back again
use numquant::{Quantize, Quantized, U8};
let original = 500.0;
// Quantize the value into a byte.
// Quantization supports inputs between 0 and 1000.
let quantized = Quantized::<U8<0, 1000>>::from_f64(original);
// Convert it back to an f64
let dequantized = quantized.to_f64();
// The conversion isn't lossless, but the dequantized value is close to the original:
approx::assert_abs_diff_eq!(original, dequantized, epsilon = U8::<0, 1000>::max_error());
Modules
For quantizing values linearly to a range.
Structs
Quantizes/dequantizes to a value between 0 and Q_MAX stored in type T.
The range for the unquantized value is between MIN and MAX. Values outside of this are clamped.
Contains a quantized value.
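The linear mapping and clamping described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `quantize_linear` and `dequantize_linear` are hypothetical helper names, and rounding to the nearest step is an assumption about how the quantized value is chosen.

```rust
/// Hypothetical helper (not part of numquant): maps `value` from the
/// range [min, max] onto the integer range 0..=q_max.
fn quantize_linear(value: f64, min: f64, max: f64, q_max: u32) -> u32 {
    // Values outside [min, max] are clamped to the nearest bound.
    let clamped = value.clamp(min, max);
    // Normalize to [0, 1], then scale to [0, q_max].
    let normalized = (clamped - min) / (max - min);
    // Assumption: round to the nearest quantization step.
    (normalized * q_max as f64).round() as u32
}

/// Hypothetical helper (not part of numquant): maps a quantized value
/// in 0..=q_max back into the range [min, max].
fn dequantize_linear(q: u32, min: f64, max: f64, q_max: u32) -> f64 {
    min + (q as f64 / q_max as f64) * (max - min)
}
```

With a range of 0 to 1000 and `q_max` set to 255, an input of 250.0 maps to 64 and comes back as roughly 250.98, while inputs below 0 or above 1000 map to 0 and 255 respectively.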
Traits
Trait for quantizing and dequantizing values.
Type Definitions
Quantizes/dequantizes to a value stored in a u8, using the full range of the u8.
The range for the unquantized value is between MIN and MAX. Values outside of this are clamped.
Quantizes/dequantizes to a value stored in a u16, using the full range of the u16.
The range for the unquantized value is between MIN and MAX. Values outside of this are clamped.
Quantizes/dequantizes to a value stored in a u32, using the full range of the u32.
The range for the unquantized value is between MIN and MAX. Values outside of this are clamped.
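The choice of integer width trades storage for resolution. The sketch below estimates the size of one quantization step for a given range, and the worst-case round-trip error if values are rounded to the nearest step. Both `step_size` and `worst_case_rounding_error` are hypothetical helpers, not functions from the crate, and the rounding behaviour is an assumption.

```rust
/// Hypothetical helper (not part of numquant): width of one quantization
/// step when the range [min, max] is spread over 0..=q_max.
fn step_size(min: f64, max: f64, q_max: f64) -> f64 {
    (max - min) / q_max
}

/// Hypothetical helper (not part of numquant): largest round-trip error
/// for in-range values, assuming rounding to the nearest step.
fn worst_case_rounding_error(min: f64, max: f64, q_max: f64) -> f64 {
    step_size(min, max, q_max) / 2.0
}
```

For a range of 0 to 1000, a u8 gives a step of about 3.92, a u16 about 0.0153, and a u32 about 2.3e-7, so each wider type shrinks the error bound accordingly.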