Lossy conversion from floating point to a smaller integer type with a fixed range.

Quantize an f64 to a byte and back again:

use numquant::{Quantize, Quantized, U8};
let original = 500.0;
// Quantize the value into a byte.
// Quantization supports inputs between 0 and 1000.
let quantized = Quantized::<U8<0, 1000>>::from_f64(original);
// Convert it back to an f64
let dequantized = quantized.to_f64();
// The conversion isn't lossless, but the dequantized value is close to the original:
approx::assert_abs_diff_eq!(original, dequantized, epsilon = U8::<0, 1000>::max_error());

Modules

For quantizing values linearly to a range.

Structs

Quantizes/dequantizes to a value between 0 and Q_MAX stored in type T. The unquantized value ranges from MIN to MAX. Values outside this range are clamped.

Quantized: Contains a quantized value.
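
The clamped linear mapping described above can be sketched in standalone Rust. This is an illustration only, not the crate's implementation: the free functions `quantize` and `dequantize` are hypothetical names, the bounds are passed as runtime arguments rather than const parameters, and rounding to the nearest step is an assumption.

```rust
/// Map `value` from the range [min, max] onto an integer in 0..=q_max.
/// Inputs outside the range are clamped to the nearest bound first.
/// Rounding to the nearest step is an assumption of this sketch.
fn quantize(value: f64, min: f64, max: f64, q_max: u32) -> u32 {
    let clamped = value.clamp(min, max);
    let scaled = (clamped - min) / (max - min) * q_max as f64;
    scaled.round() as u32
}

/// Map a quantized integer in 0..=q_max back into the range [min, max].
fn dequantize(q: u32, min: f64, max: f64, q_max: u32) -> f64 {
    min + q as f64 / q_max as f64 * (max - min)
}

fn main() {
    // Same range as the byte example: 0 to 1000, stored in 0..=255.
    let (min, max, q_max) = (0.0, 1000.0, 255);
    for value in [-50.0, 0.0, 250.0, 1000.0, 1234.5] {
        let q = quantize(value, min, max, q_max);
        println!("{} -> {} -> {}", value, q, dequantize(q, min, max, q_max));
    }
}
```

In this sketch, -50.0 and 1234.5 land on the same quantized values as 0.0 and 1000.0 respectively, which is what clamping means for the round trip: out-of-range inputs cannot be recovered.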

Traits

Quantize: Trait for quantizing and dequantizing values.

Type Definitions

U8: Quantizes/dequantizes to a value stored in a u8, using the full range of the u8. The unquantized value ranges from MIN to MAX. Values outside this range are clamped.

U16: Quantizes/dequantizes to a value stored in a u16, using the full range of the u16. The unquantized value ranges from MIN to MAX. Values outside this range are clamped.

U32: Quantizes/dequantizes to a value stored in a u32, using the full range of the u32. The unquantized value ranges from MIN to MAX. Values outside this range are clamped.
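
To get a feel for the trade-off between the three widths, the following standalone sketch computes the step size for the range 0 to 1000 at each width. The helper `step_size` is hypothetical and not part of the crate, and the "half a step" worst-case error assumes round-to-nearest quantization, which may not match what the crate's `max_error()` reports.

```rust
/// Width of one quantization step when [min, max] is divided into `q_max` steps.
fn step_size(min: f64, max: f64, q_max: f64) -> f64 {
    (max - min) / q_max
}

fn main() {
    let (min, max) = (0.0, 1000.0);
    // Each storage type uses its full range, so q_max is the type's MAX value.
    let widths = [
        ("u8", u8::MAX as f64),
        ("u16", u16::MAX as f64),
        ("u32", u32::MAX as f64),
    ];
    for (name, q_max) in widths {
        let step = step_size(min, max, q_max);
        // With round-to-nearest (an assumption), the worst case is half a step.
        println!("{}: step = {:e}, worst-case error = {:e}", name, step, step / 2.0);
    }
}
```

A wider storage type buys a proportionally finer step over the same range, at the cost of more bytes per value.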