Crate numquant

Lossy conversion of floating-point values within a fixed range to a smaller integer type.

§Quantize an f64 to a byte and back again

use numquant::{Quantize, Quantized, U8};
let original = 500.0;
// Quantize the value into a byte.
// Quantization supports inputs between 0 and 1000.
let quantized = Quantized::<U8<0, 1000>>::from_f64(original);
// Convert it back to an f64
let dequantized = quantized.to_f64();
// The conversion isn't lossless, but the dequantized value is close to the original:
approx::assert_abs_diff_eq!(original, dequantized, epsilon = U8::<0, 1000>::max_error());
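
The arithmetic behind a round trip like the one above can be sketched without the crate. This is only an illustration of linear quantization under stated assumptions (round to nearest, `min < max`); the helper names `quantize_u8`, `dequantize_u8`, and `max_error_u8` are hypothetical and are not part of numquant, whose implementation may differ.

```rust
/// Hypothetical helper (not part of numquant): linearly map `value` from
/// [min, max] onto 0..=255. Inputs outside the range are clamped first.
/// Assumes `min < max`.
fn quantize_u8(value: f64, min: f64, max: f64) -> u8 {
    let clamped = value.clamp(min, max);
    // Position of the value inside the range, between 0.0 and 1.0.
    let normalized = (clamped - min) / (max - min);
    (normalized * f64::from(u8::MAX)).round() as u8
}

/// Hypothetical helper: map a byte back into [min, max].
fn dequantize_u8(quantized: u8, min: f64, max: f64) -> f64 {
    min + f64::from(quantized) / f64::from(u8::MAX) * (max - min)
}

/// Hypothetical helper: with round-to-nearest, the worst-case round-trip
/// error is half of one quantization step.
fn max_error_u8(min: f64, max: f64) -> f64 {
    (max - min) / f64::from(u8::MAX) / 2.0
}
```

Under these assumptions, `quantize_u8(250.0, 0.0, 1000.0)` yields 64, and dequantizing 64 gives roughly 250.98, which is within the bound of about 1.96 returned by `max_error_u8(0.0, 1000.0)`.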

Modules§

linear
For quantizing values linearly to a range.

Structs§

IntRange
Quantizes/dequantizes to a value between 0 and Q_MAX stored in type T. The unquantized value ranges from MIN to MAX; values outside this range are clamped.
Quantized
Contains a quantized value.
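
To make the roles of Q_MAX, MIN, and MAX concrete, here is a minimal sketch of a range type driven by const generics. `SketchRange` is a hypothetical stand-in written for this illustration; its parameter order, integer types, and rounding behaviour are assumptions and may not match numquant's `IntRange`.

```rust
/// Hypothetical stand-in for illustration only; not numquant's `IntRange`.
/// Quantized values lie in 0..=Q_MAX, unquantized values in [MIN, MAX].
struct SketchRange<const Q_MAX: u64, const MIN: i64, const MAX: i64>;

impl<const Q_MAX: u64, const MIN: i64, const MAX: i64> SketchRange<Q_MAX, MIN, MAX> {
    /// Map `value` to an integer in 0..=Q_MAX.
    /// Inputs below MIN or above MAX are clamped to the nearest bound.
    fn quantize(value: f64) -> u64 {
        let (min, max) = (MIN as f64, MAX as f64);
        let normalized = (value.clamp(min, max) - min) / (max - min);
        (normalized * Q_MAX as f64).round() as u64
    }

    /// Map a quantized integer back into [MIN, MAX].
    /// Integers above Q_MAX are treated as Q_MAX.
    fn dequantize(quantized: u64) -> f64 {
        let (min, max) = (MIN as f64, MAX as f64);
        min + (quantized.min(Q_MAX) as f64 / Q_MAX as f64) * (max - min)
    }
}
```

With `SketchRange::<100, 0, 1000>`, an input of 500.0 quantizes to 50, while -5.0 and 2000.0 are clamped and quantize to 0 and 100 respectively.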

Traits§

Quantize
Trait for quantizing and dequantizing values.

Type Aliases§

U8
Quantizes/dequantizes to a value stored in a u8, using the full range of the u8. The unquantized value ranges from MIN to MAX; values outside this range are clamped.
U16
Quantizes/dequantizes to a value stored in a u16, using the full range of the u16. The unquantized value ranges from MIN to MAX; values outside this range are clamped.
U32
Quantizes/dequantizes to a value stored in a u32, using the full range of the u32. The unquantized value ranges from MIN to MAX; values outside this range are clamped.
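
The choice between the three widths is a trade-off between storage size and precision. The following sketch estimates the worst-case round-trip error for a given bit width, assuming a linear mapping over the full integer range with round-to-nearest. `max_error_for_bits` is a hypothetical helper, not a numquant function.

```rust
/// Hypothetical helper (not part of numquant): worst-case round-trip error
/// of a linear quantizer that uses every value of a `bits`-bit unsigned
/// integer over [min, max], assuming round-to-nearest.
/// Valid for `bits` between 1 and 63.
fn max_error_for_bits(bits: u32, min: f64, max: f64) -> f64 {
    // Largest representable integer, e.g. 255 for 8 bits.
    let q_max = (1u64 << bits) - 1;
    // One step is (max - min) / q_max; the worst case is half a step.
    (max - min) / q_max as f64 / 2.0
}
```

For the range 0 to 1000, this gives about 1.96 for 8 bits, about 0.0076 for 16 bits, and about 1.2e-7 for 32 bits.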