pub fn quantize_unorm(v: f32, n: i32) -> i32

Quantizes a float in the [0, 1] range into an n-bit fixed-point unorm value.

Assumes the value is reconstructed as q / (2^n - 1), which is the case for fixed-function normalized fixed-point conversion.

Maximum reconstruction error: 1/2^(n+1)
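
The description above can be illustrated with a minimal sketch. It is not the crate's source: the names `quantize_unorm_sketch` and `dequantize_unorm_sketch` are hypothetical, and the clamping of out-of-range input and the round-to-nearest step are assumptions about how such a quantizer is typically written.

```rust
/// Hypothetical illustration of n-bit unorm quantization.
/// Assumes 1 <= n <= 24, so that 2^n - 1 is exactly representable as f32.
pub fn quantize_unorm_sketch(v: f32, n: i32) -> i32 {
    // Largest n-bit code, 2^n - 1; it maps back to 1.0 on reconstruction.
    let scale = ((1i32 << n) - 1) as f32;
    // Assumption: out-of-range input is clamped to [0, 1].
    let v = v.max(0.0).min(1.0);
    // Adding 0.5 before the truncating cast rounds to the nearest code.
    (v * scale + 0.5) as i32
}

/// Reconstruction function q / (2^n - 1) named in the description.
pub fn dequantize_unorm_sketch(q: i32, n: i32) -> f32 {
    let scale = ((1i32 << n) - 1) as f32;
    q as f32 / scale
}
```

With n = 8, an input of 0.5 maps to code 128, which reconstructs to 128/255, roughly 0.502. Rounding to the nearest code keeps the round-trip error within half a quantization step.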