Function quantize_4bit 

pub fn quantize_4bit(values: &[f32]) -> Quantized4Bit

Quantize f32 values to 4 bits using block-wise scaling.

§Arguments

  • values - Input f32 values

§Returns

Quantized4Bit structure with scales and packed 4-bit data
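
To illustrate what block-wise 4-bit quantization can look like, here is a minimal sketch. It is not this crate's implementation: the block size of 32, the symmetric max-abs scaling, the low-nibble-first packing, and the names `SketchQuantized4Bit`, `sketch_quantize_4bit`, and `sketch_dequantize_4bit` are all assumptions made for the example, and the real `Quantized4Bit` layout may differ.

```rust
/// Hypothetical block size; the real crate may use a different value.
const BLOCK_SIZE: usize = 32;

/// Hypothetical stand-in for `Quantized4Bit` (the real fields are not shown on this page).
#[derive(Debug, Clone, PartialEq)]
pub struct SketchQuantized4Bit {
    /// One scale per block of `BLOCK_SIZE` values.
    pub scales: Vec<f32>,
    /// Two 4-bit codes per byte, low nibble first.
    pub packed: Vec<u8>,
    /// Number of original values (needed when the count is odd).
    pub len: usize,
}

/// Sketch of symmetric block-wise quantization to the integer range -7..=7.
pub fn sketch_quantize_4bit(values: &[f32]) -> SketchQuantized4Bit {
    let mut scales = Vec::with_capacity((values.len() + BLOCK_SIZE - 1) / BLOCK_SIZE);
    let mut codes: Vec<u8> = Vec::with_capacity(values.len());

    for block in values.chunks(BLOCK_SIZE) {
        // Scale so that the largest magnitude in the block maps to 7.
        let max_abs = block.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        let scale = if max_abs > 0.0 { max_abs / 7.0 } else { 1.0 };
        scales.push(scale);

        for &v in block {
            let q = (v / scale).round().clamp(-7.0, 7.0) as i8;
            // Offset by 8 so the code fits in an unsigned nibble (1..=15).
            codes.push((q + 8) as u8);
        }
    }

    // Pack two codes per byte; an odd trailing code is padded with 8 (zero).
    let packed = codes
        .chunks(2)
        .map(|pair| pair[0] | (pair.get(1).copied().unwrap_or(8) << 4))
        .collect();

    SketchQuantized4Bit { scales, packed, len: values.len() }
}

/// Inverse of the sketch above: unpack each nibble and multiply by its block scale.
pub fn sketch_dequantize_4bit(q: &SketchQuantized4Bit) -> Vec<f32> {
    (0..q.len)
        .map(|i| {
            let byte = q.packed[i / 2];
            let code = if i % 2 == 0 { byte & 0x0F } else { byte >> 4 };
            (code as i8 - 8) as f32 * q.scales[i / BLOCK_SIZE]
        })
        .collect()
}
```

Under these assumptions, each block costs one f32 scale plus half a byte per value, and the round trip through `sketch_dequantize_4bit` recovers each value to within half a quantization step of its block's scale.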