Function quantize_matrix_cache_friendly 

pub fn quantize_matrix_cache_friendly(
    matrix: &[f32],
    rows: usize,
    cols: usize,
    scale: f32,
    zero_point: i32,
    output: &mut [f32],
    cache_params: &CacheAwareParams,
) -> Result<()>

Cache-friendly matrix quantization using tiling for 2D tensors
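The description does not spell out the quantization formula or how tiles are chosen, so the following is only a minimal sketch of what tiled quantization of a 2D tensor might look like, not this crate's implementation. It assumes a row-major layout, an affine mapping `round(x / scale) + zero_point`, a signed 8-bit clamp range, and quantized values stored as `f32` (suggested by the `&mut [f32]` output, but not confirmed). `TileShape` and `quantize_tiled_sketch` are hypothetical names; the fields of the real `CacheAwareParams` and the crate's `Result` alias are not shown here, so plain tile sizes and `Result<(), String>` stand in for them.

```rust
/// Hypothetical stand-in for the tile dimensions that `CacheAwareParams`
/// presumably carries; the real struct's fields are not documented here.
pub struct TileShape {
    pub tile_rows: usize,
    pub tile_cols: usize,
}

/// Sketch of tiled affine quantization over a row-major `rows x cols` matrix.
/// Assumptions (not from the source): q = round(x / scale) + zero_point,
/// clamped to the i8 range and written back as f32.
pub fn quantize_tiled_sketch(
    matrix: &[f32],
    rows: usize,
    cols: usize,
    scale: f32,
    zero_point: i32,
    output: &mut [f32],
    tile: &TileShape,
) -> Result<(), String> {
    let len = rows
        .checked_mul(cols)
        .ok_or_else(|| "rows * cols overflows usize".to_string())?;
    if matrix.len() != len || output.len() != len {
        return Err(format!(
            "expected {} elements, got input {} and output {}",
            len,
            matrix.len(),
            output.len()
        ));
    }
    if !scale.is_finite() || scale <= 0.0 {
        return Err("scale must be finite and positive".to_string());
    }
    if tile.tile_rows == 0 || tile.tile_cols == 0 {
        return Err("tile dimensions must be non-zero".to_string());
    }

    let zp = zero_point as f32;
    // Visit the matrix one tile at a time so that each block of rows and
    // columns is fully processed while it is likely still in cache.
    for row_start in (0..rows).step_by(tile.tile_rows) {
        let row_end = row_start.saturating_add(tile.tile_rows).min(rows);
        for col_start in (0..cols).step_by(tile.tile_cols) {
            let col_end = col_start.saturating_add(tile.tile_cols).min(cols);
            for r in row_start..row_end {
                let base = r * cols;
                let src = &matrix[base + col_start..base + col_end];
                let dst = &mut output[base + col_start..base + col_end];
                for (d, &s) in dst.iter_mut().zip(src) {
                    *d = ((s / scale).round() + zp).clamp(-128.0, 127.0);
                }
            }
        }
    }
    Ok(())
}
```

Because quantization here is purely element-wise, the tile shape changes only the traversal order, not the result; any benefit from tiling would come from how the pass interacts with surrounding work on the same data, and would need to be measured.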