#[repr(C)]
pub struct BlockQ2K {
pub scales: [u8; 16],
pub qs: [u8; 64],
pub d: f16,
pub dmin: f16,
}
Q2_K super-block: 256 weights quantized to 2 bits each.
Layout (84 bytes):
scales: 16 bytes. Packed 4-bit scale/min pairs for 16 sub-blocks of 16 weights. Each byte holds two 4-bit values: low nibble = scale, high nibble = min.
qs: 64 bytes. 256 x 2-bit quantized weights (4 per byte, LSB first).
d: FP16 super-block scale.
dmin: FP16 super-block minimum.
Dequant: w[i] = d * sub_scale * q[i] - dmin * sub_min
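The bit unpacking and the dequant formula above can be sketched as follows. This is a minimal illustration rather than the crate's implementation: the helper names (unpack_scale_min, unpack_quads, dequant_one) are hypothetical, d and dmin are assumed to be already converted from FP16 to f32, and the mapping from a weight index to its qs byte and scales entry is not covered, since it is not spelled out here.

```rust
/// Split one `scales` byte into (sub_scale, sub_min):
/// low nibble = scale, high nibble = min.
pub fn unpack_scale_min(byte: u8) -> (u8, u8) {
    (byte & 0x0F, byte >> 4)
}

/// Extract the four 2-bit quantized values packed in one `qs` byte,
/// least significant bits first.
pub fn unpack_quads(byte: u8) -> [u8; 4] {
    [byte & 3, (byte >> 2) & 3, (byte >> 4) & 3, (byte >> 6) & 3]
}

/// Apply w = d * sub_scale * q - dmin * sub_min for a single weight.
/// `d` and `dmin` are the super-block scale and minimum as f32 (assumption:
/// the FP16 fields have already been widened by the caller).
pub fn dequant_one(d: f32, dmin: f32, scale_byte: u8, q: u8) -> f32 {
    let (sub_scale, sub_min) = unpack_scale_min(scale_byte);
    d * f32::from(sub_scale) * f32::from(q) - dmin * f32::from(sub_min)
}
```

For example, a scales byte of 0xA3 unpacks to sub_scale = 3 and sub_min = 10, so with d = 0.5, dmin = 0.25 and q = 2 the formula gives 0.5 * 3 * 2 - 0.25 * 10 = 0.5.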
Fields
scales: [u8; 16]
Packed 4-bit scale/min pairs for 16 sub-blocks.
qs: [u8; 64]
256 x 2-bit quantized weights, 4 per byte.
d: f16
Super-block scale (FP16).
dmin: f16
Super-block minimum (FP16).
Implementations
impl BlockQ2K
pub fn dequant(blocks: &[Self], output: &mut [f32]) -> BonsaiResult<()>
Dequantize a slice of Q2_K blocks into f32 output.
output must have length blocks.len() * QK_K.
pub fn quantize(input: &[f32]) -> BonsaiResult<Vec<Self>>
Quantize f32 input into Q2_K blocks.
Input length must be a multiple of QK_K (256).
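To make the idea of 2-bit quantization concrete, here is a simplified sketch of quantizing one sub-block of 16 weights with a plain min/max affine fit. It is an assumption for illustration, not necessarily the algorithm quantize uses: quantize_sub_block is a hypothetical helper, and it stops at the floating-point scale and minimum without the further step of packing them into 4-bit values relative to d and dmin.

```rust
/// Quantize 16 weights to 2-bit codes so that w is approximately
/// scale * q - min, with q in 0..=3 (same sign convention as the
/// dequant formula). Returns (scale, min, codes).
/// Hypothetical helper: a min/max fit, not the crate's actual method.
pub fn quantize_sub_block(w: &[f32; 16]) -> (f32, f32, [u8; 16]) {
    let lo = w.iter().copied().fold(f32::INFINITY, f32::min);
    let hi = w.iter().copied().fold(f32::NEG_INFINITY, f32::max);

    // Four levels (0..=3) span the range [lo, hi] in three equal steps.
    let scale = (hi - lo) / 3.0;
    // The minimum is stored sign-flipped because dequant subtracts it.
    let min = -lo;

    let mut codes = [0u8; 16];
    if scale > 0.0 {
        for (code, &x) in codes.iter_mut().zip(w.iter()) {
            // Round to the nearest level and keep it inside 0..=3.
            *code = ((x - lo) / scale).round().clamp(0.0, 3.0) as u8;
        }
    }
    // If all weights are equal, scale is 0 and every code stays 0;
    // the value is then carried entirely by `min`.
    (scale, min, codes)
}
```

A full super-block would apply this to each of its 16 sub-blocks, which is why the input length has to be a multiple of 256.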
pub fn dequant_row_to_buf(blocks_for_row: &[Self], buf: &mut Vec<f32>)
Dequantize a single row’s worth of Q2_K blocks into a pre-allocated buffer.
buf will be extended by blocks_for_row.len() * 256 elements.
pub fn slice_from_bytes(data: &[u8]) -> BonsaiResult<&[Self]>
Zero-copy cast of a byte slice to a slice of BlockQ2K.
Returns an error if the length is not a multiple of BLOCK_Q2_K_BYTES (84)
or if the pointer is not properly aligned.
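The two checks described above (length and alignment) can be sketched with a stand-in type. Everything named here is hypothetical: RawBlockQ2K mirrors the documented layout but stores the FP16 fields as raw u16 bits, and CastError and cast_blocks are illustrative substitutes for the crate's own error type and function.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for BlockQ2K with the same 84-byte layout;
/// `d` and `dmin` hold raw FP16 bits as u16.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct RawBlockQ2K {
    pub scales: [u8; 16],
    pub qs: [u8; 64],
    pub d: u16,
    pub dmin: u16,
}

#[derive(Debug, PartialEq)]
pub enum CastError {
    BadLength,
    Misaligned,
}

/// Reinterpret `data` as a slice of blocks without copying.
pub fn cast_blocks(data: &[u8]) -> Result<&[RawBlockQ2K], CastError> {
    let block_bytes = size_of::<RawBlockQ2K>();
    if data.len() % block_bytes != 0 {
        return Err(CastError::BadLength);
    }
    if data.is_empty() {
        return Ok(&[]);
    }
    if data.as_ptr() as usize % align_of::<RawBlockQ2K>() != 0 {
        return Err(CastError::Misaligned);
    }
    // SAFETY: the pointer is non-null and aligned, the length is an exact
    // multiple of the block size, the struct has no padding, and every bit
    // pattern is valid for its integer fields. The returned slice borrows
    // from `data`, so it cannot outlive the bytes.
    Ok(unsafe {
        std::slice::from_raw_parts(
            data.as_ptr().cast::<RawBlockQ2K>(),
            data.len() / block_bytes,
        )
    })
}

/// Test helper: a byte buffer guaranteed to start on a 2-byte boundary.
#[repr(align(2))]
pub struct Aligned(pub [u8; 170]);
```

Because the block contains 2-byte fields, its alignment is 2, so a byte slice that starts at an odd address is rejected even when its length is valid. Callers reading from a file or a memory map may need to make sure the tensor data begins at a suitably aligned offset.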
Trait Implementations
impl Copy for BlockQ2K
impl StructuralPartialEq for BlockQ2K
Auto Trait Implementations
impl Freeze for BlockQ2K
impl RefUnwindSafe for BlockQ2K
impl Send for BlockQ2K
impl Sync for BlockQ2K
impl Unpin for BlockQ2K
impl UnsafeUnpin for BlockQ2K
impl UnwindSafe for BlockQ2K
Blanket Implementations
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.