pub struct Nf4Quantizer {
    pub block_size: usize,
}
NF4 quantizer — encodes tensors to packed 4-bit NF4 codes.
Blocks of block_size elements share an absmax scaling factor.
The default block_size of 64 matches the QLoRA paper.
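To make the block-wise scaling concrete, here is a minimal sketch of computing one absmax value per block. The function name `block_absmax` is hypothetical and is not part of this crate's documented API; it only illustrates the idea described above.

```rust
/// Compute one absmax scale per block of `block_size` elements.
/// Hypothetical helper for illustration; not part of the documented API.
fn block_absmax(tensor: &[f32], block_size: usize) -> Vec<f32> {
    // `chunks` panics on a zero chunk size, so reject it explicitly.
    assert!(block_size > 0, "block_size must be non-zero");
    tensor
        .chunks(block_size)
        // The scale of a block is the largest absolute value it contains.
        .map(|block| block.iter().fold(0.0_f32, |m, x| m.max(x.abs())))
        .collect()
}
```

With a block_size of 64 and f32 scales, the stored absmax values add 32 / 64 = 0.5 bits per element on top of the 4-bit codes.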
Fields
block_size: usize
Number of elements per absmax scaling block.
Implementations
impl Nf4Quantizer
pub fn encode(&self, tensor: &[f32]) -> QuantResult<(Vec<u8>, Vec<f32>)>
Encode a flat tensor to packed NF4 bytes and per-block absmax values.
The number of elements must be a multiple of block_size, and
block_size must be even (pairs of nibbles pack into bytes).
Returns (packed_bytes, absmax_per_block).
Errors
QuantError::GroupSizeMismatch: if the tensor length (len) is not divisible by block_size.
QuantError::EmptyInput: if tensor is empty.
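The encoding steps described above (per-block absmax, normalization, nearest-level lookup, nibble packing) might look roughly like the sketch below. Everything here is an assumption made for illustration: `encode_sketch` and `nearest_level` are hypothetical names, the level table is the set of NF4 values commonly published with the QLoRA reference implementation (this crate's table may differ), the first element of each pair is assumed to go in the high nibble, and `None` stands in for the documented QuantError variants.

```rust
/// The 16 NF4 levels as commonly published with the QLoRA reference
/// implementation. Assumption: the crate's own table may differ.
const NF4_LEVELS: [f32; 16] = [
    -1.0, -0.6961928, -0.52507305, -0.39491749, -0.28444138, -0.18477343,
    -0.09105004, 0.0, 0.0795803, 0.1609302, 0.2461123, 0.33791524,
    0.44070983, 0.562617, 0.7229568, 1.0,
];

/// Index of the NF4 level closest to `x` (expects `x` in [-1, 1]).
fn nearest_level(x: f32) -> u8 {
    let mut best = 0usize;
    for (i, level) in NF4_LEVELS.iter().enumerate() {
        if (x - *level).abs() < (x - NF4_LEVELS[best]).abs() {
            best = i;
        }
    }
    best as u8
}

/// Hypothetical stand-in for `encode`: returns (packed_bytes, absmax_per_block),
/// or `None` where the documented method would return an error.
fn encode_sketch(tensor: &[f32], block_size: usize) -> Option<(Vec<u8>, Vec<f32>)> {
    if tensor.is_empty()
        || block_size == 0
        || block_size % 2 != 0
        || tensor.len() % block_size != 0
    {
        return None;
    }
    let mut packed = Vec::with_capacity(tensor.len() / 2);
    let mut absmaxs = Vec::with_capacity(tensor.len() / block_size);
    for block in tensor.chunks(block_size) {
        let absmax = block.iter().fold(0.0_f32, |m, x| m.max(x.abs()));
        absmaxs.push(absmax);
        // Avoid dividing by zero for an all-zero block.
        let scale = if absmax > 0.0 { 1.0 / absmax } else { 0.0 };
        for pair in block.chunks(2) {
            let hi = nearest_level(pair[0] * scale);
            let lo = nearest_level(pair[1] * scale);
            // Assumed layout: first element of the pair in the high nibble.
            packed.push((hi << 4) | lo);
        }
    }
    Some((packed, absmaxs))
}
```

Because block_size is even and the length is a multiple of it, every inner `chunks(2)` pair has exactly two elements, which is why the evenness requirement exists.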
pub fn decode(&self, packed: &[u8], absmaxs: &[f32]) -> QuantResult<Vec<f32>>
Decode packed NF4 bytes back to f32 using the stored absmax values.
Errors
QuantError::DimensionMismatch: if the packed and absmax lengths are inconsistent.
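As a rough illustration of decoding, the sketch below unpacks two codes per byte, looks each one up in a level table, and rescales by the block's absmax. It rests on the same assumptions as any standalone illustration of this API: `decode_sketch` is a hypothetical name, the level table is the commonly published NF4 set rather than this crate's confirmed table, the high nibble is assumed to hold the first element, and `None` stands in for QuantError::DimensionMismatch. The block size is passed explicitly here, whereas the documented method reads it from `self`.

```rust
/// The 16 NF4 levels as commonly published with the QLoRA reference
/// implementation. Assumption: the crate's own table may differ.
const NF4_LEVELS: [f32; 16] = [
    -1.0, -0.6961928, -0.52507305, -0.39491749, -0.28444138, -0.18477343,
    -0.09105004, 0.0, 0.0795803, 0.1609302, 0.2461123, 0.33791524,
    0.44070983, 0.562617, 0.7229568, 1.0,
];

/// Hypothetical stand-in for `decode`: rebuilds f32 values from packed
/// 4-bit codes and one absmax per block.
fn decode_sketch(packed: &[u8], absmaxs: &[f32], block_size: usize) -> Option<Vec<f32>> {
    if block_size == 0 || block_size % 2 != 0 {
        return None;
    }
    // Each byte carries two codes, so a block occupies block_size / 2 bytes.
    let bytes_per_block = block_size / 2;
    if packed.len() != absmaxs.len() * bytes_per_block {
        return None; // lengths are inconsistent
    }
    let mut out = Vec::with_capacity(packed.len() * 2);
    for (block, &absmax) in packed.chunks(bytes_per_block).zip(absmaxs) {
        for &byte in block {
            // Assumed layout: high nibble first, then low nibble.
            out.push(NF4_LEVELS[(byte >> 4) as usize] * absmax);
            out.push(NF4_LEVELS[(byte & 0x0F) as usize] * absmax);
        }
    }
    Some(out)
}
```

The length check mirrors the documented error condition: with one absmax per block, the packed buffer must hold exactly block_size / 2 bytes for each absmax value.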
pub fn quantization_mse(&self, tensor: &[f32]) -> QuantResult<f32>
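This page gives no description for quantization_mse. Going only by the name and signature, it plausibly reports the mean squared error between a tensor and its encode/decode round trip; that reading is an assumption, not something the documentation states. Under that assumption, the error metric itself could be sketched as follows, with `mse` being a hypothetical helper name:

```rust
/// Mean squared error between an original slice and its reconstruction.
/// Hypothetical helper; returns `None` for empty or mismatched inputs.
fn mse(original: &[f32], reconstructed: &[f32]) -> Option<f32> {
    if original.is_empty() || original.len() != reconstructed.len() {
        return None;
    }
    let sum: f32 = original
        .iter()
        .zip(reconstructed)
        // Squared difference of each pair of corresponding elements.
        .map(|(a, b)| (a - b) * (a - b))
        .sum();
    Some(sum / original.len() as f32)
}
```

In that reading, the reconstruction would be the result of passing the tensor through encode and then decode with the same block_size.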
Trait Implementations
impl Clone for Nf4Quantizer
fn clone(&self) -> Nf4Quantizer
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for Nf4Quantizer
Auto Trait Implementations
impl Freeze for Nf4Quantizer
impl RefUnwindSafe for Nf4Quantizer
impl Send for Nf4Quantizer
impl Sync for Nf4Quantizer
impl Unpin for Nf4Quantizer
impl UnsafeUnpin for Nf4Quantizer
impl UnwindSafe for Nf4Quantizer
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.