pub struct QuantizedWeight { /* private fields */ }
A quantized weight tensor loaded into Metal GPU buffers.
Tracks the tensor name, logical shape, original dtype, quantization parameters, and the Metal buffers holding the packed data, scales, and optional biases.
§Layout
- packed_data: Packed quantized integers (e.g. 4-bit values packed 8-per-uint32, or 6-bit values packed 4-per-uint32).
- scales: Per-group scale factors as f16 values.
- biases: Per-group biases as f16 values (present for affine quant).
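The 4-bit case of the packing described above can be sketched as follows. This is an illustration only, not the crate's implementation: `pack_4bit` and `unpack_4bit` are hypothetical helpers, and placing the lowest-index value in the lowest nibble is an assumption, since the layout notes do not state the order within a word.

```rust
/// Pack eight 4-bit values into one u32.
/// ASSUMPTION: value 0 occupies the lowest nibble; the real kernel may differ.
fn pack_4bit(values: [u8; 8]) -> u32 {
    values
        .iter()
        .enumerate()
        // Mask each value to 4 bits, then shift it into its nibble slot.
        .fold(0u32, |acc, (i, &v)| acc | ((u32::from(v) & 0xF) << (4 * i)))
}

/// Inverse of `pack_4bit`: recover the eight 4-bit values from one word.
fn unpack_4bit(word: u32) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = ((word >> (4 * i)) & 0xF) as u8;
    }
    out
}
```

Under that ordering, `pack_4bit([1, 2, 3, 4, 5, 6, 7, 8])` yields `0x8765_4321`, and `unpack_4bit` returns the original eight values.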
Implementations§
impl QuantizedWeight
pub fn new(
    tensor_name: String,
    shape: Vec<usize>,
    dtype: DType,
    bits: u8,
    group_size: usize,
    scales: MlxBuffer,
    biases: Option<MlxBuffer>,
    packed_data: MlxBuffer,
) -> Self
Construct a new QuantizedWeight with all fields specified.
This is the primary constructor used by load_quantized_weights.
It does not validate buffer sizes; the caller is responsible for
ensuring the buffers match the declared shape, bits, and group_size.
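Because the constructor performs no validation, a caller may want to compute the sizes it expects before building the buffers. The sketch below is one way to do that under stated assumptions, not part of this crate: `ExpectedSizes` and `expected_sizes` are hypothetical names, each row of the last dimension is assumed to be packed and grouped independently, packed words are assumed to be 4-byte `uint32` values, and scales are assumed to be 2-byte f16 values. Only the 4-bit and 6-bit packing ratios given in the Layout section are handled.

```rust
/// Byte sizes a caller might expect for the packed data and scales buffers.
/// Hypothetical helper type; not part of the documented API.
struct ExpectedSizes {
    packed_bytes: usize,
    scales_bytes: usize,
}

/// Returns None for an empty shape, a zero group size, or an unhandled bit width.
fn expected_sizes(shape: &[usize], bits: u8, group_size: usize) -> Option<ExpectedSizes> {
    let last_dim = *shape.last()?;
    if group_size == 0 {
        return None;
    }
    // Packing ratios taken from the Layout section: 8 or 4 values per uint32.
    let values_per_word = match bits {
        4 => 8,
        6 => 4,
        _ => return None,
    };
    // ASSUMPTION: all leading dimensions act as independent rows.
    let rows: usize = shape[..shape.len() - 1].iter().product();
    let words_per_row = last_dim.div_ceil(values_per_word);
    let groups_per_row = last_dim.div_ceil(group_size);
    Some(ExpectedSizes {
        packed_bytes: rows * words_per_row * 4, // 4 bytes per uint32 word
        scales_bytes: rows * groups_per_row * 2, // 2 bytes per f16 scale
    })
}
```

For a `[2, 64]` tensor at 4 bits with a group size of 32, this gives 64 bytes of packed data and 8 bytes of scales. If biases are present, they would presumably match the scales size, though the source does not say so explicitly.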
pub fn tensor_name(&self) -> &str
Full tensor name path.
pub fn group_size(&self) -> usize
Quantization group size.
pub fn packed_data(&self) -> &MlxBuffer
Borrow the packed quantized data buffer.
pub fn element_count(&self) -> usize
Number of logical elements in the weight tensor (product of shape dims).
pub fn num_groups(&self) -> usize
Number of quantization groups along the last dimension.
This is ceil(last_dim / group_size).
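The two derived quantities above, the shape product and ceil(last_dim / group_size), can be reproduced with free functions for quick sanity checks. This is a minimal sketch assuming the shape is available as a slice; the function names mirror the methods but are standalone stand-ins, and returning 0 groups for an empty shape is an assumption rather than documented behavior.

```rust
/// Product of all shape dimensions (an empty shape yields 1, the empty product).
fn element_count(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// ceil(last_dim / group_size) using integer arithmetic.
/// Panics if `group_size` is 0, as integer division by zero does.
/// ASSUMPTION: an empty shape is treated as having a last dimension of 0.
fn num_groups(shape: &[usize], group_size: usize) -> usize {
    let last_dim = shape.last().copied().unwrap_or(0);
    last_dim.div_ceil(group_size)
}
```

For a `[4096, 100]` tensor with a group size of 32, `num_groups` gives 4 because the final partial group of 4 elements still needs its own scale, and `element_count` gives 409600.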
Trait Implementations§
Auto Trait Implementations§
impl Freeze for QuantizedWeight
impl RefUnwindSafe for QuantizedWeight
impl Send for QuantizedWeight
impl Sync for QuantizedWeight
impl Unpin for QuantizedWeight
impl UnsafeUnpin for QuantizedWeight
impl UnwindSafe for QuantizedWeight
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.