pub struct QuantizedWeights {
    pub name: String,
    pub weights_i8: Vec<i32>,
    pub shape: Vec<usize>,
    pub scale: f32,
    pub bias: Option<Vec<f32>>,
}
INT8 quantised weights for a single Linear layer.
Fields§
name: String
Layer name.
weights_i8: Vec<i32>
Quantised weight values in row-major order (i8 serialised as i32 for serde compatibility).
shape: Vec<usize>
Weight tensor shape [out_features, in_features].
scale: f32
Per-tensor scale factor.
bias: Option<Vec<f32>>
Bias (kept in FP32).
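The row-major layout and the [out_features, in_features] shape together determine where each weight sits in the flat vector. The sketch below illustrates that indexing. `weight_at` is a hypothetical helper written for this example, not part of the documented API.

```rust
/// Hypothetical helper (not part of `QuantizedWeights`): look up one weight
/// in a flat, row-major buffer with shape `[out_features, in_features]`.
fn weight_at(weights: &[i32], shape: &[usize], out_idx: usize, in_idx: usize) -> Option<i32> {
    // Only a two-dimensional shape is meaningful for a Linear layer.
    let (out_features, in_features) = match shape {
        [o, i] => (*o, *i),
        _ => return None,
    };
    if out_idx >= out_features || in_idx >= in_features {
        return None;
    }
    // Row-major: each output row occupies `in_features` consecutive entries.
    weights.get(out_idx * in_features + in_idx).copied()
}
```

For a 2 x 3 layer stored as `[1, 2, 3, 4, 5, 6]`, row 1 starts at flat index 3, so `weight_at(&w, &[2, 3], 1, 0)` yields `Some(4)`.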
Implementations§
impl QuantizedWeights
pub fn from_f32(
    name: String,
    w: &[f32],
    shape: Vec<usize>,
    bias: Option<Vec<f32>>,
    scale: Option<f32>,
) -> Self
Quantise an FP32 weight matrix.
§Arguments
name – Layer name.
w – FP32 weights in row-major layout.
shape – [out_features, in_features].
bias – Optional FP32 bias vector.
scale – If None, computed from the max absolute weight.
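The page does not show how the scale is derived or how values are rounded. A common choice is symmetric per-tensor quantisation, sketched below under that assumption. `quantize_symmetric`, the divisor 127 and the clamp range are illustrative guesses, not the crate's confirmed behaviour.

```rust
/// Hypothetical helper: symmetric per-tensor INT8 quantisation.
/// Returns the quantised values (held as i32, mirroring `weights_i8`)
/// together with the scale that was used.
fn quantize_symmetric(w: &[f32], scale: Option<f32>) -> (Vec<i32>, f32) {
    // Assumption: with no explicit scale, map the largest |w| onto 127.
    let scale = scale.unwrap_or_else(|| {
        let max_abs = w.iter().fold(0.0_f32, |m, &x| m.max(x.abs()));
        // Fall back to 1.0 for an all-zero tensor to avoid dividing by zero.
        if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 }
    });
    let q = w
        .iter()
        // Round to the nearest integer step, then clamp into the INT8 range.
        .map(|&x| (x / scale).round().clamp(-127.0, 127.0) as i32)
        .collect();
    (q, scale)
}
```

Passing `Some(scale)` skips the max-absolute computation, which is useful when a scale has already been calibrated elsewhere. Values that fall outside the representable range are clamped rather than wrapped.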
pub fn dequantize(&self) -> Vec<f32>
Dequantise to FP32.
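With a single per-tensor scale, dequantisation is usually one multiplication per element. The sketch below assumes that scheme. `dequantize_symmetric` is a hypothetical free function standing in for the method, whose actual body is not shown on this page.

```rust
/// Hypothetical helper: recover approximate FP32 weights from quantised
/// integer values and a per-tensor scale (assumes no zero-point offset).
fn dequantize_symmetric(q: &[i32], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}
```

The result is an approximation of the original weights: under this scheme each value can differ from its source by up to half a quantisation step.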
pub fn size_bytes(&self) -> usize
Memory used by the quantised weights in bytes.
pub fn original_size_bytes(&self) -> usize
Memory the original FP32 weights would have used.
pub fn compression_ratio(&self) -> f32
Compression ratio vs. FP32.
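The page does not say whether these sizes include the scale and bias, or whether the quantised size reflects the logical one byte per INT8 weight or the in-memory `Vec<i32>`. The sketch below assumes the simplest accounting: weights only, one byte per quantised value and four bytes per FP32 value. All three function names are hypothetical.

```rust
/// Hypothetical accounting: bytes for `n` weights stored as logical INT8.
fn int8_size_bytes(n: usize) -> usize {
    n * std::mem::size_of::<i8>()
}

/// Hypothetical accounting: bytes for the same `n` weights in FP32.
fn fp32_size_bytes(n: usize) -> usize {
    n * std::mem::size_of::<f32>()
}

/// FP32 size divided by INT8 size under the assumptions above.
fn compression_ratio(n: usize) -> f32 {
    // Guard chosen for this sketch only; the real method may handle an
    // empty tensor differently.
    if n == 0 {
        return 1.0;
    }
    fp32_size_bytes(n) as f32 / int8_size_bytes(n) as f32
}
```

Under this accounting the ratio is 4.0 for any non-empty tensor. A ratio that also counted the FP32 bias and the scale would come out slightly lower.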
Trait Implementations§
impl Clone for QuantizedWeights
fn clone(&self) -> QuantizedWeights
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for QuantizedWeights
impl<'de> Deserialize<'de> for QuantizedWeights
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations§
impl Freeze for QuantizedWeights
impl RefUnwindSafe for QuantizedWeights
impl Send for QuantizedWeights
impl Sync for QuantizedWeights
impl Unpin for QuantizedWeights
impl UnsafeUnpin for QuantizedWeights
impl UnwindSafe for QuantizedWeights
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.