pub struct QuantizedLinear { /* private fields */ }
Quantized Linear Layer
Performs matrix multiplication in INT8: output = (input @ weights^T + bias) * scale
Implementations§
impl QuantizedLinear
pub fn from_fp32(linear: &Linear, input_scale: f32) -> Self
Creates a QuantizedLinear from an FP32 Linear layer.
§Arguments
linear - FP32 linear layer to quantize
input_scale - Expected input activation scale
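The crate's quantization scheme is not shown on this page; a common choice for a `from_fp32`-style conversion is symmetric per-tensor quantization of the weights into the i8 range. The sketch below is an assumption of that scheme (the function name and the use of a single per-tensor scale are illustrative, not the crate's actual internals):

```rust
/// Symmetric per-tensor quantization sketch: maps FP32 weights to i8
/// so that the largest-magnitude weight lands on ±127.
fn quantize_weights(weights: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = weights.iter().fold(0.0f32, |m, &w| m.max(w.abs()));
    // Guard against an all-zero weight tensor.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = weights
        .iter()
        .map(|&w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

fn main() {
    // -1.0 has the largest magnitude, so it maps to -127.
    let (q, scale) = quantize_weights(&[0.5, -1.0, 0.25]);
    println!("{q:?} scale={scale}");
}
```

Dequantizing a stored i8 weight is then just `w_q as f32 * scale`, which is what makes the `* scale` term in the layer's output formula work.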
pub fn forward_int8(
    &self,
    input: &[u8],
    batch_size: usize,
    input_scale: f32,
    input_zero_point: u8,
) -> CnnResult<Tensor>
Forward pass with INT8 computation
§Arguments
input - Quantized u8 input tensor [batch, in_features]
batch_size - Batch size
input_scale - Input quantization scale
input_zero_point - Input quantization zero point
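A minimal sketch of what an INT8 forward pass with a zero point typically computes for one output feature, following the formula above (`output = (input @ weights^T + bias) * scale` with `scale = input_scale * weight_scale`). All names here are assumptions for illustration; the real layer operates on whole `[batch, in_features]` tensors and stores its own quantized weights:

```rust
/// Illustrative single-output INT8 linear step (not the crate's code).
/// `weights` and `bias_q` stand in for the layer's stored quantized
/// parameters; `weight_scale` is an assumed per-tensor scale.
fn forward_one(
    input: &[u8],        // quantized activations for one sample
    input_scale: f32,
    input_zero_point: u8,
    weights: &[i8],      // quantized weights for one output feature
    weight_scale: f32,
    bias_q: i32,         // bias pre-quantized into the accumulator domain
) -> f32 {
    // Accumulate in i32 so the u8 × i8 products cannot overflow.
    let acc: i32 = input
        .iter()
        .zip(weights)
        .map(|(&x, &w)| (x as i32 - input_zero_point as i32) * w as i32)
        .sum();
    // (input @ weights^T + bias) * scale
    (acc + bias_q) as f32 * input_scale * weight_scale
}

fn main() {
    // Centered inputs are [2, -2]; dot with [3, 1] gives 4, plus bias 5,
    // rescaled by 0.5 * 0.1 → approximately 0.45.
    let y = forward_one(&[130, 126], 0.5, 128, &[3, 1], 0.1, 5);
    println!("{y}");
}
```

Subtracting `input_zero_point` before the multiply is what lets an unsigned `u8` activation represent signed values; skipping it would silently shift every output.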
pub fn in_features(&self) -> usize
Returns the number of input features.
pub fn out_features(&self) -> usize
Returns the number of output features.
Trait Implementations§
impl Clone for QuantizedLinear
fn clone(&self) -> QuantizedLinear
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for QuantizedLinear
impl RefUnwindSafe for QuantizedLinear
impl Send for QuantizedLinear
impl Sync for QuantizedLinear
impl Unpin for QuantizedLinear
impl UnsafeUnpin for QuantizedLinear
impl UnwindSafe for QuantizedLinear
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.