pub struct PendingIsqLayer { /* private fields */ }
A wrapper around a QuantMethod that resolves lazily from a background
quantization task. Created by apply_immediate_isq when a thread pool is
available for parallel immediate ISQ.
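The lazy resolution described above can be illustrated with a minimal, standard-library-only sketch. It is not the crate's implementation: the `Pending` type and its `spawn` and `get` methods are hypothetical names chosen for illustration, and a plain thread stands in for the thread pool.

```rust
use std::sync::mpsc::{self, Receiver};
use std::sync::{Mutex, OnceLock};
use std::thread;

/// Hypothetical stand-in for a layer whose value is produced in the background.
struct Pending<T> {
    /// Receiver for the background task's result; taken on first resolution.
    rx: Mutex<Option<Receiver<T>>>,
    /// Cached result, filled in by the first call to `get`.
    resolved: OnceLock<T>,
}

impl<T: Send + 'static> Pending<T> {
    /// Start `task` on a background thread and return immediately.
    fn spawn(task: impl FnOnce() -> T + Send + 'static) -> Self {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            // The send fails only if the `Pending` was dropped first; ignore that.
            let _ = tx.send(task());
        });
        Self {
            rx: Mutex::new(Some(rx)),
            resolved: OnceLock::new(),
        }
    }

    /// Block until the background result is available, then serve it from the cache.
    fn get(&self) -> &T {
        self.resolved.get_or_init(|| {
            let rx = self
                .rx
                .lock()
                .unwrap()
                .take()
                .expect("receiver is taken only once");
            rx.recv().expect("background task ended without a result")
        })
    }
}
```

In this sketch the first call to `get` waits for the background task, and later calls return the cached value without blocking.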
Implementations§
Trait Implementations§
impl Debug for PendingIsqLayer
impl QuantMethod for PendingIsqLayer
fn new(_method: QuantMethodConfig) -> Result<Self>
where
    Self: Sized,
fn dequantize_w(&self) -> Result<Tensor>
fn forward(&self, a: &Tensor) -> Result<Tensor>
Compute matmul of self and a. self should contain the weights.
fn forward_autocast(&self, a: &Tensor) -> Result<Tensor>
Compute matmul of self and a. self should contain the weights.
Automatically cast to required quantization activation type and back.
fn gather_forward_autocast(&self, a: &Tensor, indices: &Tensor) -> Result<Tensor>
Compute matmul of self and a. self should contain the weights.
Automatically cast to required quantization activation type and back.
fn quantized_act_type(&self) -> Option<DType>
If a quantized method, return the activation dtype.
fn dtype_and_device(&self) -> (DType, Device)
Weight dtype and device
fn add_delta_w(&self, delta: &Tensor) -> Result<Arc<dyn QuantMethod>>
Add a delta weight from LoRA to the weights. This should be prescaled with alpha.
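As a rough illustration of what "prescaled with alpha" can mean, the sketch below builds a delta using the usual LoRA convention, delta = (alpha / r) * B * A, and adds it to a weight matrix. The `Mat` alias and both helper functions are hypothetical stand-ins using nested `Vec`s rather than `Tensor`, and the alpha / r scaling is an assumption based on the standard LoRA formulation, not something this page specifies.

```rust
/// Row-major matrix as nested Vecs; a hypothetical stand-in for `Tensor`.
type Mat = Vec<Vec<f32>>;

/// Compute `(alpha / r) * B * A`, where `b` is (out x r) and `a` is (r x in).
/// Assumes `a` has at least one row and that the shapes agree.
fn prescaled_delta(b: &Mat, a: &Mat, alpha: f32) -> Mat {
    let r = a.len();
    let scale = alpha / r as f32;
    b.iter()
        .map(|b_row| {
            (0..a[0].len())
                .map(|j| scale * (0..r).map(|k| b_row[k] * a[k][j]).sum::<f32>())
                .collect()
        })
        .collect()
}

/// Return `w + delta` elementwise, leaving `w` untouched.
fn add_delta(w: &Mat, delta: &Mat) -> Mat {
    w.iter()
        .zip(delta)
        .map(|(w_row, d_row)| w_row.iter().zip(d_row).map(|(x, d)| x + d).collect())
        .collect()
}
```

The point of the sketch is only that the scaling happens before the delta is handed over, so the addition itself is a plain elementwise sum.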
fn apply_isq(
    self: Arc<Self>,
    dtype: Option<IsqType>,
    device: Device,
    n_quantized: &AtomicUsize,
    imatrix_weight: Option<Vec<f32>>,
    guard: QuantizeOntoGuard,
) -> Result<Arc<dyn QuantMethod>>
If the quant is backed by a qmatmul.
fn unquant_weight_bias(&self) -> Option<(Tensor, Option<Tensor>)>
fn begin_track_stats(&mut self) -> Result<()>
Begin tracking stats into an ImatrixLayerStats
fn end_track_stats(&self) -> Result<Tensor>
End tracking stats into an ImatrixLayerStats. Returns the computed imatrix.
fn is_distributed(&self) -> Option<DistributedKind>
impl QuantizedSerde for PendingIsqLayer
fn name(&self) -> &'static str
fn isq_serde_supported(&self) -> bool
fn serialize(&self) -> Result<Cow<'_, [u8]>>
fn serialize_with_bias(&self, bias: Option<Tensor>) -> Result<Cow<'_, [u8]>>
NOT meant for external calling
fn deserialize(
    _data: Cow<'_, [u8]>,
    _device: &Device,
    _comm: &Arc<Comm>,
    _guard: QuantizeOntoGuard,
) -> Result<Arc<dyn QuantMethod>>
where
    Self: Sized,
fn deserialize_ext_bias(
    _data: Cow<'_, [u8]>,
    _device: &Device,
    _guard: QuantizeOntoGuard,
) -> Result<(Arc<dyn QuantMethod>, Option<Tensor>)>
where
    Self: Sized,
Auto Trait Implementations§
impl !Freeze for PendingIsqLayer
impl RefUnwindSafe for PendingIsqLayer
impl Send for PendingIsqLayer
impl Sync for PendingIsqLayer
impl Unpin for PendingIsqLayer
impl UnsafeUnpin for PendingIsqLayer
impl UnwindSafe for PendingIsqLayer
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.