pub struct QLoraLinear { /* private fields */ }
QLoRA: Quantized LoRA for memory-efficient fine-tuning.
The base weight matrix is quantized to 4-bit NF4 format with group-wise quantization, while the LoRA matrices A and B remain in full fp32 precision. Storing 4 bits per weight instead of 32 cuts the memory used by the frozen base weights by close to 8x, less the per-group quantization metadata.
Implementations
impl QLoraLinear
pub fn from_weight(
    weight: Array2<f32>,
    rank: usize,
    alpha: f32,
    group_size: usize,
) -> ModelResult<Self>
Create a QLoRA layer from a full-precision weight matrix.
The weight is quantized to 4-bit NF4 format with the given group size. LoRA matrices are initialized as in standard LoRA (A=Kaiming, B=zeros).
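The initialization described above can be sketched without the crate. This is a minimal illustration, not the crate's code: `LoraInit`, `init_lora`, and the `XorShift64` generator are hypothetical names, the matrix shapes (A is rank x in_features, B is out_features x rank) are assumed, and the Kaiming-uniform bound `sqrt(6 / in_features)` is one common choice that may differ from what `from_weight` uses.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Keep the top 24 bits so the value is exactly representable in f32.
        (self.0 >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// Hypothetical container: `a` is rank x in_features, `b` is out_features x rank.
struct LoraInit {
    a: Vec<Vec<f32>>,
    b: Vec<Vec<f32>>,
}

/// A is drawn uniformly from [-bound, bound); B starts at zero.
fn init_lora(in_features: usize, out_features: usize, rank: usize, seed: u64) -> LoraInit {
    // Assumed Kaiming-uniform bound with fan_in = in_features.
    let bound = (6.0 / in_features as f32).sqrt();
    // xorshift must not be seeded with zero.
    let mut rng = XorShift64(seed.max(1));
    let a: Vec<Vec<f32>> = (0..rank)
        .map(|_| {
            (0..in_features)
                .map(|_| (rng.next_f32() * 2.0 - 1.0) * bound)
                .collect()
        })
        .collect();
    let b = vec![vec![0.0f32; rank]; out_features];
    LoraInit { a, b }
}
```

Because B starts at zero, the product B @ (A @ x) is zero for every input, so a freshly created layer behaves like the quantized base layer until training updates B.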
pub fn dequantize_weight(&self) -> ModelResult<Array2<f32>>
Dequantize the weight matrix back to full precision.
This is an approximate reconstruction, because quantization is lossy.
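To see why the reconstruction is only approximate, here is a small round-trip sketch for a single quantization group. It is written under stated assumptions: the 16-entry table holds the NF4 levels as commonly published for QLoRA, scaling is plain absmax per group, and `quantize_group` and `dequantize_group` are hypothetical helpers. The crate also stores per-group zero points, which this sketch leaves out.

```rust
/// NF4 levels as commonly published; assumed here, the crate's table may differ.
const NF4: [f32; 16] = [
    -1.0,
    -0.6961928009986877,
    -0.5250730514526367,
    -0.39491748809814453,
    -0.28444138169288635,
    -0.18477343022823334,
    -0.09105003625154495,
    0.0,
    0.07958029955625534,
    0.16093020141124725,
    0.24611230194568634,
    0.33791524171829224,
    0.44070982933044434,
    0.5626170039176941,
    0.7229568362236023,
    1.0,
];

/// Maps each value in the group to the index of the nearest NF4 level
/// after dividing by the group's absolute maximum.
fn quantize_group(group: &[f32]) -> (Vec<u8>, f32) {
    let absmax = group.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // Avoid dividing by zero for an all-zero group.
    let scale = if absmax > 0.0 { absmax } else { 1.0 };
    let codes: Vec<u8> = group
        .iter()
        .map(|&v| {
            let t = v / scale;
            let mut best = 0usize;
            for (i, &level) in NF4.iter().enumerate() {
                if (level - t).abs() < (NF4[best] - t).abs() {
                    best = i;
                }
            }
            best as u8
        })
        .collect();
    (codes, scale)
}

/// Looks up each 4-bit code and rescales it; the original values are not recovered exactly.
fn dequantize_group(codes: &[u8], scale: f32) -> Vec<f32> {
    codes.iter().map(|&c| NF4[c as usize] * scale).collect()
}
```

Values that coincide with a level after scaling (the group extreme and zero) survive the round trip, while everything else snaps to the nearest of 16 levels. A real implementation would typically also pack two codes into each byte.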
pub fn forward(&self, input: &Array1<f32>) -> ModelResult<Array1<f32>>
Forward pass: dequantizes the weight, then computes W @ x + scaling * B @ (A @ x).
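The formula above can be written out with plain vectors. This is a sketch, not the crate's implementation: `matvec` and `qlora_forward` are hypothetical helpers, `w` stands for the already dequantized weight, and `scaling` is passed in directly (standard LoRA uses alpha / rank, which is assumed rather than confirmed for this type).

```rust
/// Multiplies a row-major matrix by a vector.
fn matvec(m: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>())
        .collect()
}

/// Computes W @ x + scaling * B @ (A @ x).
/// `w` is out x in, `a` is rank x in, `b` is out x rank.
fn qlora_forward(
    w: &[Vec<f32>],
    a: &[Vec<f32>],
    b: &[Vec<f32>],
    scaling: f32,
    x: &[f32],
) -> Vec<f32> {
    let base = matvec(w, x);
    // Applying A first keeps the intermediate vector at length `rank`.
    let low_rank = matvec(b, &matvec(a, x));
    base.iter()
        .zip(&low_rank)
        .map(|(y, d)| y + scaling * d)
        .collect()
}
```

Evaluating B @ (A @ x) rather than (B @ A) @ x avoids materializing an out x in update matrix, which is the point of the low-rank factorization.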
pub fn memory_saved_bytes(&self) -> usize
Memory saved compared to storing full fp32 weights, in bytes.
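A rough version of this arithmetic, under an assumed storage layout, looks like the following. The layout is a guess for illustration only: two 4-bit codes packed per byte, plus one f32 scale and one f32 zero point per group over the flattened matrix. The function names are hypothetical, and the value returned by `memory_saved_bytes` may be computed differently.

```rust
/// Bytes needed to store the matrix in fp32.
fn fp32_bytes(rows: usize, cols: usize) -> usize {
    rows * cols * 4
}

/// Bytes for the assumed NF4 layout: packed 4-bit codes plus
/// an f32 scale and an f32 zero point for every group.
fn nf4_bytes(rows: usize, cols: usize, group_size: usize) -> usize {
    let n = rows * cols;
    let packed_codes = (n + 1) / 2;
    let groups = (n + group_size - 1) / group_size;
    packed_codes + groups * 2 * 4
}

/// Difference between the two, never underflowing.
fn memory_saved(rows: usize, cols: usize, group_size: usize) -> usize {
    fp32_bytes(rows, cols).saturating_sub(nf4_bytes(rows, cols, group_size))
}
```

For a 4096 x 4096 matrix with a group size of 64, this layout needs 10,485,760 bytes instead of 67,108,864, a saving of 56,623,104 bytes. Smaller groups track the weights more closely but add more metadata.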
pub fn trainable_params(&self) -> usize
Number of trainable parameters (the LoRA matrices A and B).
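For a sense of scale, the count can be worked out from the layer dimensions. The sketch assumes A has shape rank x in_features and B has shape out_features x rank. The helper names are hypothetical and are not part of this type's API.

```rust
/// Entries in A (rank x in_features) plus entries in B (out_features x rank).
fn lora_param_count(in_features: usize, out_features: usize, rank: usize) -> usize {
    rank * in_features + out_features * rank
}

/// Trainable parameters as a fraction of the frozen base matrix.
fn trainable_fraction(in_features: usize, out_features: usize, rank: usize) -> f64 {
    let base = (in_features * out_features) as f64;
    lora_param_count(in_features, out_features, rank) as f64 / base
}
```

With in_features = out_features = 4096 and rank = 8, this gives 65,536 trainable parameters, about 0.39% of the 16,777,216 frozen base weights.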
pub fn group_size(&self) -> usize
Returns the quantization group size.
pub fn out_features(&self) -> usize
Returns the number of output features.
pub fn in_features(&self) -> usize
Returns the number of input features.
pub fn zero_point(&self) -> &Array1<f32>
Returns the per-group zero points.
Trait Implementations
impl Clone for QLoraLinear
fn clone(&self) -> QLoraLinear
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl Freeze for QLoraLinear
impl RefUnwindSafe for QLoraLinear
impl Send for QLoraLinear
impl Sync for QLoraLinear
impl Unpin for QLoraLinear
impl UnsafeUnpin for QLoraLinear
impl UnwindSafe for QLoraLinear
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.