#[non_exhaustive]
pub enum QuantizationLevel {
None,
Float16,
BFloat16,
Int8,
Int4,
}
Model weight quantisation levels.
§Examples
use ai_hwaccel::QuantizationLevel;
let q = QuantizationLevel::Int8;
assert_eq!(q.bits_per_param(), 8);
assert!((q.memory_reduction_factor() - 4.0).abs() < f64::EPSILON);
Variants (Non-exhaustive)§
This enum is marked as non-exhaustive
Non-exhaustive enums may gain additional variants in the future. Therefore, when matching against variants of a non-exhaustive enum from another crate, an extra wildcard arm must be added to account for any future variants.
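The wildcard requirement can be sketched as follows. The enum here is a local stand-in that mirrors the documented variants so the snippet compiles on its own; in real code the type would come from `ai_hwaccel`. The `label` helper and its string values are hypothetical and not part of the crate.

```rust
// Local stand-in for `ai_hwaccel::QuantizationLevel` (assumption: same variants
// as documented on this page). It is not the crate's actual source.
#[non_exhaustive]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QuantizationLevel {
    None,
    Float16,
    BFloat16,
    Int8,
    Int4,
}

/// Hypothetical helper: a short label for logging.
// Inside the defining crate `#[non_exhaustive]` has no effect, so the compiler
// would flag the wildcard arm as unreachable here; the `allow` silences that.
// In a downstream crate the wildcard arm is mandatory.
#[allow(unreachable_patterns)]
pub fn label(q: QuantizationLevel) -> &'static str {
    match q {
        QuantizationLevel::None => "fp32",
        QuantizationLevel::Float16 => "fp16",
        QuantizationLevel::BFloat16 => "bf16",
        QuantizationLevel::Int8 => "int8",
        QuantizationLevel::Int4 => "int4",
        // Catches any variant added in a later release.
        _ => "unknown",
    }
}
```

Choosing a neutral fallback such as `"unknown"` keeps downstream code compiling when the crate adds a variant, at the cost of not being told about the new variant at compile time.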
None
Full precision — FP32, 32 bits per parameter.
Float16
Half precision — FP16, 16 bits per parameter.
BFloat16
Brain floating point — BF16, 16 bits per parameter.
Int8
8-bit integer quantisation.
Int4
4-bit integer quantisation (GPTQ / AWQ style).
Implementations§
impl QuantizationLevel
pub fn bits_per_param(&self) -> u32
Number of bits used per model parameter.
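A minimal sketch of the mapping this method presumably performs, based only on the variant descriptions above (32 bits for `None`, 16 for `Float16` and `BFloat16`, 8 for `Int8`, 4 for `Int4`). The `Level` enum and the free function are hypothetical stand-ins, not the crate's implementation.

```rust
// Hypothetical stand-in mirroring the documented variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    None,
    Float16,
    BFloat16,
    Int8,
    Int4,
}

/// Bits per parameter, as stated in the variant descriptions on this page.
pub fn bits_per_param(level: Level) -> u32 {
    match level {
        Level::None => 32,
        // Both 16-bit formats occupy the same storage; they differ only in
        // how the bits are split between exponent and mantissa.
        Level::Float16 | Level::BFloat16 => 16,
        Level::Int8 => 8,
        Level::Int4 => 4,
    }
}
```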
pub fn memory_reduction_factor(&self) -> f64
Memory reduction factor relative to FP32. For example, Int8 yields 4.0 (32 bits / 8 bits).
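The factor is consistent with dividing the FP32 width by the quantised width, which matches the `Int8` example above (4.0). The sketch below assumes that formula and adds a weights-only size estimate; both function names are hypothetical and are not part of `ai_hwaccel`.

```rust
/// Hypothetical: reduction factor relative to FP32 (32 bits per parameter).
/// Assumes factor = 32 / bits; a width of 0 would yield infinity.
pub fn reduction_factor(bits_per_param: u32) -> f64 {
    32.0 / f64::from(bits_per_param)
}

/// Hypothetical: bytes needed to store the weights alone, rounded up to a
/// whole byte. Ignores activations, KV cache, and per-group scale metadata.
pub fn weight_bytes(param_count: u64, bits_per_param: u32) -> u64 {
    (param_count * u64::from(bits_per_param) + 7) / 8
}
```

Under these assumptions, a 7-billion-parameter model needs about 28 GB of weights at FP32 and about 3.5 GB at 4 bits. Real GPTQ or AWQ checkpoints are somewhat larger because they also store scales and zero points.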
Trait Implementations§
impl Clone for QuantizationLevel
fn clone(&self) -> QuantizationLevel
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for QuantizationLevel
impl<'de> Deserialize<'de> for QuantizationLevel
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
impl Display for QuantizationLevel
impl Hash for QuantizationLevel
impl PartialEq for QuantizationLevel
impl Serialize for QuantizationLevel
impl TryFrom<u32> for QuantizationLevel
impl Copy for QuantizationLevel
impl Eq for QuantizationLevel
impl StructuralPartialEq for QuantizationLevel
Auto Trait Implementations§
impl Freeze for QuantizationLevel
impl RefUnwindSafe for QuantizationLevel
impl Send for QuantizationLevel
impl Sync for QuantizationLevel
impl Unpin for QuantizationLevel
impl UnsafeUnpin for QuantizationLevel
impl UnwindSafe for QuantizationLevel
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.