#[repr(u8)]
pub enum QuantScheme {
    QInt8 = 1,
    QInt4 = 2,
    Binary = 3,
    FP16Passthrough = 4,
}
Quantization scheme for embedding vectors
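Because the enum is `#[repr(u8)]` with explicit discriminants, each variant maps to a stable one-byte tag. The sketch below shows one way such a tag could be written and read back. It redeclares the enum locally so that it compiles on its own, and `scheme_to_tag` and `scheme_from_tag` are hypothetical helper names, not functions documented for this crate.

```rust
/// Local stand-in mirroring the declaration above, so the example is self-contained.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QuantScheme {
    QInt8 = 1,
    QInt4 = 2,
    Binary = 3,
    FP16Passthrough = 4,
}

/// Hypothetical helper: the one-byte tag is simply the discriminant.
pub fn scheme_to_tag(scheme: QuantScheme) -> u8 {
    scheme as u8
}

/// Hypothetical helper: maps a raw byte back to a scheme.
/// Unknown tags (including 0) yield `None` instead of being transmuted.
pub fn scheme_from_tag(tag: u8) -> Option<QuantScheme> {
    match tag {
        1 => Some(QuantScheme::QInt8),
        2 => Some(QuantScheme::QInt4),
        3 => Some(QuantScheme::Binary),
        4 => Some(QuantScheme::FP16Passthrough),
        _ => None,
    }
}
```

An explicit `match` is used rather than `std::mem::transmute` so that an out-of-range byte can never produce an invalid enum value.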
Variants
QInt8 = 1
8-bit signed integer quantization
- Range: -128 to 127
- Size reduction: 4x (F32 → Int8)
- Accuracy: Very high
QInt4 = 2
4-bit packed quantization (future)
- Range: -8 to 7
- Size reduction: 8x (F32 → 4-bit)
- Accuracy: High
Binary = 3
1-bit binary quantization (future)
- Range: -1 or 1 (sign-based)
- Size reduction: 32x (F32 → 1-bit)
- Accuracy: Moderate
FP16Passthrough = 4
FP16 passthrough (future)
- Range: Half-precision float
- Size reduction: 2x (F32 → F16)
- Accuracy: Very high
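To make the QInt8 entry concrete, here is a minimal sketch of one common way to map F32 values into the -128 to 127 range: symmetric quantization with a single per-vector scale. This page does not say which method the crate actually uses, so treat the approach and the names `quantize_i8` and `dequantize_i8` as assumptions for illustration only.

```rust
/// Hypothetical helper: symmetric per-vector quantization of f32 values to i8.
/// Returns the quantized values and the scale needed to reconstruct them.
pub fn quantize_i8(values: &[f32]) -> (Vec<i8>, f32) {
    // Largest magnitude in the vector; NaN inputs are ignored by `f32::max`.
    let max_abs = values.iter().fold(0.0_f32, |m, v| m.max(v.abs()));
    // Fall back to a scale of 1.0 for an all-zero vector to avoid dividing by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let quantized = values
        .iter()
        .map(|v| (v / scale).round().clamp(-128.0, 127.0) as i8)
        .collect();
    (quantized, scale)
}

/// Hypothetical helper: approximate reconstruction of the original f32 values.
pub fn dequantize_i8(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| f32::from(q) * scale).collect()
}
```

Each stored value takes one byte instead of four, which is where the 4x size reduction comes from. The scale is an extra four bytes per vector under this scheme.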
Implementations
impl QuantScheme
pub fn bytes_per_value(self) -> usize
Returns the expected bytes per value for this quantization scheme
pub fn compression_ratio(self) -> f32
Returns the compression ratio compared to F32 (4 bytes)
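The ratios listed for each variant (4x, 8x, 32x, 2x) follow from dividing the 32 bits of an F32 by the bits stored per value. The sketch below reproduces that arithmetic on a locally redeclared enum. It is not the crate's implementation: `bits_per_value`, `ratio_vs_f32`, and `packed_len_bytes` are hypothetical names, and working in bits is an assumption made here because this page does not say how `bytes_per_value` represents the sub-byte schemes as a `usize`.

```rust
/// Local stand-in mirroring the documented enum, so the example is self-contained.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QuantScheme {
    QInt8 = 1,
    QInt4 = 2,
    Binary = 3,
    FP16Passthrough = 4,
}

/// Hypothetical helper: storage width of one value, in bits.
pub fn bits_per_value(scheme: QuantScheme) -> u32 {
    match scheme {
        QuantScheme::QInt8 => 8,
        QuantScheme::QInt4 => 4,
        QuantScheme::Binary => 1,
        QuantScheme::FP16Passthrough => 16,
    }
}

/// Hypothetical helper: size reduction relative to a 32-bit float.
pub fn ratio_vs_f32(scheme: QuantScheme) -> f32 {
    32.0 / bits_per_value(scheme) as f32
}

/// Hypothetical helper: bytes needed to store `dims` values, rounding up
/// to a whole byte for the packed 4-bit and 1-bit schemes.
pub fn packed_len_bytes(scheme: QuantScheme, dims: usize) -> usize {
    (dims * bits_per_value(scheme) as usize + 7) / 8
}
```

For example, a 768-dimensional vector would occupy 768 bytes as QInt8 and 96 bytes as Binary under these assumptions, compared with 3072 bytes as F32.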
Trait Implementations
impl Clone for QuantScheme
fn clone(&self) -> QuantScheme
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for QuantScheme
impl<'de> Deserialize<'de> for QuantScheme
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
impl PartialEq for QuantScheme
impl Serialize for QuantScheme
impl Copy for QuantScheme
impl Eq for QuantScheme
impl StructuralPartialEq for QuantScheme
Auto Trait Implementations
impl Freeze for QuantScheme
impl RefUnwindSafe for QuantScheme
impl Send for QuantScheme
impl Sync for QuantScheme
impl Unpin for QuantScheme
impl UnwindSafe for QuantScheme
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.