pub struct KVCacheConfig {
    pub dim: usize,
    pub key_bits: u8,
    pub value_bits: u8,
    pub key_strategy: QuantStrategy,
    pub seed: u64,
    pub max_tokens: usize,
}
Configuration for a quantized KV cache.
Fields
dim: usize
Vector dimension (must match the model's head dimension).
key_bits: u8
Bits per coordinate for key quantization.
value_bits: u8
Bits per coordinate for value quantization.
key_strategy: QuantStrategy
Quantization strategy for keys.
seed: u64
Random seed for quantizer initialization.
max_tokens: usize
Maximum number of tokens. When this limit is exceeded, the oldest tokens are evicted (sliding window). 0 means unlimited.
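The sliding-window behaviour described for max_tokens can be illustrated with a small stand-alone sketch. This is not the crate's cache implementation: the SlidingWindow type and its methods are hypothetical names introduced here, and the only behaviour taken from the documentation is "evict oldest when the limit is exceeded" and "0 means unlimited".

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a token store (an assumption for illustration,
/// not a type from this crate).
pub struct SlidingWindow<T> {
    entries: VecDeque<T>,
    /// 0 means unlimited, mirroring the `max_tokens` field description.
    max_tokens: usize,
}

impl<T> SlidingWindow<T> {
    pub fn new(max_tokens: usize) -> Self {
        Self { entries: VecDeque::new(), max_tokens }
    }

    /// Appends one token's entry, then evicts the oldest entries while the
    /// limit is exceeded. Returns how many entries were evicted.
    pub fn push(&mut self, entry: T) -> usize {
        self.entries.push_back(entry);
        let mut evicted = 0;
        if self.max_tokens != 0 {
            while self.entries.len() > self.max_tokens {
                self.entries.pop_front();
                evicted += 1;
            }
        }
        evicted
    }

    /// Number of entries currently held.
    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// True when no entries are held.
    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }

    /// The oldest entry still in the window, if any.
    pub fn oldest(&self) -> Option<&T> {
        self.entries.front()
    }
}
```

With a limit of 2, pushing a third entry evicts the first one; with a limit of 0 the window only grows.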
Implementations
impl KVCacheConfig
pub fn new(dim: usize) -> Self
Create a default config for a given dimension.
Defaults: 4-bit keys with Prod strategy, 4-bit values with MSE strategy.
pub fn with_key_bits(self, bits: u8) -> Self
Set key bit width.
pub fn with_value_bits(self, bits: u8) -> Self
Set value bit width.
pub fn with_key_strategy(self, strategy: QuantStrategy) -> Self
Set key quantization strategy.
pub fn with_max_tokens(self, max_tokens: usize) -> Self
Set the maximum token capacity (0 = unlimited). When the cache exceeds this limit, the oldest tokens are evicted.
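The consuming builder methods above (each takes self and returns Self) are meant to be chained. The following is a minimal sketch of that pattern using stand-in types, so that it compiles without the crate. KVCacheConfigSketch, StrategySketch and payload_bytes_per_token are hypothetical names; the defaults for seed and max_tokens (both 0 here) are assumptions, since the documentation only states the 4-bit and strategy defaults. The byte estimate counts quantized coordinates only and ignores any per-vector metadata the real cache may store.

```rust
/// Stand-in for the crate's QuantStrategy (assumed variants, for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StrategySketch {
    Mse,
    Prod,
}

/// Stand-in mirroring the documented public fields of KVCacheConfig.
#[derive(Clone, Debug, PartialEq)]
pub struct KVCacheConfigSketch {
    pub dim: usize,
    pub key_bits: u8,
    pub value_bits: u8,
    pub key_strategy: StrategySketch,
    pub seed: u64,
    pub max_tokens: usize,
}

impl KVCacheConfigSketch {
    /// Documented defaults: 4-bit keys with Prod strategy, 4-bit values.
    /// `seed: 0` and `max_tokens: 0` are assumptions of this sketch.
    pub fn new(dim: usize) -> Self {
        Self {
            dim,
            key_bits: 4,
            value_bits: 4,
            key_strategy: StrategySketch::Prod,
            seed: 0,
            max_tokens: 0,
        }
    }

    pub fn with_key_bits(mut self, bits: u8) -> Self {
        self.key_bits = bits;
        self
    }

    pub fn with_value_bits(mut self, bits: u8) -> Self {
        self.value_bits = bits;
        self
    }

    pub fn with_key_strategy(mut self, strategy: StrategySketch) -> Self {
        self.key_strategy = strategy;
        self
    }

    pub fn with_max_tokens(mut self, max_tokens: usize) -> Self {
        self.max_tokens = max_tokens;
        self
    }
}

/// Rough payload size per token: dim * (key_bits + value_bits) bits,
/// rounded up to whole bytes. Excludes any metadata or overhead.
pub fn payload_bytes_per_token(cfg: &KVCacheConfigSketch) -> usize {
    let bits = cfg.dim * (cfg.key_bits as usize + cfg.value_bits as usize);
    (bits + 7) / 8
}
```

A chain such as KVCacheConfigSketch::new(128).with_key_bits(3).with_value_bits(2).with_max_tokens(4096) leaves the strategy at its default and overrides only the named fields. Because each method consumes and returns the value, no intermediate mutable binding is needed.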
Trait Implementations
impl Clone for KVCacheConfig
fn clone(&self) -> KVCacheConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for KVCacheConfig
impl<'de> Deserialize<'de> for KVCacheConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl Freeze for KVCacheConfig
impl RefUnwindSafe for KVCacheConfig
impl Send for KVCacheConfig
impl Sync for KVCacheConfig
impl Unpin for KVCacheConfig
impl UnsafeUnpin for KVCacheConfig
impl UnwindSafe for KVCacheConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.