pub struct KVCacheConfig {
    pub max_seq_len: usize,
    pub num_heads: usize,
    pub head_dim: usize,
    pub quantization_bits: u8,
    pub eviction_policy: EvictionPolicy,
}
Configuration for the quantized KV-cache.
Fields§
max_seq_len: usize
Maximum sequence length the cache can hold before eviction is required.
num_heads: usize
Number of attention heads.
head_dim: usize
Dimension per attention head.
quantization_bits: u8
Bit-width for quantization. Supported values: 2, 3, 4, and 8.
eviction_policy: EvictionPolicy
Policy used when the cache exceeds its budget.
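As a rough illustration of how these fields combine, the sketch below builds a configuration and estimates the quantized payload size of a full cache. It is not the crate's own code. The struct is mirrored locally so the example compiles on its own, the `EvictionPolicy::SlidingWindow` variant is a placeholder (the real variants are not listed on this page), and the helpers `is_supported_bits` and `estimated_payload_bytes` are hypothetical. The estimate assumes one key tensor and one value tensor for a single layer and ignores quantization scales, zero-points, and padding.

```rust
// Placeholder for the crate's `EvictionPolicy`; the variant name is an
// assumption made for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub enum EvictionPolicy {
    SlidingWindow,
}

// Local mirror of the documented struct so the sketch is self-contained.
#[derive(Clone, Debug)]
pub struct KVCacheConfig {
    pub max_seq_len: usize,
    pub num_heads: usize,
    pub head_dim: usize,
    pub quantization_bits: u8,
    pub eviction_policy: EvictionPolicy,
}

/// Hypothetical helper: true if `bits` is one of the documented widths.
pub fn is_supported_bits(bits: u8) -> bool {
    matches!(bits, 2 | 3 | 4 | 8)
}

/// Hypothetical helper: bytes needed for the quantized keys and values of a
/// full cache (one layer), excluding scales, zero-points, and padding.
pub fn estimated_payload_bytes(cfg: &KVCacheConfig) -> usize {
    // Two tensors (K and V), each max_seq_len x num_heads x head_dim.
    let elements = 2 * cfg.max_seq_len * cfg.num_heads * cfg.head_dim;
    let bits = elements * cfg.quantization_bits as usize;
    (bits + 7) / 8 // round up to whole bytes
}

fn main() {
    let cfg = KVCacheConfig {
        max_seq_len: 4096,
        num_heads: 32,
        head_dim: 128,
        quantization_bits: 4,
        eviction_policy: EvictionPolicy::SlidingWindow,
    };
    assert!(is_supported_bits(cfg.quantization_bits));
    // 2 * 4096 * 32 * 128 elements at 4 bits each = 16 MiB.
    println!("{} bytes", estimated_payload_bytes(&cfg)); // prints "16777216 bytes"
}
```

Under the same assumptions, halving `quantization_bits` halves the payload, so the bit-width and `max_seq_len` are the two fields that dominate the memory budget.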
Trait Implementations§
impl Clone for KVCacheConfig
fn clone(&self) -> KVCacheConfig
Returns a duplicate of the value.
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for KVCacheConfig
impl RefUnwindSafe for KVCacheConfig
impl Send for KVCacheConfig
impl Sync for KVCacheConfig
impl Unpin for KVCacheConfig
impl UnsafeUnpin for KVCacheConfig
impl UnwindSafe for KVCacheConfig
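Because the configuration is `Send` and `Sync`, it can be shared across threads without extra synchronization. The sketch below shows one way to do that, assuming a locally mirrored struct and a placeholder `EvictionPolicy::SlidingWindow` variant. The functions `assert_send_sync` and `per_token_width_from_workers` are hypothetical names introduced for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Placeholder for the crate's `EvictionPolicy`; the variant is an assumption.
#[derive(Clone, Debug)]
pub enum EvictionPolicy {
    SlidingWindow,
}

// Local mirror of the documented struct so the sketch is self-contained.
#[derive(Clone, Debug)]
pub struct KVCacheConfig {
    pub max_seq_len: usize,
    pub num_heads: usize,
    pub head_dim: usize,
    pub quantization_bits: u8,
    pub eviction_policy: EvictionPolicy,
}

// Compiles only if `T` is both `Send` and `Sync`.
pub fn assert_send_sync<T: Send + Sync>() {}

/// Hypothetical helper: each worker reads the shared config and returns
/// num_heads * head_dim, the number of elements per token in K (or V).
pub fn per_token_width_from_workers(cfg: KVCacheConfig, workers: usize) -> Vec<usize> {
    let shared = Arc::new(cfg);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let cfg = Arc::clone(&shared);
            thread::spawn(move || cfg.num_heads * cfg.head_dim)
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    assert_send_sync::<KVCacheConfig>();
    let cfg = KVCacheConfig {
        max_seq_len: 4096,
        num_heads: 32,
        head_dim: 128,
        quantization_bits: 4,
        eviction_policy: EvictionPolicy::SlidingWindow,
    };
    let widths = per_token_width_from_workers(cfg, 3);
    println!("{:?}", widths);
}
```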
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.