#[non_exhaustive]
pub struct KvCacheParams {
    pub type_k: KvCacheType,
    pub type_v: KvCacheType,
}
KV cache quantization configuration.
Controls the data type used for the attention K and V caches. llama.cpp defaults
both to F16 (GGML_TYPE_F16), which is what KvCacheParams::default() preserves.
Quantizing the KV cache (e.g. Q8_0 → ~½ the size of F16, Q4_0 → ~¼) trades a small
amount of accuracy for a large reduction in VRAM usage. At long n_ctx the KV cache
is often the dominant VRAM cost.
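The size ratios above follow from ggml's block layouts: F16 stores 2 bytes per element, Q8_0 packs 32 elements into 34 bytes, and Q4_0 packs 32 elements into 18 bytes. As a rough sketch, the following estimates the cache size for a hypothetical model shape (32 layers, 8 KV heads, head dimension 128, n_ctx of 32768). The shape and the helper `cache_bytes` are illustrative assumptions, not part of rig_llama_cpp, and the estimate ignores padding and backend overhead.

```rust
/// (elements per block, bytes per block) for three ggml types.
/// These names are local to this sketch, not part of rig_llama_cpp.
const F16: (u64, u64) = (1, 2);
const Q8_0: (u64, u64) = (32, 34); // 32 x i8 plus one f16 scale
const Q4_0: (u64, u64) = (32, 18); // 32 x 4-bit plus one f16 scale

/// Bytes for ONE cache (K or V) of a hypothetical model shape.
fn cache_bytes(n_layer: u64, n_ctx: u64, n_kv_head: u64, head_dim: u64, ty: (u64, u64)) -> u64 {
    let elements = n_layer * n_ctx * n_kv_head * head_dim;
    let (block_elems, block_bytes) = ty;
    // Round up to whole blocks.
    elements.div_ceil(block_elems) * block_bytes
}

fn main() {
    // Assumed model shape, chosen only for illustration.
    let (n_layer, n_ctx, n_kv_head, head_dim) = (32, 32_768, 8, 128);
    for (name, ty) in [("F16", F16), ("Q8_0", Q8_0), ("Q4_0", Q4_0)] {
        // Factor of 2 covers both the K and the V cache.
        let total = 2 * cache_bytes(n_layer, n_ctx, n_kv_head, head_dim, ty);
        println!("{name}: {:.2} GiB for K + V", total as f64 / (1u64 << 30) as f64);
    }
}
```

Per element, Q8_0 costs 34/64 (about 53%) of F16 and Q4_0 costs 18/64 (about 28%), which is where the "~½" and "~¼" figures come from.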
Marked #[non_exhaustive]; build via Default::default() and chain the
with_* setters:
use rig_llama_cpp::{KvCacheParams, KvCacheType};

let kv = KvCacheParams::default()
    .with_type_k(KvCacheType::Q8_0)
    .with_type_v(KvCacheType::Q8_0);
Fields (Non-exhaustive)
This struct is marked as non-exhaustive.
Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
type_k: KvCacheType
Data type for the K cache (default: KvCacheType::F16).
type_v: KvCacheType
Data type for the V cache (default: KvCacheType::F16).
Implementations
impl KvCacheParams
pub fn with_type_k(self, type_k: KvCacheType) -> Self
Override the K cache data type.
pub fn with_type_v(self, type_v: KvCacheType) -> Self
Override the V cache data type.
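To illustrate how consuming `with_*` setters of this shape behave, here is a minimal self-contained sketch. The types below are local stand-ins that mirror the documented signatures and the `Clone`, `Copy`, `Debug`, and `Default` impls. They are not the crate's source, and only the F16 and Q8_0 variants are modeled.

```rust
// Local stand-ins mirroring the documented API; NOT the crate's source.
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum KvCacheType {
    F16,
    Q8_0,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct KvCacheParams {
    type_k: KvCacheType,
    type_v: KvCacheType,
}

impl Default for KvCacheParams {
    // Matches the documented default of F16 for both caches.
    fn default() -> Self {
        Self { type_k: KvCacheType::F16, type_v: KvCacheType::F16 }
    }
}

impl KvCacheParams {
    // Each setter takes `self` by value, updates one field, and returns it.
    fn with_type_k(mut self, type_k: KvCacheType) -> Self {
        self.type_k = type_k;
        self
    }
    fn with_type_v(mut self, type_v: KvCacheType) -> Self {
        self.type_v = type_v;
        self
    }
}

fn main() {
    let base = KvCacheParams::default();
    // Only the K field changes; V keeps its default.
    let k_only = base.with_type_k(KvCacheType::Q8_0);
    // `base` is still usable after the call because the type is Copy.
    let both = base.with_type_k(KvCacheType::Q8_0).with_type_v(KvCacheType::Q8_0);
    println!("{base:?}\n{k_only:?}\n{both:?}");
}
```

Because the struct is `Copy`, a by-value setter does not consume the original binding, so one base configuration can be reused to derive several variants.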
Trait Implementations
impl Clone for KvCacheParams
fn clone(&self) -> KvCacheParams
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for KvCacheParams
impl Default for KvCacheParams
impl Copy for KvCacheParams
Auto Trait Implementations
impl Freeze for KvCacheParams
impl RefUnwindSafe for KvCacheParams
impl Send for KvCacheParams
impl Sync for KvCacheParams
impl Unpin for KvCacheParams
impl UnsafeUnpin for KvCacheParams
impl UnwindSafe for KvCacheParams
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.