pub enum KvCacheDtype {
    Fp16,
    Bf16,
    Int8,
    Fp8,
}
KV Cache element dtype (Dim 5 polymorphism point).
Mirrors ferrum_interfaces::kv_dtype::KvDtypeKind markers but
lives here because KvCacheConfig is part of the user-facing
EngineConfig and needs Serialize / Deserialize.
Variants
Fp16
FP16 K/V — the validated production path on every backend.
Bf16
BF16 K/V: same memory cost as FP16 (2 bytes per element), with a wider exponent range but fewer mantissa bits (8 exponent / 7 mantissa versus 5 / 10). Marker only; no backend implementation ships yet.
Int8
INT8 K/V with a per-token, per-kv-head FP16 scale (vLLM-style).
Roughly halves KV memory relative to FP16 at a small (<1%) accuracy
cost. CUDA kernels land via BackendKvDtype<KvInt8> (PR #131); model
wire-up (KvCacheQuant<B, KvInt8> through the model decode loop) is
the only remaining step.
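To make the "per-token, per-kv-head scale" idea concrete, here is a minimal CPU-side sketch of quantizing one head_dim-sized row. It assumes symmetric absmax scaling, which the documentation above does not confirm, and it stores the scale as f32 because the Rust standard library has no stable f16 type. QuantizedRow, quantize_row and dequantize_row are hypothetical names for illustration and are not part of this crate or of the CUDA kernels.

```rust
/// Quantized values for one (token, kv-head) row plus its scale.
/// Hypothetical type for illustration; not part of the crate.
#[derive(Debug, Clone, PartialEq)]
pub struct QuantizedRow {
    pub values: Vec<i8>,
    /// Kept as f32 in this sketch; the docs describe an FP16 scale.
    pub scale: f32,
}

/// Symmetric absmax quantization of one row (assumed scheme).
pub fn quantize_row(row: &[f32]) -> QuantizedRow {
    // Largest magnitude in the row determines the scale.
    let absmax = row.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    // An all-zero row gets scale 1.0 so dequantization stays well defined.
    let scale = if absmax == 0.0 { 1.0 } else { absmax / 127.0 };
    let values = row
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    QuantizedRow { values, scale }
}

/// Recover approximate f32 values from the quantized row.
pub fn dequantize_row(q: &QuantizedRow) -> Vec<f32> {
    q.values.iter().map(|&v| f32::from(v) * q.scale).collect()
}
```

Under this scheme each element is off by at most about half a scale step after a round trip, which is one way to reason about where a small accuracy cost would come from.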
Fp8
FP8 (E4M3) K/V. Marker only; CUDA kernels pending.
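The memory claims in the variant descriptions can be checked with simple arithmetic. The sketch below uses a local mirror of the enum so that it compiles on its own; element_bytes and kv_bytes_per_token are hypothetical helpers, not crate APIs. It assumes one 2-byte scale per (token, kv-head) for K and one for V in the INT8 case, and no scale storage for FP8, since the source does not describe an FP8 scaling scheme.

```rust
/// Local mirror of the documented enum, so the sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KvCacheDtype {
    Fp16,
    Bf16,
    Int8,
    Fp8,
}

/// Bytes used by a single cached K or V element.
pub fn element_bytes(dtype: KvCacheDtype) -> usize {
    match dtype {
        KvCacheDtype::Fp16 | KvCacheDtype::Bf16 => 2,
        KvCacheDtype::Int8 | KvCacheDtype::Fp8 => 1,
    }
}

/// Bytes of K plus V for one token in one layer (hypothetical helper).
pub fn kv_bytes_per_token(dtype: KvCacheDtype, num_kv_heads: usize, head_dim: usize) -> usize {
    // Factor 2 covers both the K tensor and the V tensor.
    let payload = 2 * num_kv_heads * head_dim * element_bytes(dtype);
    // Assumption: INT8 stores one 2-byte scale per kv-head for K and one for V.
    let scales = match dtype {
        KvCacheDtype::Int8 => 2 * num_kv_heads * 2,
        _ => 0,
    };
    payload + scales
}
```

With 8 KV heads and a head dimension of 128, this gives 4096 bytes per token per layer for FP16 or BF16 and 2080 bytes for INT8, so the scale overhead keeps INT8 slightly above an exact half.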
Implementations
Trait Implementations
impl Clone for KvCacheDtype
fn clone(&self) -> KvCacheDtype
Returns a duplicate of the value.
1.0.0 (const: unstable) · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Copy for KvCacheDtype
impl Debug for KvCacheDtype
impl Default for KvCacheDtype
fn default() -> KvCacheDtype
Returns the “default value” for a type.
impl<'de> Deserialize<'de> for KvCacheDtype
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
impl Eq for KvCacheDtype
impl PartialEq for KvCacheDtype
fn eq(&self, other: &KvCacheDtype) -> bool
Tests for self and other values to be equal, and is used by ==.
impl Serialize for KvCacheDtype
impl StructuralPartialEq for KvCacheDtype
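The trait implementations listed above are what make the enum convenient as a configuration value: it can be copied freely, compared with ==, and printed for logs. The sketch below shows this on a local mirror of the enum, derived with the standard library traits only (the serde derives are left out so the block needs no external crates). Marking Fp16 as the default is an assumption; the page states that a Default impl exists but not which variant it returns. same_dtype and label are hypothetical helpers.

```rust
/// Local mirror of the documented enum, derived with std traits only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum KvCacheDtype {
    // Assumption: the source does not state which variant is the default.
    #[default]
    Fp16,
    Bf16,
    Int8,
    Fp8,
}

/// True when two configs would use the same KV element type (uses PartialEq).
pub fn same_dtype(a: KvCacheDtype, b: KvCacheDtype) -> bool {
    a == b
}

/// Human-readable variant name, produced through the Debug impl.
pub fn label(dtype: KvCacheDtype) -> String {
    format!("{:?}", dtype)
}
```

Because the type is Copy, passing a value to same_dtype or label does not move it, so the caller can keep using the original afterwards.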
Auto Trait Implementations
impl Freeze for KvCacheDtype
impl RefUnwindSafe for KvCacheDtype
impl Send for KvCacheDtype
impl Sync for KvCacheDtype
impl Unpin for KvCacheDtype
impl UnsafeUnpin for KvCacheDtype
impl UnwindSafe for KvCacheDtype
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.