pub struct LlamaKVCacheInterface { /* private fields */ }
Interface for interacting with llama.cpp’s KV cache via the llama-server HTTP API.
Construct with LlamaKVCacheInterface::with_backend(url) when the server URL is known.
The default constructor creates an instance with no URL, which falls back to safe defaults
for all read operations and no-ops for mutating operations.
Implementations§
impl LlamaKVCacheInterface
pub fn new() -> Self
pub fn with_backend(backend_url: String) -> Self
Create an interface pre-wired to a running llama-server.
pub async fn get_current_cache_state(&self) -> Result<LlamaKVCacheState>
Get current KV cache state from llama-server.
Queries GET /slots and returns data for slot 0 (the default interactive
slot). Falls back to a zero-filled state when the server is unreachable.
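The fallback described above can be sketched with `Result::unwrap_or_default`. This is only an illustration: `SlotStateSketch`, its fields, and `state_or_zero` are hypothetical stand-ins, since the real fields of `LlamaKVCacheState` are not shown on this page.

```rust
/// Hypothetical stand-in for `LlamaKVCacheState`; the field names are
/// assumptions made for this sketch, not the crate's real layout.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SlotStateSketch {
    pub tokens_cached: usize,
    pub memory_bytes: usize,
}

/// Returns the fetched state, or a zero-filled state when the fetch failed
/// (for example because the server is unreachable).
pub fn state_or_zero<E>(fetched: Result<SlotStateSketch, E>) -> SlotStateSketch {
    // `Default` on the struct yields all-zero fields, which serves as the fallback.
    fetched.unwrap_or_default()
}
```

With this shape, callers never have to handle a connection error themselves, at the cost of not being able to tell an empty cache from an unreachable server.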
pub async fn extract_current_kv_entries(&self) -> Result<Vec<KVEntry>>
Extract current KV cache metadata from llama-server.
The llama-server HTTP API does not expose raw tensor data; this method
constructs representative KVEntry descriptors from slot state data so
the rest of the cache management pipeline has something to work with.
Importance scores are derived from token position: early tokens (system
prompt, opening context) score higher than later ones.
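One way such a position-based score could look is sketched below. The linear decay and the name `position_importance` are assumptions for illustration; this page does not state the formula the crate actually uses.

```rust
/// Hypothetical position-based importance score.
/// Assumption: linear decay from 1.0 at the first token towards
/// 1/total_tokens at the last token. The crate's real formula may differ.
pub fn position_importance(position: usize, total_tokens: usize) -> f32 {
    if total_tokens == 0 {
        // No tokens cached, so nothing can be important.
        return 0.0;
    }
    // Clamp out-of-range positions to the last valid index so the
    // subtraction below cannot underflow.
    let pos = position.min(total_tokens - 1);
    (total_tokens - pos) as f32 / total_tokens as f32
}
```

Under this sketch the first token of a four-token context scores 1.0 and the last scores 0.25, so system-prompt tokens are the last candidates for eviction.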
pub async fn inject_kv_entries(&self, entries: &[KVEntry]) -> Result<()>
Restore the KV cache for slot 0 from a previously saved file.
Uses llama-server’s POST /slots/0 with {"action": "restore", "filename": "..."}.
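As a rough sketch, the request body quoted above could be assembled as follows using only the standard library. `restore_request_body` is a hypothetical helper, not part of this crate; a real implementation would more likely build the payload with a JSON library.

```rust
/// Hypothetical helper that builds the restore payload described above.
/// The filename is escaped by hand so the result stays valid JSON.
pub fn restore_request_body(filename: &str) -> String {
    let mut escaped = String::with_capacity(filename.len());
    for ch in filename.chars() {
        match ch {
            '"' => escaped.push_str("\\\""),
            '\\' => escaped.push_str("\\\\"),
            // Control characters must be written as \uXXXX escapes in JSON.
            c if (c as u32) < 0x20 => escaped.push_str(&format!("\\u{:04x}", c as u32)),
            c => escaped.push(c),
        }
    }
    format!("{{\"action\": \"restore\", \"filename\": \"{}\"}}", escaped)
}
```

Escaping matters here because the filename comes from the caller; an unescaped quote would otherwise produce a malformed request body.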
pub async fn clear_cache_entries(
    &self,
    layer_indices: &[i32],
    _head_indices: &[Option<i32>],
) -> Result<()>
Erase the KV cache for slot 0 via POST /slots/0 with {"action": "erase"}.
layer_indices and _head_indices are accepted for API compatibility, but the
llama-server erase action clears the entire slot (partial-layer erase is not
supported via HTTP).
pub async fn get_cache_memory_usage(&self) -> Result<usize>
Estimate memory used by the KV cache from slot state.
Uses the same heuristic as get_current_cache_state: 32 KB per token.
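The heuristic amounts to a single multiplication, sketched below. `BYTES_PER_TOKEN` and `estimate_kv_cache_bytes` are hypothetical names, and reading "KB" as 1024 bytes is an assumption.

```rust
/// Assumed per-token cost from the "32 KB per token" heuristic,
/// taking 1 KB = 1024 bytes (an assumption; the page does not say).
const BYTES_PER_TOKEN: usize = 32 * 1024;

/// Hypothetical helper: estimated KV cache size in bytes for a token count.
pub fn estimate_kv_cache_bytes(cached_tokens: usize) -> usize {
    // Saturate instead of overflowing on absurdly large token counts.
    cached_tokens.saturating_mul(BYTES_PER_TOKEN)
}
```

For a 2048-token slot this gives 64 MiB. The true figure depends on model size, layer count, and cache quantization, so the result is best treated as a rough estimate.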
pub async fn estimate_cache_capacity(&self) -> Result<f32>
Estimate what fraction of the context window is filled (0.0–1.0).
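A minimal sketch of such a fill-fraction calculation follows, assuming the inputs are the number of cached tokens and the context window size. `context_fill_fraction` is a hypothetical name, and how the crate obtains those two numbers is not shown on this page.

```rust
/// Hypothetical helper: fraction of the context window in use, in 0.0..=1.0.
pub fn context_fill_fraction(tokens_used: usize, context_size: usize) -> f32 {
    if context_size == 0 {
        // Unknown or unset context size: report an empty cache
        // rather than dividing by zero.
        return 0.0;
    }
    // Clamp so a stale or oversized token count never reports more than 100%.
    (tokens_used as f32 / context_size as f32).clamp(0.0, 1.0)
}
```

A caller could compare the result against a threshold (for example 0.9) to decide when to evict or erase cached entries.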
Trait Implementations§
Auto Trait Implementations§
impl Freeze for LlamaKVCacheInterface
impl !RefUnwindSafe for LlamaKVCacheInterface
impl Send for LlamaKVCacheInterface
impl Sync for LlamaKVCacheInterface
impl Unpin for LlamaKVCacheInterface
impl UnsafeUnpin for LlamaKVCacheInterface
impl !UnwindSafe for LlamaKVCacheInterface
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.