pub struct ResponseCache { /* private fields */ }
LRU cache for LLM completion responses.
Thread-safe via parking_lot::Mutex (never held across .await).
Entries are keyed by FNV-1a hash of (system_prompt, messages, sorted tool names).
Uses a Vec with move-to-front on hit and eviction from the back, so each
access costs O(n) in the number of entries. This is efficient for typical
capacities (10–100). For very large caches (1000+), consider an alternative
implementation.
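The move-to-front scheme described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `VecLru` is a hypothetical name, values are plain `String`s standing in for `CompletionResponse`, `std::sync::Mutex` stands in for `parking_lot::Mutex` so the sketch has no dependencies, and treating a capacity of 0 as "cache nothing" is an assumption.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the real cache (illustration only).
pub struct VecLru {
    capacity: usize,
    // Front of the Vec = most recently used, back = least recently used.
    entries: Mutex<Vec<(u64, String)>>,
}

impl VecLru {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: Mutex::new(Vec::with_capacity(capacity)),
        }
    }

    /// O(n) linear scan; on a hit the entry is moved to the front.
    pub fn get(&self, key: u64) -> Option<String> {
        let mut entries = self.entries.lock().unwrap();
        let pos = entries.iter().position(|(k, _)| *k == key)?;
        let entry = entries.remove(pos);
        let value = entry.1.clone();
        entries.insert(0, entry);
        Some(value)
    }

    /// Inserts at the front; evicts from the back when the cache is full.
    pub fn put(&self, key: u64, value: String) {
        if self.capacity == 0 {
            return; // Assumption: a zero-capacity cache stores nothing.
        }
        let mut entries = self.entries.lock().unwrap();
        if let Some(pos) = entries.iter().position(|(k, _)| *k == key) {
            // Replace an existing entry rather than storing a duplicate key.
            entries.remove(pos);
        } else if entries.len() >= self.capacity {
            // Back of the Vec holds the least-recently-used entry.
            entries.pop();
        }
        entries.insert(0, (key, value));
    }
}
```

Because the lock guard is dropped at the end of each method, nothing in this sketch holds the mutex across an await point.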
parking_lot::Mutex is adopted on this hot path (every cached LLM call) for
~2× faster acquisition vs. std::sync::Mutex; see T2 in
tasks/performance-audit-heartbit-core-2026-05-06.md.
Implementations
impl ResponseCache
pub fn new(capacity: usize) -> Self
Create a new cache with the given maximum number of entries.
pub fn get(&self, key: u64) -> Option<CompletionResponse>
Look up a cached response by key. On hit, moves the entry to the front (LRU).
pub fn put(&self, key: u64, response: CompletionResponse)
Insert a response into the cache. Evicts the least-recently-used entry if at capacity.
pub fn compute_key(system_prompt: &str, messages: &[Message], tool_names: &[&str]) -> u64
Compute a cache key from the request components.
Uses FNV-1a hash of system prompt, serialized messages, and sorted tool names.
Retained for backward compatibility with single-tenant code. Prefer
ResponseCache::compute_key_scoped when the runner is shared across
tenants/users (F-AGENT-3).
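As a rough illustration of the hashing described above, the sketch below folds the three request components into a 64-bit FNV-1a hash. Everything beyond the FNV-1a constants is an assumption: `sketch_compute_key` is a hypothetical name, messages are taken as already-serialized strings rather than `Message` values, and the zero-byte separators between fields are a design choice of this sketch, not necessarily what the crate does.

```rust
// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Feeds `bytes` into a running FNV-1a hash: XOR each byte, then multiply.
fn fnv1a_update(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical key function (illustration only).
pub fn sketch_compute_key(
    system_prompt: &str,
    serialized_messages: &[&str],
    tool_names: &[&str],
) -> u64 {
    let mut hash = fnv1a_update(FNV_OFFSET_BASIS, system_prompt.as_bytes());
    // Assumption: a zero byte separates fields so that ("ab", "c") and
    // ("a", "bc") do not hash identically.
    hash = fnv1a_update(hash, &[0]);

    // Message order is significant, so messages are hashed as given.
    for message in serialized_messages {
        hash = fnv1a_update(hash, message.as_bytes());
        hash = fnv1a_update(hash, &[0]);
    }

    // Tool names are sorted first so that registration order is irrelevant.
    let mut sorted: Vec<&str> = tool_names.to_vec();
    sorted.sort_unstable();
    for name in sorted {
        hash = fnv1a_update(hash, name.as_bytes());
        hash = fnv1a_update(hash, &[0]);
    }
    hash
}
```

Sorting the tool names means two requests that list the same tools in a different order share a cache entry, while reordering messages still yields a different key.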
pub fn compute_key_scoped(system_prompt: &str, messages: &[Message], tool_names: &[&str], namespace: Option<&str>) -> u64
Compute a cache key including a tenant/user namespace.
SECURITY (F-AGENT-3): when a single AgentRunner is shared across
tenants (typical daemon deployment), the cache key MUST disambiguate
otherwise-identical requests. Without a namespace, tenant A's cached
response could be served to tenant B if their system_prompt + messages
happened to coincide. Pass Some("{tenant_id}:{user_id}") (or any unique
namespace string) to scope the cache.
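The effect of the namespace can be illustrated with a small self-contained sketch. It is not the crate's implementation: `sketch_scoped_key` and `tenant_namespace` are hypothetical helpers, the request is reduced to a single string, and both the tag byte and hashing the namespace before the request are assumptions made for this example.

```rust
// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

fn fnv1a_update(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical helper: builds the "{tenant_id}:{user_id}" namespace string.
pub fn tenant_namespace(tenant_id: &str, user_id: &str) -> String {
    format!("{tenant_id}:{user_id}")
}

/// Hypothetical scoped key (illustration only): the namespace is mixed
/// into the hash ahead of the request content.
pub fn sketch_scoped_key(request: &str, namespace: Option<&str>) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    if let Some(ns) = namespace {
        // Assumption: a tag byte marks the presence of a namespace, and a
        // zero byte terminates it so it cannot bleed into the request.
        hash = fnv1a_update(hash, &[1]);
        hash = fnv1a_update(hash, ns.as_bytes());
        hash = fnv1a_update(hash, &[0]);
    }
    fnv1a_update(hash, request.as_bytes())
}
```

With this scheme, two tenants sending byte-identical requests obtain different keys as long as their namespace strings differ, so neither can read the other's cached response.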