pub struct EmbeddingCache { /* private fields */ }
Unstable: internal LRU caching mechanism; shard count and eviction policy may change.
Embedding cache with sharded LRU eviction policy.
Thread-safe cache for storing computed embeddings. Uses Blake3 hashing for fast, collision-resistant cache keys. Internally sharded into 16 independent LRU caches to reduce write-lock contention.
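To illustrate why sharding reduces lock contention, here is a minimal std-only sketch. It is not the crate's implementation: `ShardedMap`, `shard_index`, and the choice of the first key byte for shard selection are assumptions made for illustration, and LRU bookkeeping is left out.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

const SHARDS: usize = 16; // shard count stated in the description above

/// Hypothetical stand-in for the cache: one lock-protected map per shard.
struct ShardedMap {
    shards: Vec<RwLock<HashMap<[u8; 32], Arc<[f32]>>>>,
}

impl ShardedMap {
    fn new() -> Self {
        Self {
            shards: (0..SHARDS).map(|_| RwLock::new(HashMap::new())).collect(),
        }
    }

    /// Assumption: the shard is picked from the first key byte. With a
    /// well-mixed hash as the key, any byte spreads entries evenly.
    fn shard_index(key: &[u8; 32]) -> usize {
        key[0] as usize % SHARDS
    }

    fn put(&self, key: [u8; 32], value: Vec<f32>) {
        // Only the owning shard is write-locked; the other shards stay available.
        let mut shard = self.shards[Self::shard_index(&key)].write().unwrap();
        shard.insert(key, Arc::from(value));
    }

    fn get(&self, key: &[u8; 32]) -> Option<Arc<[f32]>> {
        let shard = self.shards[Self::shard_index(key)].read().unwrap();
        shard.get(key).cloned()
    }
}
```

Two writers block each other only when their keys land in the same shard, which for uniformly distributed keys happens about one time in sixteen.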
§Disabling
Pass a capacity of 0 to disable caching. All cache operations then become no-ops
(no locking and no hashing work beyond key construction).
§Example
use lattice_embed::{EmbeddingCache, EmbeddingModel, ModelConfig};
use lattice_embed::service::EmbeddingRole;
let cache = EmbeddingCache::new(1000);
// Cache miss - no embedding stored yet
let key = cache.compute_key(
"Hello, world!",
ModelConfig::new(EmbeddingModel::BgeSmallEnV15),
EmbeddingRole::Generic,
);
assert!(cache.get(&key).is_none());
// Store embedding
let embedding = vec![0.1, 0.2, 0.3];
cache.put(key, embedding.clone());
// Cache hit - returns Arc<[f32]>
let cached = cache.get(&key).unwrap();
assert_eq!(&*cached, &embedding[..]);
§Implementations
impl EmbeddingCache
pub fn new(capacity: usize) -> Self
Unstable: constructor signature may change when shard count becomes configurable.
The capacity is divided equally across 16 internal shards. Each shard independently manages its own LRU eviction.
§Arguments
capacity - Maximum total number of embeddings to cache. Use 0 to disable caching.
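The arithmetic behind the per-shard split can be sketched as follows. The helper names are hypothetical, and floor division is an assumption: the description above does not say how the constructor rounds a capacity that is not a multiple of 16.

```rust
const SHARD_COUNT: usize = 16; // shard count stated above

/// Hypothetical helper: per-shard capacity, assuming floor division.
fn per_shard_capacity(total: usize) -> usize {
    total / SHARD_COUNT
}

/// Total number of entries the cache could hold under that assumption.
fn effective_capacity(total: usize) -> usize {
    per_shard_capacity(total) * SHARD_COUNT
}
```

Under this assumption a requested capacity of 1000 gives 62 entries per shard. Because each shard evicts independently, a shard that receives more than its share of keys starts evicting while other shards still have free slots, so the cache can evict before it holds `capacity` entries in total.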
pub fn with_default_capacity() -> Self
Unstable: convenience constructor; subject to change with cache redesign.
pub fn compute_key(&self, text: &str, model_config: ModelConfig, role: EmbeddingRole) -> [u8; 32]
Unstable: key scheme (Blake3 + EmbeddingKey canonical bytes) may change; don’t store keys across sessions.
Uses Blake3 hashing for fast, collision-resistant keys. The key includes the model
name, revision, and active dimension from the ModelConfig, so different MRL truncations
produce different cache keys.
The role is also included so that embed_query("hello") and embed_passage("hello")
produce different cache entries even when the raw text and model config are identical.
Use EmbeddingRole::Generic for the backwards-compatible embed() path.
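The effect of including the role and the active dimension in the key can be shown with a small std-only sketch. Everything here is a stand-in: `KeyParts`, `Role`, `parts`, and `sketch_key` are hypothetical names, the `Query` and `Passage` variants are assumed from the `embed_query` and `embed_passage` paths, the model name and revision strings are placeholders, and std's `DefaultHasher` (64-bit, not collision-resistant) replaces Blake3.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical role enum; only `Generic` is named in the docs above.
#[derive(Hash, Clone, Copy)]
enum Role {
    Generic,
    Query,
    Passage,
}

/// Hypothetical mirror of the fields the docs say feed into the key.
#[derive(Hash)]
struct KeyParts<'a> {
    text: &'a str,
    model_name: &'a str,
    revision: &'a str,
    active_dim: usize,
    role: Role,
}

/// Builds key parts with placeholder model name and revision.
fn parts(text: &str, active_dim: usize, role: Role) -> KeyParts<'_> {
    KeyParts {
        text,
        model_name: "bge-small-en-v1.5", // placeholder value
        revision: "main",                // placeholder value
        active_dim,
        role,
    }
}

/// Stand-in digest using std's DefaultHasher instead of Blake3.
fn sketch_key(parts: &KeyParts<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    parts.hash(&mut hasher);
    hasher.finish()
}
```

Hashing every field means that changing only the role, or only the active dimension, yields a different key, so a query embedding is never served for a passage lookup and a truncated embedding is never served for a full-size one.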
pub fn get(&self, key: &[u8; 32]) -> Option<Arc<[f32]>>
Unstable: return type (Arc<[f32]>) may change to a newtype; internal cache API.
Returns Some(Arc<[f32]>) if found (cheap refcount bump), None otherwise.
Updates per-shard hit/miss counters for metrics.
pub fn put(&self, key: [u8; 32], embedding: Vec<f32>)
Unstable: internal cache storage method; interface may change.
Converts the Vec into Arc<[f32]> for shared-ownership storage.
If the shard is at capacity, its least recently used entry is evicted.
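The storage and eviction behaviour described here can be sketched for a single shard. `LruShard` is a hypothetical type, not the crate's; it uses a `VecDeque` with O(n) recency updates for brevity, where a production LRU would use a constant-time structure.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical single shard: a tiny LRU with O(n) recency updates.
struct LruShard {
    capacity: usize,
    map: HashMap<[u8; 32], Arc<[f32]>>,
    order: VecDeque<[u8; 32]>, // front = least recently used
}

impl LruShard {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Moves `key` to the most-recently-used position.
    fn touch(&mut self, key: &[u8; 32]) {
        self.order.retain(|k| k != key);
        self.order.push_back(*key);
    }

    fn put(&mut self, key: [u8; 32], embedding: Vec<f32>) {
        if self.capacity == 0 {
            return; // zero capacity: storing is a no-op
        }
        // A new key arriving at a full shard evicts the least recently used entry.
        if !self.map.contains_key(&key) && self.map.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        // Vec<f32> -> Arc<[f32]>: one allocation now, shared ownership afterwards.
        self.map.insert(key, Arc::from(embedding));
        self.touch(&key);
    }

    fn get(&mut self, key: &[u8; 32]) -> Option<Arc<[f32]>> {
        // Cloning the Arc bumps a refcount; the f32 data is not copied.
        let hit = self.map.get(key).cloned();
        if hit.is_some() {
            self.touch(key);
        }
        hit
    }
}
```

Note that `get` needs mutable access because a hit changes the recency order, which is one reason an LRU cache takes a write lock even on reads.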
pub fn get_many(&self, keys: &[[u8; 32]]) -> Vec<Option<Arc<[f32]>>>
Unstable: batch cache access; return type may change with cache redesign.
Returns a vector of Option<Arc<[f32]>> for each key, in the same order.
Each hit is an O(1) refcount bump (no data copy).
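Because results come back in input order, a caller can find which inputs still need embedding by position. The sketch below shows that pattern against a plain `HashMap` stand-in; `lookup_many` and `miss_positions` are hypothetical helpers, not part of this API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

type Key = [u8; 32];

/// Stand-in for a batch lookup: one slot per key, in input order.
fn lookup_many(store: &HashMap<Key, Arc<[f32]>>, keys: &[Key]) -> Vec<Option<Arc<[f32]>>> {
    keys.iter().map(|k| store.get(k).cloned()).collect()
}

/// Positions of the keys that missed, so only those inputs need embedding.
fn miss_positions(results: &[Option<Arc<[f32]>>]) -> Vec<usize> {
    results
        .iter()
        .enumerate()
        .filter(|(_, slot)| slot.is_none())
        .map(|(i, _)| i)
        .collect()
}
```

With the real cache, the same positions would index back into the original texts; the freshly computed embeddings for those positions can then be stored in one batch call.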
pub fn put_many(&self, entries: Vec<([u8; 32], Vec<f32>)>)
Unstable: batch cache storage; interface may change with cache redesign.
Converts each Vec into Arc<[f32]> for shared-ownership storage.
pub fn stats(&self) -> CacheStats
Unstable: returns CacheStats which is itself Unstable; metrics shape may evolve.
Aggregates per-shard counters. The size field is the sum of all shard sizes.
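The aggregation amounts to summing each counter across shards. The sketch below uses hypothetical `ShardCounters` and `Totals` types, since the real `ShardStats` and `CacheStats` shapes are unstable and may differ; `hit_rate` is an illustrative derived metric, not a documented field.

```rust
/// Hypothetical per-shard counters.
#[derive(Clone, Copy)]
struct ShardCounters {
    size: usize,
    hits: u64,
    misses: u64,
}

/// Hypothetical aggregate over all shards.
#[derive(Debug, PartialEq)]
struct Totals {
    size: usize,
    hits: u64,
    misses: u64,
}

/// Sums size, hits, and misses across shards.
fn aggregate(shards: &[ShardCounters]) -> Totals {
    shards.iter().fold(Totals { size: 0, hits: 0, misses: 0 }, |acc, s| Totals {
        size: acc.size + s.size,
        hits: acc.hits + s.hits,
        misses: acc.misses + s.misses,
    })
}

/// Fraction of lookups that hit, or None when no lookups were recorded.
fn hit_rate(totals: &Totals) -> Option<f64> {
    let lookups = totals.hits + totals.misses;
    if lookups == 0 {
        None
    } else {
        Some(totals.hits as f64 / lookups as f64)
    }
}
```

Returning `None` for an unused cache avoids reporting a misleading 0% hit rate before any lookups have happened.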
pub fn per_shard_stats(&self) -> Vec<ShardStats>
Unstable: internal monitoring hook; shard count and ShardStats shape may change.
Returns one ShardStats entry per shard, each holding that shard's (size, hits, misses).
pub fn clear(&self)
Unstable: internal cache management; may be removed in favor of capacity-based eviction.
pub fn is_enabled(&self) -> bool
Unstable: internal state query; may be removed when zero-capacity is the only disable path.