Crate tensor_cache

Tensor-native LLM response cache - Module 10 of Neumann

Tensor-native semantic caching for LLM responses with exact and semantic matching, cost tracking, and background eviction.

§Architecture

Uses TensorStore as its backing store, aligning with the tensor-native paradigm used by tensor_vault and tensor_blob. Cache entries are stored as TensorData with standardized field prefixes.

§Cache Layers

Exact Cache: O(1) hash-based lookup for identical queries
Semantic Cache: O(log n) HNSW-based similarity search
Embedding Cache: O(1) cached embeddings for queries
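
The layering above can be illustrated with a small standalone sketch. It does not use this crate's types: `TwoTierCache`, its fields, and the similarity threshold are hypothetical names introduced for illustration, and the semantic layer is a brute-force O(n) cosine scan standing in for the HNSW index.

```rust
use std::collections::HashMap;

/// Hypothetical two-layer cache: exact match first, then a similarity fallback.
/// The semantic layer here is a linear scan, NOT an HNSW index.
pub struct TwoTierCache {
    exact: HashMap<String, String>,
    entries: Vec<(Vec<f32>, String)>,
    threshold: f32, // assumed minimum cosine similarity for a semantic hit
}

impl TwoTierCache {
    pub fn new(threshold: f32) -> Self {
        Self { exact: HashMap::new(), entries: Vec::new(), threshold }
    }

    /// Stores the response under both the exact prompt and its embedding.
    pub fn put(&mut self, prompt: &str, embedding: &[f32], response: &str) {
        self.exact.insert(prompt.to_string(), response.to_string());
        self.entries.push((embedding.to_vec(), response.to_string()));
    }

    /// Tries the exact layer, then the semantic layer if an embedding is given.
    pub fn get(&self, prompt: &str, embedding: Option<&[f32]>) -> Option<&str> {
        if let Some(r) = self.exact.get(prompt) {
            return Some(r.as_str());
        }
        let query = embedding?;
        let mut best: Option<(f32, &str)> = None;
        for (vec, resp) in &self.entries {
            let sim = cosine_similarity(query, vec);
            if sim >= self.threshold && best.map_or(true, |(s, _)| sim > s) {
                best = Some((sim, resp.as_str()));
            }
        }
        best.map(|(_, r)| r)
    }
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}
```

Checking the exact layer first keeps repeated identical prompts on the cheap hash path; the similarity search only runs on an exact miss and only when the caller supplies an embedding.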

§Example

use tensor_cache::{Cache, CacheConfig};

// Configure cache with 3-dimensional embeddings
let mut config = CacheConfig::default();
config.embedding_dim = 3;
let cache = Cache::with_config(config).unwrap();

// Store a response
let embedding = vec![0.1, 0.2, 0.3];
cache
    .put("What is 2+2?", &embedding, "4", "gpt-4", None)
    .unwrap();

// Look up (tries exact first, then semantic)
if let Some(hit) = cache.get("What is 2+2?", Some(&embedding)) {
    println!("Cached: {}", hit.response);
}

Structs§

Cache: LLM response cache with tensor-native storage.
CacheConfig: Cache configuration with capacity limits, TTL, and eviction settings.
CacheHit: Result of a successful cache lookup.
CacheStats: Thread-safe cache statistics with atomic counters.
EvictionManager: Manages background eviction for cache maintenance.
EvictionScorer: Calculates eviction priority scores. Lower scores are evicted first.
ModelPricing: Model pricing (cost per 1000 tokens in dollars).
SparseVector: A vector that only stores non-zero values.
StatsSnapshot: Point-in-time snapshot of cache statistics for reporting.
TokenCounter: Token counter using tiktoken’s cl100k_base encoding (GPT-4, GPT-3.5-turbo, ada-002). Falls back to character-based estimation (~4 characters per token) if tiktoken is unavailable.
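
As a rough illustration of how the character-based fallback and per-1000-token pricing combine into a cost estimate, here is a standalone sketch. `estimate_tokens`, `Pricing`, and `estimate_cost` are hypothetical names, not this crate's API; rounding up and the split into input and output prices are assumptions.

```rust
/// Hypothetical fallback estimate: about 4 characters per token.
/// Rounding up is an assumption; the crate may round differently.
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Hypothetical pricing record: dollars per 1000 tokens.
pub struct Pricing {
    pub input_per_1k: f64,
    pub output_per_1k: f64,
}

/// Dollar cost of a call with the given token counts.
pub fn estimate_cost(p: &Pricing, input_tokens: usize, output_tokens: usize) -> f64 {
    (input_tokens as f64 / 1000.0) * p.input_per_1k
        + (output_tokens as f64 / 1000.0) * p.output_per_1k
}
```

A cache hit avoids the call entirely, so the same figure can be read as the estimated saving for that hit. Any prices passed in are the caller's own; none are implied here.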

Enums§

CacheError
CacheLayer
DistanceMetric: Distance metric for vector similarity/distance computation.
EvictionStrategy: Eviction strategy for cache entries.
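
The "lower scores are evicted first" rule of EvictionScorer can be sketched as follows. The scoring formula, `EntryMeta`, and `pick_victims` are hypothetical and chosen only for illustration; the crate's actual strategies and weights are not shown on this page.

```rust
/// Hypothetical per-entry metadata used for scoring.
#[derive(Debug, Clone)]
pub struct EntryMeta {
    pub key: String,
    pub hits: u64,
    pub age_secs: f64, // seconds since last access
}

/// Assumed score: frequent hits raise it, staleness lowers it.
pub fn score(e: &EntryMeta) -> f64 {
    (1.0 + e.hits as f64) / (1.0 + e.age_secs)
}

/// Returns the keys of the `n` lowest-scoring entries, lowest first.
pub fn pick_victims(entries: &[EntryMeta], n: usize) -> Vec<String> {
    let mut scored: Vec<(f64, &EntryMeta)> =
        entries.iter().map(|e| (score(e), e)).collect();
    // total_cmp gives a total order over f64, so sorting cannot panic on NaN.
    scored.sort_by(|a, b| a.0.total_cmp(&b.0));
    scored.into_iter().take(n).map(|(_, e)| e.key.clone()).collect()
}
```

A background task could call something like `pick_victims` whenever the cache exceeds its capacity limit and delete the returned keys.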

Type Aliases§

Result
