Expand description
LRU-K adjacency cache for graph traversal optimization.
This module provides a traversal-aware caching system designed specifically for graph workloads. It uses an LRU-K eviction policy with K=2, which distinguishes between sequential access patterns (graph traversals) and random access (node lookups).
§Cache Design
§LRU-K Eviction (K=2)
Traditional LRU evicts based on the most recent access, which can lead to poor cache behavior for graph traversals where nodes are visited in predictable patterns. LRU-K tracks the K most recent accesses and uses the K-th most recent access (correlated reference) for eviction decisions:
- K=1: Traditional LRU (evicts least recently used)
- K=2: Correlated reference frequency (ideal for traversals)
- K>2: Too much overhead for marginal benefit
For K=2, nodes revisited during a traversal carry a recent second-to-last access timestamp and survive eviction, while one-off lookups never acquire one and are evicted quickly.
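The eviction rule above can be sketched as follows. This is a minimal, hypothetical illustration of LRU-2 bookkeeping, not this module's actual internals: each key remembers its two most recent access times, and the victim is the key whose second-most-recent access is oldest.

```rust
use std::collections::HashMap;

// Minimal LRU-2 sketch (hypothetical; not this crate's internals).
// Keys touched only once have no second timestamp and are evicted first.
struct Lru2<K> {
    // (most recent access, second-most-recent access)
    history: HashMap<K, (u64, Option<u64>)>,
    clock: u64,
}

impl<K: std::hash::Hash + Eq> Lru2<K> {
    fn new() -> Self {
        Lru2 { history: HashMap::new(), clock: 0 }
    }

    fn touch(&mut self, key: K) {
        self.clock += 1;
        let now = self.clock;
        self.history
            .entry(key)
            .and_modify(|(last, prev)| {
                *prev = Some(*last); // old "most recent" becomes the K=2 reference
                *last = now;
            })
            .or_insert((now, None));
    }

    // Victim = smallest second-most-recent timestamp; a missing second access
    // sorts first, so one-off lookups are evicted before revisited nodes.
    fn victim(&self) -> Option<&K> {
        self.history
            .iter()
            .min_by_key(|(_, (_, prev))| prev.unwrap_or(0))
            .map(|(k, _)| k)
    }
}

fn main() {
    let mut lru = Lru2::new();
    lru.touch("a"); // traversal visits a
    lru.touch("b"); // one-off lookup of b
    lru.touch("a"); // traversal revisits a
    // "b" was only touched once, so it is the eviction candidate.
    assert_eq!(lru.victim(), Some(&"b"));
}
```

Note how the revisited key survives purely because its second timestamp exists; no frequency counters are needed.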
§Traversal Score Tracking
The cache tracks traversal patterns by computing a score for each node:
- Sequential hits (node visited again in same traversal) → higher score
- Random access (isolated lookups) → lower score
- High-degree nodes protected from eviction (cache pinning)
This prevents cache pollution from random queries while preserving traversal working sets.
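A hedged sketch of the scoring idea described above; the constant values, field names, and methods here are illustrative assumptions, not the crate's API:

```rust
// Illustrative traversal-score bookkeeping; thresholds and weights are
// assumptions, not values taken from the crate.
const PIN_DEGREE: usize = 64; // nodes at or above this degree are pinned
const SEQ_BONUS: f64 = 1.0;   // reward for another hit in the same traversal
const DECAY: f64 = 0.5;       // decay applied on an isolated random access

struct NodeScore {
    score: f64,
    degree: usize,
}

impl NodeScore {
    fn new(degree: usize) -> Self {
        NodeScore { score: 0.0, degree }
    }
    fn on_sequential_hit(&mut self) {
        self.score += SEQ_BONUS; // sequential hits raise the score
    }
    fn on_random_access(&mut self) {
        self.score *= DECAY; // isolated lookups decay it
    }
    // Pinned (high-degree) nodes never become eviction candidates.
    fn evictable(&self) -> bool {
        self.degree < PIN_DEGREE
    }
}

fn main() {
    let mut n = NodeScore::new(8);
    n.on_sequential_hit();
    n.on_sequential_hit();
    n.on_random_access();
    assert_eq!(n.score, 1.0);
    assert!(n.evictable());
    assert!(!NodeScore::new(128).evictable()); // high-degree node stays pinned
}
```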
§Cache Invalidation Policies
§Insert-Based Invalidation
When edges are inserted, affected adjacency lists are automatically invalidated:
graph.insert_edge(node1, "CONNECTS", node2, vec![])?;
// Cache for node1 and node2 is automatically invalidated
This ensures read-after-write consistency: you never see stale edges after inserting new ones.
§Manual Invalidation
For advanced use cases, manual cache control is available:
use sqlitegraph::cache::AdjacencyCache;
let cache = AdjacencyCache::new();
// Clear all cache entries
cache.clear();
// Remove specific node from cache
cache.remove(node_id);
§Performance Characteristics
§Cache Hit Ratio
For BFS/DFS traversal workloads:
- DFS: 95%+ expected hit ratio
- BFS: 80-95%, depending on branching factor
- Random access: 20-40% (limited benefit for pure lookups)
§Memory Overhead
- Per-entry: O(degree) for stored adjacency lists
- Metadata: ~32 bytes per entry (timestamps, scores)
- Total overhead: ~10-20% additional memory vs uncached
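Using the figures above, memory use can be estimated back-of-envelope; the 8-byte neighbor id is an assumption for illustration:

```rust
// Rough cache-size estimate from the numbers above: ~32 bytes of metadata
// per entry plus O(degree) stored neighbor ids (assumed 8 bytes each).
fn cache_bytes(entries: usize, avg_degree: usize) -> usize {
    const METADATA: usize = 32;   // timestamps + score per entry
    const NEIGHBOR_ID: usize = 8; // one i64 node id per neighbor
    entries * (METADATA + avg_degree * NEIGHBOR_ID)
}

fn main() {
    // 100,000 cached nodes with average degree 10 ≈ 11.2 MB
    assert_eq!(cache_bytes(100_000, 10), 11_200_000);
}
```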
§Latency Impact
- Cache hit: ~100ns (hash lookup + clone)
- Cache miss: ~10-100μs (SQLite query + cache insert)
- Speedup: 100-1000x for cached accesses
§When to Use Cache
§Good Cache Workloads
- Graph traversals: BFS, DFS, k-hop queries
- Shortest path: Repeated neighbor access
- PageRank: Repeated neighbor access on every iteration
- Community detection: Repeated local queries
§Poor Cache Workloads
- Random node lookups: No repeated access pattern
- Write-heavy workloads: Frequent invalidation
- One-shot queries: No benefit for single accesses
- Large graphs with low locality: Cache thrashing
§Usage Examples
The cache is automatically enabled in SqliteGraph for traversal operations:
use sqlitegraph::{SqliteGraph, algo::connected_components};
let graph = SqliteGraph::open("my_graph.db")?;
// BFS traversal uses cache automatically
let neighbors = graph.fetch_outgoing(node_id)?;
// Algorithm with repeated traversal benefits from cache
let components = connected_components(&graph)?;
For custom caching, use AdjacencyCache directly:
use sqlitegraph::cache::AdjacencyCache;
let cache = AdjacencyCache::new();
// Cache miss - loads from database
let neighbors = cache.get(node_id)
.unwrap_or_else(|| load_neighbors_from_db(node_id));
// Cache hit - returns stored neighbors
let neighbors = cache.get(node_id)?;