cache_rs/gdsf.rs
//! Greedy Dual-Size Frequency (GDSF) Cache Implementation
//!
//! GDSF is a sophisticated cache replacement algorithm designed for **variable-sized objects**.
//! It combines object size, access frequency, and aging into a unified priority formula,
//! making it ideal for CDN caches, image caches, and any scenario where cached objects
//! have different sizes.
//!
//! # How the Algorithm Works
//!
//! GDSF calculates priority for each cached item using the formula:
//!
//! ```text
//! Priority = (Frequency / Size) + GlobalAge
//! ```
//!
//! This formula balances three factors:
//! - **Frequency**: More frequently accessed items get higher priority
//! - **Size**: Smaller items are favored (more items fit in cache)
//! - **Aging**: Prevents once-popular items from staying forever
//!
//! ## Mathematical Formulation
//!
//! ```text
//! For each cache entry i:
//!   - F_i = access frequency of item i
//!   - S_i = size of item i (in bytes)
//!   - GlobalAge = increases on eviction (set to the evicted item's priority)
//!   - Priority_i = (F_i / S_i) + GlobalAge
//!
//! On eviction: select item j where Priority_j = min{Priority_i for all i}
//! ```
//!
//! ## Why Size Matters
//!
//! Consider a cache with 10KB capacity:
//! - Option A: Cache one 10KB file accessed 10 times → Priority = 10/10000 = 0.001
//! - Option B: Cache ten 1KB files accessed once each → Each Priority = 1/1000 = 0.001
//!
//! GDSF recognizes that caching many small frequently-accessed items often yields
//! better hit rates than caching fewer large items.
//!
//! ## Data Structure
//!
//! ```text
//! ┌───────────────────────────────────────────────────────────────────────┐
//! │ GDSF Cache (global_age=1.5, max_size=10MB)                            │
//! │                                                                       │
//! │  HashMap<K, *Node>       BTreeMap<Priority, List>                     │
//! │  ┌──────────────┐        ┌─────────────────────────────────────────┐  │
//! │  │ "icon.png" ───────────│ pri=3.5: [icon.png]  (f=2,  s=1KB)      │  │
//! │  │ "thumb.jpg" ──────────│ pri=2.5: [thumb.jpg] (f=1,  s=1KB)      │  │
//! │  │ "video.mp4" ──────────│ pri=1.5: [video.mp4] (f=10, s=100MB) ←E │  │
//! │  └──────────────┘        └─────────────────────────────────────────┘  │
//! │                                              ▲                        │
//! │                                              │                        │
//! │                                     min_priority=1.5                  │
//! │                                                                       │
//! │  Note: video.mp4 has high frequency (10) but large size (100MB),     │
//! │        so its priority = 10/100000000 + 1.5 ≈ 1.5 (eviction candidate)│
//! └───────────────────────────────────────────────────────────────────────┘
//! ```
//!
//! ## Operations
//!
//! | Operation | Action | Time |
//! |-----------|--------|------|
//! | `get(key)` | Increment frequency, recalculate priority | O(log P) |
//! | `put(key, value, size)` | Insert with priority = (1/size) + age | O(log P) |
//! | `remove(key)` | Remove from priority list, update min_priority | O(log P) |
//!
//! ## Size-Aware Example
//!
//! ```text
//! Cache: max_size=5KB, global_age=0 (sizes in KB for the arithmetic below)
//!
//! put("small.txt", data, 1KB)  → pri=1.0,  total_size=1KB
//! put("medium.txt", data, 2KB) → pri=0.5,  total_size=3KB
//! put("large.txt", data, 3KB)  → pri=0.33, total_size=6KB  OVERFLOW!
//!
//! Eviction needed: evict "large.txt" (lowest priority=0.33)
//! global_age = 0.33
//!
//! put("large.txt", data, 3KB)  → pri=0.33+0.33=0.66, total_size=6KB  OVERFLOW!
//!
//! Evict again: now "medium.txt" has the lowest priority (0.5 < 0.66)
//! Result: small.txt + large.txt fit in 4KB
//! ```
//!
//! # Dual-Limit Capacity
//!
//! GDSF naturally works with size-based limits:
//!
//! - **`capacity`**: Maximum number of entries (prevents too many tiny items)
//! - **`max_size`**: Maximum total size in bytes (the primary constraint for GDSF)
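//!
//! A minimal sketch of the size limit in action (tiny sizes chosen only to keep the
//! arithmetic obvious):
//!
//! ```
//! use cache_rs::GdsfCache;
//! use cache_rs::config::GdsfCacheConfig;
//! use core::num::NonZeroUsize;
//!
//! let config = GdsfCacheConfig {
//!     capacity: NonZeroUsize::new(10).unwrap(),
//!     initial_age: 0.0,
//!     max_size: 3, // total content bytes
//! };
//! let mut cache: GdsfCache<&str, i32> = GdsfCache::init(config, None);
//!
//! assert_eq!(cache.put("a", 1, 2), None);
//! // Adding 2 more bytes would exceed max_size=3, so "a" is evicted and returned.
//! assert_eq!(cache.put("b", 2, 2), Some(vec![("a", 1)]));
//! assert_eq!(cache.current_size(), 2);
//! ```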
//!
//! # Performance Characteristics
//!
//! | Metric | Value |
//! |--------|-------|
//! | Get | O(log P) |
//! | Put | O(log P) amortized |
//! | Remove | O(log P) |
//! | Memory per entry | ~120 bytes overhead + key×2 + value |
//!
//! Where P = number of distinct priority buckets. Priority = (frequency/size) + age,
//! quantized to integer bucket keys (priority × 1000). BTreeMap provides O(log P) lookups.
//!
//! Higher overhead than simpler algorithms due to priority calculation and
//! BTreeMap-based priority lists.
//!
//! # When to Use GDSF
//!
//! **Good for:**
//! - CDN and proxy caches with variable object sizes
//! - Image thumbnail caches
//! - API response caches with varying payload sizes
//! - File system caches
//! - Any size-constrained cache with heterogeneous objects
//!
//! **Not ideal for:**
//! - Uniform-size objects (simpler algorithms work equally well)
//! - Entry-count-constrained caches (LRU/LFU are simpler)
//! - Very small caches (the overhead is not justified)
//!
//! # GDSF vs Other Algorithms
//!
//! | Aspect | LRU | LFU | GDSF |
//! |--------|-----|-----|------|
//! | Size-aware | No | No | **Yes** |
//! | Frequency-aware | No | Yes | Yes |
//! | Aging | Implicit | No | Yes |
//! | Best for | Recency | Frequency | Variable-size objects |
//!
//! # Thread Safety
//!
//! `GdsfCache` is **not** thread-safe. For concurrent access, either:
//! - Wrap it in a `Mutex` or `RwLock`
//! - Use `ConcurrentGdsfCache` (requires the `concurrent` feature)
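//!
//! A minimal sketch of the `Mutex` option (shown single-threaded just to illustrate
//! the shape; each `Arc` clone can be moved to another thread):
//!
//! ```
//! use cache_rs::GdsfCache;
//! use cache_rs::config::GdsfCacheConfig;
//! use core::num::NonZeroUsize;
//! use std::sync::{Arc, Mutex};
//!
//! let config = GdsfCacheConfig {
//!     capacity: NonZeroUsize::new(10).unwrap(),
//!     initial_age: 0.0,
//!     max_size: u64::MAX,
//! };
//! let cache = Arc::new(Mutex::new(GdsfCache::<String, i32>::init(config, None)));
//!
//! let handle = Arc::clone(&cache);
//! handle.lock().unwrap().put("k".to_string(), 1, 1);
//! assert_eq!(cache.lock().unwrap().get(&"k".to_string()), Some(1));
//! ```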
//!
//! # Examples
//!
//! ## Basic Usage
//!
//! ```
//! use cache_rs::GdsfCache;
//! use cache_rs::config::GdsfCacheConfig;
//! use core::num::NonZeroUsize;
//!
//! // Create a cache with at most 1000 entries and a 10MB size limit
//! let config = GdsfCacheConfig {
//!     capacity: NonZeroUsize::new(1000).unwrap(),
//!     initial_age: 0.0,
//!     max_size: 10 * 1024 * 1024, // 10MB
//! };
//! let mut cache: GdsfCache<String, Vec<u8>> = GdsfCache::init(config, None);
//!
//! // Insert with explicit size tracking
//! let small_data = vec![0u8; 1024]; // 1KB
//! cache.put("small.txt".to_string(), small_data, 1024);
//!
//! let large_data = vec![0u8; 1024 * 1024]; // 1MB
//! cache.put("large.bin".to_string(), large_data, 1024 * 1024);
//!
//! // Small items get higher priority per byte
//! assert!(cache.get(&"small.txt".to_string()).is_some());
//! ```
//!
//! ## CDN-Style Caching
//!
//! ```
//! use cache_rs::GdsfCache;
//! use cache_rs::config::GdsfCacheConfig;
//! use core::num::NonZeroUsize;
//!
//! // 100MB cache for web assets
//! let config = GdsfCacheConfig {
//!     capacity: NonZeroUsize::new(10000).unwrap(),
//!     initial_age: 0.0,
//!     max_size: 100 * 1024 * 1024,
//! };
//! let mut cache: GdsfCache<String, Vec<u8>> = GdsfCache::init(config, None);
//!
//! // Cache various asset types with their sizes
//! fn cache_asset(cache: &mut GdsfCache<String, Vec<u8>>, url: &str, data: Vec<u8>) {
//!     let size = data.len() as u64;
//!     cache.put(url.to_string(), data, size);
//! }
//!
//! // Small, frequently-accessed assets get priority over large, rarely-used ones
//! ```

extern crate alloc;

use crate::config::GdsfCacheConfig;
use crate::entry::CacheEntry;
use crate::list::{List, ListEntry};
use crate::metrics::{CacheMetrics, GdsfCacheMetrics};

/// Metadata for GDSF (Greedy Dual-Size Frequency) cache entries.
///
/// GDSF is a sophisticated algorithm that considers:
/// - **Frequency**: How often the item is accessed
/// - **Size**: Larger items have lower priority per byte
/// - **Aging**: A global clock advances when items are evicted
///
/// # Priority Calculation
///
/// ```text
/// priority = (frequency / size) + global_age
/// ```
///
/// Items with lower priority are evicted first.
///
/// # Examples
///
/// ```
/// use cache_rs::gdsf::GdsfMeta;
///
/// let meta = GdsfMeta::new(1, 0.5); // frequency=1, priority=0.5
/// assert_eq!(meta.frequency, 1);
/// assert_eq!(meta.priority, 0.5);
/// ```
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct GdsfMeta {
    /// Access frequency count.
    pub frequency: u64,

    /// Calculated priority: (frequency / size) + clock.
    /// Lower priority = more likely to be evicted.
    pub priority: f64,
}

impl GdsfMeta {
    /// Creates a new GDSF metadata value with the specified frequency and priority.
    ///
    /// # Arguments
    ///
    /// * `frequency` - Initial frequency count
    /// * `priority` - Calculated priority value
    #[inline]
    pub fn new(frequency: u64, priority: f64) -> Self {
        Self {
            frequency,
            priority,
        }
    }

    /// Increments the frequency counter and returns the new value.
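    ///
    /// # Example
    ///
    /// ```
    /// use cache_rs::gdsf::GdsfMeta;
    ///
    /// let mut meta = GdsfMeta::new(1, 0.0);
    /// assert_eq!(meta.increment(), 2);
    /// assert_eq!(meta.frequency, 2);
    /// ```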
    #[inline]
    pub fn increment(&mut self) -> u64 {
        self.frequency += 1;
        self.frequency
    }

    /// Calculates and updates the priority based on frequency, size, and global age.
    ///
    /// # Arguments
    ///
    /// * `size` - Size of the cached item
    /// * `global_age` - Current global age (clock value) of the cache
    ///
    /// # Returns
    ///
    /// The newly calculated priority value.
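    ///
    /// Both the normal path and the zero-size guard are observable:
    ///
    /// # Example
    ///
    /// ```
    /// use cache_rs::gdsf::GdsfMeta;
    ///
    /// let mut meta = GdsfMeta::new(2, 0.0);
    /// assert_eq!(meta.calculate_priority(4, 1.0), 1.5); // (2 / 4) + 1.0
    /// assert_eq!(meta.calculate_priority(0, 1.0), f64::INFINITY); // zero size is guarded
    /// ```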
    #[inline]
    pub fn calculate_priority(&mut self, size: u64, global_age: f64) -> f64 {
        self.priority = if size == 0 {
            f64::INFINITY
        } else {
            (self.frequency as f64 / size as f64) + global_age
        };
        self.priority
    }
}

use alloc::boxed::Box;
use alloc::collections::BTreeMap;
use alloc::string::String;
use alloc::vec::Vec;
use core::borrow::Borrow;
use core::hash::{BuildHasher, Hash};
use core::num::NonZeroUsize;

#[cfg(feature = "hashbrown")]
use hashbrown::DefaultHashBuilder;
#[cfg(feature = "hashbrown")]
use hashbrown::HashMap;

#[cfg(not(feature = "hashbrown"))]
use std::collections::hash_map::RandomState as DefaultHashBuilder;
#[cfg(not(feature = "hashbrown"))]
use std::collections::HashMap;

/// Internal GDSF segment containing the actual cache algorithm.
///
/// Uses `CacheEntry<K, V, GdsfMeta>` as the unified entry type. The map stores
/// raw pointers to list nodes, and all entry data (key, value, size, metadata)
/// is stored in the `CacheEntry`.
pub(crate) struct GdsfSegment<K, V, S = DefaultHashBuilder> {
    config: GdsfCacheConfig,
    global_age: f64,
    min_priority: f64,
    /// Maps keys to node pointers. The node contains a `CacheEntry` with all data.
    map: HashMap<K, *mut ListEntry<CacheEntry<K, V, GdsfMeta>>, S>,
    /// Priority lists: the key is `(priority * 1000) as u64` for BTreeMap ordering.
    priority_lists: BTreeMap<u64, List<CacheEntry<K, V, GdsfMeta>>>,
    metrics: GdsfCacheMetrics,
    /// Current total size of cached content (sum of entry sizes).
    current_size: u64,
}
// SAFETY: GdsfSegment owns all data; the raw pointers only point to nodes owned by
// `priority_lists`, so sending the segment moves ownership of everything with it.
unsafe impl<K: Send, V: Send, S: Send> Send for GdsfSegment<K, V, S> {}

// SAFETY: All mutation requires `&mut self`. Shared references only allow reads
// (`peek`, `contains`, ...), which hand out `&K`/`&V`, so those types must be `Sync`.
unsafe impl<K: Sync, V: Sync, S: Sync> Sync for GdsfSegment<K, V, S> {}
317
318impl<K: Hash + Eq, V: Clone, S: BuildHasher> GdsfSegment<K, V, S> {
319 /// Creates a new GDSF segment from a configuration.
320 ///
321 /// This is the **only** way to create a GDSF segment. All configuration
322 /// is specified through the [`GdsfCacheConfig`] struct.
323 ///
324 /// # Arguments
325 ///
326 /// * `config` - Configuration specifying capacity, initial age, and optional size limit
327 /// * `hasher` - Hash builder for the internal HashMap
328 #[allow(dead_code)] // Used by concurrent module when feature is enabled
329 pub(crate) fn init(config: GdsfCacheConfig, hasher: S) -> Self {
330 let map_capacity = config.capacity.get().next_power_of_two();
331 GdsfSegment {
332 global_age: config.initial_age,
333 min_priority: 0.0,
334 map: HashMap::with_capacity_and_hasher(map_capacity, hasher),
335 priority_lists: BTreeMap::new(),
336 metrics: GdsfCacheMetrics::new(config.max_size),
337 current_size: 0,
338 config,
339 }
340 }
341
342 #[inline]
343 pub(crate) fn cap(&self) -> NonZeroUsize {
344 self.config.capacity
345 }
346
347 #[inline]
348 pub(crate) fn len(&self) -> usize {
349 self.map.len()
350 }
351
352 #[inline]
353 pub(crate) fn is_empty(&self) -> bool {
354 self.map.is_empty()
355 }
356
357 #[inline]
358 pub(crate) fn global_age(&self) -> f64 {
359 self.global_age
360 }
361
362 /// Returns the current total size of cached content.
363 #[inline]
364 pub(crate) fn current_size(&self) -> u64 {
365 self.current_size
366 }
367
368 /// Returns the maximum content size the cache can hold.
369 #[inline]
370 pub(crate) fn max_size(&self) -> u64 {
371 self.config.max_size
372 }
373
374 #[inline]
375 pub(crate) fn metrics(&self) -> &GdsfCacheMetrics {
376 &self.metrics
377 }
378
379 #[inline]
380 pub(crate) fn record_miss(&mut self, object_size: u64) {
381 self.metrics.core.record_miss(object_size);
382 }
383
384 fn calculate_priority(&self, frequency: u64, size: u64) -> f64 {
385 if size == 0 {
386 return f64::INFINITY;
387 }
388 (frequency as f64 / size as f64) + self.global_age
389 }
390
391 unsafe fn update_priority_by_node(
392 &mut self,
393 node: *mut ListEntry<CacheEntry<K, V, GdsfMeta>>,
394 ) -> *mut ListEntry<CacheEntry<K, V, GdsfMeta>>
395 where
396 K: Clone + Hash + Eq,
397 {
398 // SAFETY: node is guaranteed valid by caller's contract
399 let entry = (*node).get_value_mut();
400 let key_cloned = entry.key.clone();
401 let size = entry.metadata.size;
402 let meta = &mut entry.metadata.algorithm;
403 let old_priority = meta.priority;
404
405 meta.increment();
406
407 let global_age = self.global_age;
408 let new_priority = meta.calculate_priority(size, global_age);
409
410 let old_priority_key = (old_priority * 1000.0) as u64;
411 let new_priority_key = (new_priority * 1000.0) as u64;
412
413 if old_priority_key == new_priority_key {
414 self.priority_lists
415 .get_mut(&new_priority_key)
416 .unwrap()
417 .move_to_front(node);
418 return node;
419 }
420
421 let boxed_entry = self
422 .priority_lists
423 .get_mut(&old_priority_key)
424 .unwrap()
425 .remove(node)
426 .unwrap();
427
428 if self
429 .priority_lists
430 .get(&old_priority_key)
431 .unwrap()
432 .is_empty()
433 {
434 self.priority_lists.remove(&old_priority_key);
435 }
436
437 let entry_ptr = Box::into_raw(boxed_entry);
438
439 let capacity = self.config.capacity;
440 self.priority_lists
441 .entry(new_priority_key)
442 .or_insert_with(|| List::new(capacity));
443
444 self.priority_lists
445 .get_mut(&new_priority_key)
446 .unwrap()
447 .attach_from_other_list(entry_ptr);
448
449 // Update map with new node pointer
450 self.map.insert(key_cloned, entry_ptr);
451 entry_ptr
452 }

    pub(crate) fn get<Q>(&mut self, key: &Q) -> Option<V>
    where
        K: Borrow<Q> + Clone,
        Q: ?Sized + Hash + Eq,
    {
        if let Some(&node) = self.map.get(key) {
            unsafe {
                // SAFETY: node comes from our map
                let entry = (*node).get_value();
                let entry_size = entry.metadata.size;
                let meta = &entry.metadata.algorithm;
                self.metrics.core.record_hit(entry_size);
                self.metrics
                    .record_item_access(meta.frequency, entry.metadata.size, meta.priority);

                let new_node = self.update_priority_by_node(node);
                let value = (*new_node).get_value().value.clone();
                Some(value)
            }
        } else {
            None
        }
    }

    pub(crate) fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V>
    where
        K: Borrow<Q> + Clone,
        Q: ?Sized + Hash + Eq,
    {
        if let Some(&node) = self.map.get(key) {
            unsafe {
                // SAFETY: node comes from our map
                let entry = (*node).get_value();
                let entry_size = entry.metadata.size;
                let meta = &entry.metadata.algorithm;
                self.metrics.core.record_hit(entry_size);
                self.metrics
                    .record_item_access(meta.frequency, entry.metadata.size, meta.priority);

                let new_node = self.update_priority_by_node(node);
                let entry_mut = (*new_node).get_value_mut();
                Some(&mut entry_mut.value)
            }
        } else {
            None
        }
    }

    /// Inserts a key-value pair with size tracking.
    ///
    /// Returns evicted entries, or `None` if no entries were evicted.
    /// Note: replacing an existing key does not return the old value.
    pub(crate) fn put(&mut self, key: K, val: V, size: u64) -> Option<Vec<(K, V)>>
    where
        K: Clone,
    {
        // Zero-sized entries are rejected outright.
        if size == 0 {
            return None;
        }

        // Key already present: replace the entry in place, preserving its frequency.
        if let Some(&node) = self.map.get(&key) {
            unsafe {
                // SAFETY: node comes from our map
                let entry = (*node).get_value_mut();
                let old_size = entry.metadata.size;
                let meta = &mut entry.metadata.algorithm;
                let old_priority_key = (meta.priority * 1000.0) as u64;
                let frequency = meta.frequency;

                // Remove from the old priority list
                let list = self.priority_lists.get_mut(&old_priority_key).unwrap();
                let boxed_entry = list.remove(node).unwrap();

                if list.is_empty() {
                    self.priority_lists.remove(&old_priority_key);
                }

                let entry_ptr = Box::into_raw(boxed_entry);
                let cache_entry = (*entry_ptr).take_value();
                // Discard the old key/value since replacement is not eviction
                let _ = (cache_entry.key, cache_entry.value);
                let _ = Box::from_raw(entry_ptr);

                // Update size tracking
                self.current_size = self.current_size.saturating_sub(old_size);
                self.current_size += size;

                // Create a new entry with updated values but preserved frequency
                let new_priority = self.calculate_priority(frequency, size);
                let new_priority_key = (new_priority * 1000.0) as u64;

                let new_entry = CacheEntry::with_algorithm_metadata(
                    key.clone(),
                    val,
                    size,
                    GdsfMeta::new(frequency, new_priority),
                );

                let capacity = self.cap();
                let list = self
                    .priority_lists
                    .entry(new_priority_key)
                    .or_insert_with(|| List::new(capacity));

                if let Some(new_node) = list.add(new_entry) {
                    self.map.insert(key, new_node);
                    self.metrics.core.record_size_change(old_size, size);
                    self.metrics.core.bytes_written_to_cache += size;
                    // Replacement is not eviction - don't return the old value
                    return None;
                } else {
                    self.map.remove(&key);
                    return None;
                }
            }
        }

        // New entry: evict until both the entry-count and total-size limits hold.
        let capacity = self.config.capacity.get();
        let max_size = self.config.max_size;

        let mut evicted = Vec::new();
        while self.len() >= capacity
            || (self.current_size + size > max_size && !self.map.is_empty())
        {
            if let Some(entry) = self.evict() {
                self.metrics.core.evictions += 1;
                evicted.push(entry);
            } else {
                break;
            }
        }

        let priority = self.calculate_priority(1, size);
        let priority_key = (priority * 1000.0) as u64;

        let cap = self.config.capacity;
        let list = self
            .priority_lists
            .entry(priority_key)
            .or_insert_with(|| List::new(cap));

        let cache_entry =
            CacheEntry::with_algorithm_metadata(key.clone(), val, size, GdsfMeta::new(1, priority));

        if let Some(node) = list.add(cache_entry) {
            self.map.insert(key, node);
            self.current_size += size;

            if self.len() == 1 || priority < self.min_priority {
                self.min_priority = priority;
            }

            self.metrics.core.record_insertion(size);
            self.metrics
                .record_item_cached(size, self.metrics.average_item_size());
            self.metrics.record_item_access(1, size, priority);
        }

        if evicted.is_empty() {
            None
        } else {
            Some(evicted)
        }
    }

    /// Removes and returns the eviction candidate (lowest-priority entry).
    ///
    /// Also updates the global age to the evicted item's priority (GDSF aging).
    ///
    /// This method does **not** increment the eviction counter in metrics.
    /// Eviction metrics are only recorded when the cache internally evicts
    /// entries to make room during `put()` operations.
    ///
    /// Returns `None` if the cache is empty.
    fn evict(&mut self) -> Option<(K, V)> {
        if self.is_empty() {
            return None;
        }

        // The BTreeMap's first key is the lowest-priority bucket.
        let min_priority_key = *self.priority_lists.keys().next()?;
        let list = self.priority_lists.get_mut(&min_priority_key)?;
        let entry = list.remove_last()?;

        unsafe {
            // SAFETY: take_value moves the CacheEntry out by value.
            // Box::from_raw frees the node's memory (MaybeUninit won't double-drop).
            let entry_ptr = Box::into_raw(entry);
            let cache_entry = (*entry_ptr).take_value();
            let evicted_size = cache_entry.metadata.size;
            let priority_to_update = cache_entry.metadata.algorithm.priority;

            // Update the global age to the evicted item's priority (GDSF aging)
            self.global_age = priority_to_update;
            self.metrics.core.record_removal(evicted_size);
            self.metrics.record_size_based_eviction();
            self.metrics.record_aging_event(priority_to_update);

            self.map.remove(&cache_entry.key);
            self.current_size = self.current_size.saturating_sub(evicted_size);

            // Remove the now-empty priority list
            if list.is_empty() {
                self.priority_lists.remove(&min_priority_key);
            }

            let _ = Box::from_raw(entry_ptr);
            Some((cache_entry.key, cache_entry.value))
        }
    }

    pub(crate) fn remove<Q>(&mut self, key: &Q) -> Option<V>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        if let Some(node) = self.map.remove(key) {
            unsafe {
                // SAFETY: node comes from our map.
                // Read the priority before removal - needed to find the correct priority list.
                let priority = (*node).get_value().metadata.algorithm.priority;
                let priority_key = (priority * 1000.0) as u64;

                let list = self.priority_lists.get_mut(&priority_key).unwrap();
                let boxed_entry = list.remove(node).unwrap();

                if list.is_empty() {
                    self.priority_lists.remove(&priority_key);
                }

                let entry_ptr = Box::into_raw(boxed_entry);
                let cache_entry = (*entry_ptr).take_value();
                let removed_size = cache_entry.metadata.size;
                self.current_size = self.current_size.saturating_sub(removed_size);
                self.metrics.core.record_removal(removed_size);
                let _ = Box::from_raw(entry_ptr);

                Some(cache_entry.value)
            }
        } else {
            None
        }
    }

    pub(crate) fn clear(&mut self) {
        self.map.clear();
        self.priority_lists.clear();
        // Reset the aging clock to its configured starting point.
        self.global_age = self.config.initial_age;
        self.min_priority = 0.0;
        self.current_size = 0;
    }

    /// Checks whether the key exists without updating its priority or access metadata.
    #[inline]
    pub(crate) fn contains<Q>(&self, key: &Q) -> bool
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.map.contains_key(key)
    }

    /// Returns a reference to the value without updating priority or access metadata.
    ///
    /// Unlike `get()`, this method does NOT recalculate the entry's GDSF priority
    /// or change its position in the priority lists.
    pub(crate) fn peek<Q>(&self, key: &Q) -> Option<&V>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        let &node = self.map.get(key)?;
        unsafe {
            // SAFETY: node comes from our map, so it's a valid pointer
            let entry = (*node).get_value();
            Some(&entry.value)
        }
    }
}

impl<K, V, S> core::fmt::Debug for GdsfSegment<K, V, S> {
    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
        f.debug_struct("GdsfSegment")
            .field("capacity", &self.config.capacity)
            .field("len", &self.map.len())
            .field("global_age", &self.global_age)
            .finish()
    }
}

/// An implementation of a Greedy Dual-Size Frequency (GDSF) cache.
#[derive(Debug)]
pub struct GdsfCache<K, V, S = DefaultHashBuilder> {
    segment: GdsfSegment<K, V, S>,
}

impl<K: Hash + Eq, V: Clone, S: BuildHasher> GdsfCache<K, V, S> {
    #[inline]
    pub fn cap(&self) -> NonZeroUsize {
        self.segment.cap()
    }

    #[inline]
    pub fn len(&self) -> usize {
        self.segment.len()
    }

    #[inline]
    pub fn is_empty(&self) -> bool {
        self.segment.is_empty()
    }

    /// Returns the current total size of cached content.
    #[inline]
    pub fn current_size(&self) -> u64 {
        self.segment.current_size()
    }

    /// Returns the maximum content size the cache can hold.
    #[inline]
    pub fn max_size(&self) -> u64 {
        self.segment.max_size()
    }

    #[inline]
    pub fn global_age(&self) -> f64 {
        self.segment.global_age()
    }

    #[inline]
    pub fn record_miss(&mut self, object_size: u64) {
        self.segment.record_miss(object_size);
    }

    #[inline]
    pub fn get<Q>(&mut self, key: &Q) -> Option<V>
    where
        K: Borrow<Q> + Clone,
        Q: ?Sized + Hash + Eq,
    {
        self.segment.get(key)
    }

    #[inline]
    pub fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V>
    where
        K: Borrow<Q> + Clone,
        Q: ?Sized + Hash + Eq,
    {
        self.segment.get_mut(key)
    }

    /// Inserts a key-value pair with an explicit size into the cache.
    ///
    /// GDSF is inherently size-aware: the `size` parameter affects priority
    /// calculation (`priority = global_age + frequency / size`), favoring
    /// small, frequently-accessed items.
    ///
    /// # Arguments
    ///
    /// * `key` - The key to insert
    /// * `val` - The value to insert
    /// * `size` - Size in bytes. Pass `1` for every entry to get count-based caching.
    ///
    /// # Returns
    ///
    /// - `Some(vec)` containing evicted entries (not replaced entries)
    /// - `None` if no entries were evicted (zero allocation)
    #[inline]
    pub fn put(&mut self, key: K, val: V, size: u64) -> Option<Vec<(K, V)>>
    where
        K: Clone,
    {
        self.segment.put(key, val, size)
    }

    #[inline]
    pub fn clear(&mut self) {
        self.segment.clear()
    }

    /// Checks whether the key exists without updating its priority or access metadata.
    ///
    /// Unlike `get()`, this method does NOT update the entry's frequency
    /// or access metadata.
    ///
    /// # Example
    ///
    /// ```
    /// use cache_rs::GdsfCache;
    /// use cache_rs::config::GdsfCacheConfig;
    /// use core::num::NonZeroUsize;
    ///
    /// let config = GdsfCacheConfig {
    ///     capacity: NonZeroUsize::new(2).unwrap(),
    ///     initial_age: 0.0,
    ///     max_size: u64::MAX,
    /// };
    /// let mut cache = GdsfCache::init(config, None);
    /// cache.put("a", 1, 10);
    /// cache.put("b", 2, 10);
    ///
    /// // contains() does NOT update priority
    /// assert!(cache.contains(&"a"));
    /// ```
    #[inline]
    pub fn contains<Q>(&self, key: &Q) -> bool
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.segment.contains(key)
    }

    /// Returns a reference to the value without updating priority or access metadata.
    ///
    /// Unlike [`get()`](Self::get), this does NOT recalculate the entry's GDSF
    /// priority or change its position in the priority lists.
    ///
    /// # Example
    ///
    /// ```
    /// use cache_rs::GdsfCache;
    /// use cache_rs::config::GdsfCacheConfig;
    /// use core::num::NonZeroUsize;
    ///
    /// let config = GdsfCacheConfig {
    ///     capacity: NonZeroUsize::new(3).unwrap(),
    ///     initial_age: 0.0,
    ///     max_size: u64::MAX,
    /// };
    /// let mut cache = GdsfCache::init(config, None);
    /// cache.put("a", 1, 1);
    ///
    /// // peek does not change priority
    /// assert_eq!(cache.peek(&"a"), Some(&1));
    /// assert_eq!(cache.peek(&"missing"), None);
    /// ```
    #[inline]
    pub fn peek<Q>(&self, key: &Q) -> Option<&V>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.segment.peek(key)
    }

    /// Removes a key from the cache, returning the value if the key was present.
    #[inline]
    pub fn remove<Q>(&mut self, key: &Q) -> Option<V>
    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Eq,
    {
        self.segment.remove(key)
    }
}

impl<K: Hash + Eq, V: Clone, S: BuildHasher> CacheMetrics for GdsfCache<K, V, S> {
    fn metrics(&self) -> BTreeMap<String, f64> {
        self.segment.metrics().metrics()
    }

    fn algorithm_name(&self) -> &'static str {
        self.segment.metrics().algorithm_name()
    }
}

impl<K: Hash + Eq, V: Clone> GdsfCache<K, V, DefaultHashBuilder> {
    /// Creates a new GDSF cache from a configuration.
    ///
    /// This is the **recommended** way to create a GDSF cache. All configuration
    /// is specified through the [`GdsfCacheConfig`] struct.
    ///
    /// # Arguments
    ///
    /// * `config` - Configuration specifying capacity and optional size limit/initial age
    /// * `hasher` - Optional custom hash builder. Pass `None` to use the default.
    ///
    /// # Example
    ///
    /// ```
    /// use cache_rs::GdsfCache;
    /// use cache_rs::config::GdsfCacheConfig;
    /// use core::num::NonZeroUsize;
    ///
    /// // Simple capacity-only cache
    /// let config = GdsfCacheConfig {
    ///     capacity: NonZeroUsize::new(100).unwrap(),
    ///     initial_age: 0.0,
    ///     max_size: u64::MAX,
    /// };
    /// let mut cache: GdsfCache<&str, i32> = GdsfCache::init(config, None);
    /// cache.put("key", 42, 1);
    ///
    /// // Cache with a size limit (recommended for GDSF)
    /// let config = GdsfCacheConfig {
    ///     capacity: NonZeroUsize::new(1000).unwrap(),
    ///     initial_age: 0.0,
    ///     max_size: 10 * 1024 * 1024, // 10MB
    /// };
    /// let cache: GdsfCache<String, Vec<u8>> = GdsfCache::init(config, None);
    /// ```
    pub fn init(config: GdsfCacheConfig, hasher: Option<DefaultHashBuilder>) -> Self {
        GdsfCache {
            segment: GdsfSegment::init(config, hasher.unwrap_or_default()),
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::config::GdsfCacheConfig;
    use core::num::NonZeroUsize;

    /// Helper to create a GdsfCache with the given capacity
    fn make_cache<K: Hash + Eq + Clone, V: Clone>(cap: usize) -> GdsfCache<K, V> {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(cap).unwrap(),
            initial_age: 0.0,
            max_size: u64::MAX,
        };
        GdsfCache::init(config, None)
    }

    #[test]
    fn test_gdsf_basic_operations() {
        let mut cache = make_cache(3);

        assert_eq!(cache.put("a", 1, 1), None);
        assert_eq!(cache.put("b", 2, 2), None);
        assert_eq!(cache.put("c", 3, 1), None);
        assert_eq!(cache.len(), 3);

        assert_eq!(cache.get(&"a"), Some(1));
        assert_eq!(cache.get(&"b"), Some(2));
        assert_eq!(cache.get(&"c"), Some(3));

        assert!(cache.contains(&"a"));
        assert!(!cache.contains(&"d"));
    }

    #[test]
    fn test_gdsf_frequency_priority() {
        let mut cache = make_cache(2);

        cache.put("a", 1, 1);
        cache.put("b", 2, 1);

        // Two extra hits give "a" a higher priority than "b".
        cache.get(&"a");
        cache.get(&"a");

        // Inserting "c" must evict the lowest-priority entry: "b".
        cache.put("c", 3, 1);

        assert!(cache.contains(&"a"));
        assert!(!cache.contains(&"b"));
        assert!(cache.contains(&"c"));
    }

    #[test]
    fn test_gdsf_size_consideration() {
        let mut cache = make_cache(2);

        cache.put("small", 1, 1);
        cache.put("large", 2, 10);

        // "large" has the lowest priority (1/10), so it is evicted first.
        cache.put("medium", 3, 5);

        assert!(cache.contains(&"small"));
        assert!(!cache.contains(&"large"));
        assert!(cache.contains(&"medium"));
    }

    #[test]
    fn test_gdsf_update_existing() {
        let mut cache = make_cache(2);

        cache.put("key", 1, 1);
        assert_eq!(cache.get(&"key"), Some(1));

        // Replacement returns None (not an eviction)
        assert!(cache.put("key", 2, 2).is_none());
        assert_eq!(cache.get(&"key"), Some(2));
        assert_eq!(cache.len(), 1);
    }
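
    /// Replacing a key swaps its size contribution in `current_size`.
    #[test]
    fn test_gdsf_replace_updates_size() {
        let mut cache: GdsfCache<&str, i32> = make_cache(4);

        cache.put("k", 1, 10);
        assert_eq!(cache.current_size(), 10);

        // Re-inserting the same key replaces the 10-byte entry with a 4-byte one.
        cache.put("k", 2, 4);
        assert_eq!(cache.current_size(), 4);
        assert_eq!(cache.len(), 1);
    }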

    #[test]
    fn test_gdsf_zero_size_rejection() {
        let mut cache = make_cache(2);

        // put() silently rejects zero-sized entries.
        assert_eq!(cache.put("key", 1, 0), None);
        assert_eq!(cache.len(), 0);
        assert!(!cache.contains(&"key"));
    }
1050
1051 #[test]
1052 fn test_gdsf_remove_by_key_legacy() {
1053 let mut cache = make_cache(2);
1054
1055 cache.put("a", 1, 1);
1056 cache.put("b", 2, 1);
1057
1058 assert_eq!(cache.remove(&"a"), Some(1));
1059 assert_eq!(cache.len(), 1);
1060 assert!(!cache.contains(&"a"));
1061 assert!(cache.contains(&"b"));
1062
1063 assert_eq!(cache.remove(&"nonexistent"), None);
1064 }
1065
1066 #[test]
1067 fn test_gdsf_clear() {
1068 let mut cache = make_cache(2);
1069
1070 cache.put("a", 1, 1);
1071 cache.put("b", 2, 1);
1072 assert_eq!(cache.len(), 2);
1073
1074 cache.clear();
1075 assert_eq!(cache.len(), 0);
1076 assert!(cache.is_empty());
1077 assert!(!cache.contains(&"a"));
1078 assert!(!cache.contains(&"b"));
1079 }
1080
1081 #[test]
1082 fn test_gdsf_global_aging() {
1083 let mut cache = make_cache(2);
1084
1085 cache.put("a", 1, 1);
1086 cache.put("b", 2, 1);
1087
1088 let initial_age = cache.global_age();
1089
1090 cache.put("c", 3, 1);
1091
1092 assert!(cache.global_age() > initial_age);
1093 }
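
    // A hedged check of the aging rule described in the module docs: on
    // eviction, the global age is set to the evicted entry's priority. The
    // exact value asserted here (1/4 = 0.25 for a size-4 entry hit once) is
    // an assumption about that rule, so the comparison is approximate.
    #[test]
    fn test_gdsf_age_tracks_evicted_priority() {
        let mut cache = make_cache(2);

        cache.put("cheap", 1, 4); // priority = 1/4 = 0.25
        cache.put("dear", 2, 1); // priority = 1/1 = 1.0

        // Evicts "cheap", the lowest-priority entry.
        cache.put("new", 3, 1);

        assert!(!cache.contains(&"cheap"));
        assert!((cache.global_age() - 0.25).abs() < 1e-9);
    }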

    #[test]
    fn test_miri_stacked_borrows_fix() {
        let mut cache = make_cache(10);

        cache.put("a", 1, 10);
        cache.put("b", 2, 20);
        cache.put("c", 3, 15);

        for _ in 0..3 {
            assert_eq!(cache.get(&"a"), Some(1));
            assert_eq!(cache.get(&"b"), Some(2));
            assert_eq!(cache.get(&"c"), Some(3));
        }

        assert_eq!(cache.len(), 3);

        if let Some(val) = cache.get_mut(&"a") {
            *val += 10;
        }
        assert_eq!(cache.get(&"a"), Some(11));
    }

    #[test]
    fn test_gdsf_segment_directly() {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(2).unwrap(),
            initial_age: 0.0,
            max_size: u64::MAX,
        };
        let mut segment: GdsfSegment<&str, i32, DefaultHashBuilder> =
            GdsfSegment::init(config, DefaultHashBuilder::default());
        assert_eq!(segment.len(), 0);
        assert!(segment.is_empty());
        assert_eq!(segment.cap().get(), 2);
        segment.put("a", 1, 1);
        segment.put("b", 2, 2);
        assert_eq!(segment.len(), 2);
        assert_eq!(segment.get(&"a"), Some(1));
        assert_eq!(segment.get(&"b"), Some(2));
    }

    #[test]
    fn test_gdsf_concurrent_access() {
        extern crate std;
        use std::sync::{Arc, Mutex};
        use std::thread;
        use std::vec::Vec;

        let cache = Arc::new(Mutex::new(make_cache::<String, i32>(100)));
        let num_threads = 4;
        let ops_per_thread = 100;

        let mut handles: Vec<std::thread::JoinHandle<()>> = Vec::new();

        for t in 0..num_threads {
            let cache = Arc::clone(&cache);
            handles.push(thread::spawn(move || {
                for i in 0..ops_per_thread {
                    let key = std::format!("key_{}_{}", t, i);
                    let size = ((i % 10) + 1) as u64; // Varying sizes 1-10
                    let mut guard = cache.lock().unwrap();
                    guard.put(key.clone(), i, size);
                    let _ = guard.get(&key);
                }
            }));
        }

        for handle in handles {
            handle.join().unwrap();
        }

        let mut guard = cache.lock().unwrap();
        assert!(guard.len() <= 100);
        guard.clear(); // Clean up for MIRI
    }

    #[test]
    fn test_gdsf_size_aware_tracking() {
        let mut cache = make_cache(10);

        assert_eq!(cache.current_size(), 0);
        assert_eq!(cache.max_size(), u64::MAX);

        // GDSF already requires size in put()
        cache.put("a", 1, 100);
        cache.put("b", 2, 200);
        cache.put("c", 3, 150);

        assert_eq!(cache.current_size(), 450);
        assert_eq!(cache.len(), 3);

        // Verify that clear() resets the size accounting as well.
        cache.clear();
        assert_eq!(cache.current_size(), 0);
    }

    #[test]
    fn test_gdsf_init_constructor() {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(1000).unwrap(),
            initial_age: 0.0,
            max_size: 1024 * 1024,
        };
        let cache: GdsfCache<String, i32> = GdsfCache::init(config, None);

        assert_eq!(cache.current_size(), 0);
        assert_eq!(cache.max_size(), 1024 * 1024);
    }

    #[test]
    fn test_gdsf_with_limits_constructor() {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(100).unwrap(),
            initial_age: 0.0,
            max_size: 1024 * 1024,
        };
        let cache: GdsfCache<String, String> = GdsfCache::init(config, None);

        assert_eq!(cache.current_size(), 0);
        assert_eq!(cache.max_size(), 1024 * 1024);
        assert_eq!(cache.cap().get(), 100);
    }
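
    // A hedged sketch assuming `initial_age` seeds the cache's global age
    // (the field name and the module docs suggest this, but the exact
    // semantics are an assumption).
    #[test]
    fn test_gdsf_initial_age_seeds_global_age() {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(4).unwrap(),
            initial_age: 2.5,
            max_size: u64::MAX,
        };
        let cache: GdsfCache<&str, i32> = GdsfCache::init(config, None);

        assert!((cache.global_age() - 2.5).abs() < f64::EPSILON);
    }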

    #[test]
    fn test_gdsf_contains_non_promoting() {
        let mut cache = make_cache(2);
        cache.put("a", 1, 10);
        cache.put("b", 2, 10);

        // contains() should return true for existing keys
        assert!(cache.contains(&"a"));
        assert!(cache.contains(&"b"));
        assert!(!cache.contains(&"c"));

        // Access "b" to increase its priority
        cache.get(&"b");

        // contains() should NOT increase priority of "a"
        assert!(cache.contains(&"a"));

        // Adding "c" should evict "a" (lowest priority), not "b"
        cache.put("c", 3, 10);
        assert!(!cache.contains(&"a")); // "a" was evicted
        assert!(cache.contains(&"b")); // "b" still exists
        assert!(cache.contains(&"c")); // "c" was just added
    }

    #[test]
    fn test_put_returns_none_when_no_eviction() {
        let mut cache = make_cache(10);
        assert!(cache.put("a", 1, 10).is_none());
        assert!(cache.put("b", 2, 10).is_none());
    }

    #[test]
    fn test_put_returns_single_eviction() {
        let mut cache = make_cache(2);
        cache.put("a", 1, 10);
        cache.put("b", 2, 10);
        let result = cache.put("c", 3, 10);
        assert!(result.is_some());
        let evicted = result.unwrap();
        assert_eq!(evicted.len(), 1);
    }

    #[test]
    fn test_put_replacement_returns_none() {
        let mut cache = make_cache(10);
        cache.put("a", 1, 10);
        // Replacement is not eviction - returns None
        let result = cache.put("a", 2, 10);
        assert!(result.is_none());
        // Value should be updated
        assert_eq!(cache.get(&"a"), Some(2));
    }

    #[test]
    fn test_put_returns_multiple_evictions_size_based() {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(10).unwrap(),
            initial_age: 0.0,
            max_size: 100,
        };
        let mut cache = GdsfCache::init(config, None);
        // Ten 10-byte entries exactly fill max_size = 100.
        for i in 0..10 {
            cache.put(i, i, 10);
        }
        // A 50-byte insert must evict five 10-byte entries to make room.
        let result = cache.put(99, 99, 50);
        let evicted = result.unwrap();
        assert_eq!(evicted.len(), 5);
    }
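
    // A hedged invariant check layered on the size-based eviction behavior
    // exercised in test_put_returns_multiple_evictions_size_based: after any
    // accepted put, the tracked size should never exceed `max_size`.
    #[test]
    fn test_gdsf_size_never_exceeds_max() {
        let config = GdsfCacheConfig {
            capacity: NonZeroUsize::new(10).unwrap(),
            initial_age: 0.0,
            max_size: 100,
        };
        let mut cache = GdsfCache::init(config, None);

        for i in 0..10 {
            cache.put(i, i, 10);
            assert!(cache.current_size() <= 100);
        }

        // A 50-byte insert forces multiple evictions but keeps the invariant.
        cache.put(99, 99, 50);
        assert!(cache.current_size() <= 100);
    }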

    #[test]
    fn test_gdsf_remove_by_key() {
        let mut cache = make_cache(2);
        cache.put("a", 1, 10);
        cache.put("b", 2, 10);

        // remove() works
        assert_eq!(cache.remove(&"a"), Some(1));
        assert_eq!(cache.len(), 1);
        assert!(!cache.contains(&"a"));
        assert!(cache.contains(&"b"));

        assert_eq!(cache.remove(&"nonexistent"), None);
    }
}
1301}