Skip to main content

cache_rs/
gdsf.rs

1//! Greedy Dual-Size Frequency (GDSF) Cache Implementation
2//!
3//! GDSF is a sophisticated cache replacement algorithm designed for **variable-sized objects**.
4//! It combines object size, access frequency, and aging into a unified priority formula,
5//! making it ideal for CDN caches, image caches, and any scenario where cached objects
6//! have different sizes.
7//!
8//! # How the Algorithm Works
9//!
10//! GDSF calculates priority for each cached item using the formula:
11//!
12//! ```text
13//! Priority = (Frequency / Size) + GlobalAge
14//! ```
15//!
16//! This formula cleverly balances multiple factors:
17//! - **Frequency**: More frequently accessed items get higher priority
18//! - **Size**: Smaller items are favored (more items fit in cache)
19//! - **Aging**: Prevents old popular items from staying forever
20//!
21//! ## Mathematical Formulation
22//!
23//! ```text
24//! For each cache entry i:
25//!   - F_i = access frequency of item i
26//!   - S_i = size of item i (in bytes)
27//!   - GlobalAge = increases on eviction (set to evicted item's priority)
28//!   - Priority_i = (F_i / S_i) + GlobalAge
29//!
30//! On eviction: select item j where Priority_j = min{Priority_i for all i}
31//! ```
32//!
33//! ## Why Size Matters
34//!
35//! Consider a cache with 10KB capacity:
36//! - Option A: Cache one 10KB file accessed 10 times → Priority = 10/10000 = 0.001
37//! - Option B: Cache ten 1KB files accessed once each → Each Priority = 1/1000 = 0.001
38//!
39//! GDSF recognizes that caching many small frequently-accessed items often yields
40//! better hit rates than caching fewer large items.
41//!
42//! ## Data Structure
43//!
44//! ```text
45//! ┌─────────────────────────────────────────────────────────────────────────────┐
46//! │                    GDSF Cache (global_age=1.5, max_size=10MB)               │
47//! │                                                                              │
48//! │  HashMap<K, *Node>              BTreeMap<Priority, List>                     │
49//! │  ┌──────────────┐              ┌─────────────────────────────────────────┐   │
50//! │  │ "icon.png" ─────────────────│ pri=3.5: [icon.png]   (f=2, s=1KB)      │   │
51//! │  │ "thumb.jpg" ────────────────│ pri=2.5: [thumb.jpg]  (f=1, s=1KB)      │   │
52//! │  │ "video.mp4" ────────────────│ pri=1.5: [video.mp4]  (f=10, s=100MB)←E │   │
53//! │  └──────────────┘              └─────────────────────────────────────────┘   │
54//! │                                        ▲                                     │
55//! │                                        │                                     │
56//! │                                   min_priority=1.5                           │
57//! │                                                                              │
58//! │  Note: video.mp4 has high frequency (10) but large size (100MB),            │
59//! │        so its priority = 10/100000000 + 1.5 ≈ 1.5 (eviction candidate)      │
60//! └─────────────────────────────────────────────────────────────────────────────┘
61//! ```
62//!
63//! ## Operations
64//!
65//! | Operation | Action | Time |
66//! |-----------|--------|------|
67//! | `get(key)` | Increment frequency, recalculate priority | O(log P) |
68//! | `put(key, value, size)` | Insert with priority=(1/size)+age | O(log P) |
69//! | `remove(key)` | Remove from priority list, update min_priority | O(log P) |
70//!
71//! ## Size-Aware Example
72//!
73//! ```text
74//! Cache: max_size=5KB, global_age=0
75//!
76//! put("small.txt", data, 1KB)   →  pri=1.0, total_size=1KB
77//! put("medium.txt", data, 2KB)  →  pri=0.5, total_size=3KB
78//! put("large.txt", data, 3KB)   →  pri=0.33, total_size=6KB  OVERFLOW!
79//!
80//! Eviction needed: evict "large.txt" (lowest priority=0.33)
81//! global_age = 0.33
82//!
83//! put("large.txt", data, 3KB)   →  pri=0.33+0.33=0.66, total_size=6KB  OVERFLOW!
84//!
85//! Evict again: now "medium.txt" has lowest priority (0.5 < 0.66)
86//! Result: small.txt + large.txt fit in 4KB
87//! ```
88//!
89//! # Dual-Limit Capacity
90//!
91//! GDSF naturally works with size-based limits:
92//!
93//! - **`max_entries`**: Maximum number of items (prevents too many tiny items)
94//! - **`max_size`**: Maximum total size (primary constraint for GDSF)
95//!
96//! # Performance Characteristics
97//!
98//! | Metric | Value |
99//! |--------|-------|
100//! | Get | O(log P) |
101//! | Put | O(log P) amortized |
102//! | Remove | O(log P) |
103//! | Memory per entry | ~120 bytes overhead + key×2 + value |
104//!
105//! Where P = number of distinct priority buckets. Priority = (frequency/size) + age,
106//! quantized to integer keys. BTreeMap provides O(log P) lookups.
107//!
108//! Higher overhead than simpler algorithms due to priority calculation and
109//! BTreeMap-based priority lists.
110//!
111//! # When to Use GDSF
112//!
113//! **Good for:**
114//! - CDN and proxy caches with variable object sizes
115//! - Image thumbnail caches
116//! - API response caches with varying payload sizes
117//! - File system caches
118//! - Any size-constrained cache with heterogeneous objects
119//!
120//! **Not ideal for:**
121//! - Uniform-size objects (simpler algorithms work equally well)
122//! - Entry-count-constrained caches (LRU/LFU are simpler)
123//! - Very small caches (overhead not justified)
124//!
125//! # GDSF vs Other Algorithms
126//!
127//! | Aspect | LRU | LFU | GDSF |
128//! |--------|-----|-----|------|
129//! | Size-aware | No | No | **Yes** |
130//! | Frequency-aware | No | Yes | Yes |
131//! | Aging | Implicit | No | Yes |
132//! | Best for | Recency | Frequency | Variable-size objects |
133//!
134//! # Thread Safety
135//!
136//! `GdsfCache` is **not thread-safe**. For concurrent access, either:
137//! - Wrap with `Mutex` or `RwLock`
138//! - Use `ConcurrentGdsfCache` (requires `concurrent` feature)
139//!
140//! # Examples
141//!
142//! ## Basic Usage
143//!
144//! ```
145//! use cache_rs::GdsfCache;
146//! use cache_rs::config::GdsfCacheConfig;
147//! use core::num::NonZeroUsize;
148//!
149//! // Create cache with max 1000 entries and 10MB size limit
150//! let config = GdsfCacheConfig {
151//!     capacity: NonZeroUsize::new(1000).unwrap(),
152//!     initial_age: 0.0,
153//!     max_size: 10 * 1024 * 1024,  // 10MB
154//! };
155//! let mut cache: GdsfCache<String, Vec<u8>> = GdsfCache::init(config, None);
156//!
157//! // Insert with explicit size tracking
158//! let small_data = vec![0u8; 1024];  // 1KB
159//! cache.put("small.txt".to_string(), small_data, 1024);
160//!
161//! let large_data = vec![0u8; 1024 * 1024];  // 1MB
162//! cache.put("large.bin".to_string(), large_data, 1024 * 1024);
163//!
164//! // Small items get higher priority per byte
165//! assert!(cache.get(&"small.txt".to_string()).is_some());
166//! ```
167//!
168//! ## CDN-Style Caching
169//!
170//! ```
171//! use cache_rs::GdsfCache;
172//! use cache_rs::config::GdsfCacheConfig;
173//! use core::num::NonZeroUsize;
174//!
175//! // 100MB cache for web assets
176//! let config = GdsfCacheConfig {
177//!     capacity: NonZeroUsize::new(10000).unwrap(),
178//!     initial_age: 0.0,
179//!     max_size: 100 * 1024 * 1024,
180//! };
181//! let mut cache: GdsfCache<String, Vec<u8>> = GdsfCache::init(config, None);
182//!
183//! // Cache various asset types with their sizes
184//! fn cache_asset(cache: &mut GdsfCache<String, Vec<u8>>, url: &str, data: Vec<u8>) {
185//!     let size = data.len() as u64;
186//!     cache.put(url.to_string(), data, size);
187//! }
188//!
189//! // Small, frequently-accessed assets get priority over large, rarely-used ones
190//! ```
191
192extern crate alloc;
193
194use crate::config::GdsfCacheConfig;
195use crate::entry::CacheEntry;
196use crate::list::{List, ListEntry};
197use crate::metrics::{CacheMetrics, GdsfCacheMetrics};
198
199/// Metadata for GDSF (Greedy Dual-Size Frequency) cache entries.
200///
201/// GDSF is a sophisticated algorithm that considers:
202/// - **Frequency**: How often the item is accessed
203/// - **Size**: Larger items have lower priority per byte
204/// - **Aging**: Global clock advances when items are evicted
205///
206/// # Priority Calculation
207///
208/// ```text
209/// priority = (frequency / size) + global_age
210/// ```
211///
212/// Items with lower priority are evicted first.
213///
214/// # Examples
215///
216/// ```
217/// use cache_rs::gdsf::GdsfMeta;
218///
219/// let meta = GdsfMeta::new(1, 0.5); // frequency=1, priority=0.5
220/// assert_eq!(meta.frequency, 1);
221/// assert_eq!(meta.priority, 0.5);
222/// ```
223#[derive(Debug, Clone, Copy, Default, PartialEq)]
224pub struct GdsfMeta {
225    /// Access frequency count.
226    pub frequency: u64,
227
228    /// Calculated priority: (frequency / size) + clock.
229    /// Lower priority = more likely to be evicted.
230    pub priority: f64,
231}
232
233impl GdsfMeta {
234    /// Creates a new GDSF metadata with the specified frequency and priority.
235    ///
236    /// # Arguments
237    ///
238    /// * `frequency` - Initial frequency count
239    /// * `priority` - Calculated priority value
240    #[inline]
241    pub fn new(frequency: u64, priority: f64) -> Self {
242        Self {
243            frequency,
244            priority,
245        }
246    }
247
248    /// Increments the frequency counter and returns the new value.
249    #[inline]
250    pub fn increment(&mut self) -> u64 {
251        self.frequency += 1;
252        self.frequency
253    }
254
255    /// Calculates and updates the priority based on frequency, size, and global age.
256    ///
257    /// # Arguments
258    ///
259    /// * `size` - Size of the cached item
260    /// * `global_age` - Current global age (clock value) of the cache
261    ///
262    /// # Returns
263    ///
264    /// The newly calculated priority value.
265    #[inline]
266    pub fn calculate_priority(&mut self, size: u64, global_age: f64) -> f64 {
267        self.priority = if size == 0 {
268            f64::INFINITY
269        } else {
270            (self.frequency as f64 / size as f64) + global_age
271        };
272        self.priority
273    }
274}
275use alloc::boxed::Box;
276use alloc::collections::BTreeMap;
277use alloc::string::String;
278use alloc::vec::Vec;
279use core::borrow::Borrow;
280use core::hash::{BuildHasher, Hash};
281use core::num::NonZeroUsize;
282
283#[cfg(feature = "hashbrown")]
284use hashbrown::DefaultHashBuilder;
285#[cfg(feature = "hashbrown")]
286use hashbrown::HashMap;
287
288#[cfg(not(feature = "hashbrown"))]
289use std::collections::hash_map::RandomState as DefaultHashBuilder;
290#[cfg(not(feature = "hashbrown"))]
291use std::collections::HashMap;
292
293/// Internal GDSF segment containing the actual cache algorithm.
294///
295/// Uses `CacheEntry<K, V, GdsfMeta>` as the unified entry type. The map stores
296/// raw pointers to list nodes, and all entry data (key, value, size, metadata)
297/// is stored in the `CacheEntry`.
298pub(crate) struct GdsfSegment<K, V, S = DefaultHashBuilder> {
299    config: GdsfCacheConfig,
300    global_age: f64,
301    min_priority: f64,
302    /// Maps keys to node pointers. The node contains CacheEntry with all data.
303    map: HashMap<K, *mut ListEntry<CacheEntry<K, V, GdsfMeta>>, S>,
304    /// Priority lists: key is (priority * 1000) as u64 for BTreeMap ordering
305    priority_lists: BTreeMap<u64, List<CacheEntry<K, V, GdsfMeta>>>,
306    metrics: GdsfCacheMetrics,
307    /// Current total size of cached content (sum of entry sizes)
308    current_size: u64,
309}
310
311// SAFETY: GdsfSegment owns all data and raw pointers point only to nodes owned by
312// `priority_lists`. Concurrent access is safe when wrapped in proper synchronization primitives.
313unsafe impl<K: Send, V: Send, S: Send> Send for GdsfSegment<K, V, S> {}
314
315// SAFETY: All mutation requires &mut self; shared references cannot cause data races.
316unsafe impl<K: Send, V: Send, S: Sync> Sync for GdsfSegment<K, V, S> {}
317
318impl<K: Hash + Eq, V: Clone, S: BuildHasher> GdsfSegment<K, V, S> {
319    /// Creates a new GDSF segment from a configuration.
320    ///
321    /// This is the **only** way to create a GDSF segment. All configuration
322    /// is specified through the [`GdsfCacheConfig`] struct.
323    ///
324    /// # Arguments
325    ///
326    /// * `config` - Configuration specifying capacity, initial age, and optional size limit
327    /// * `hasher` - Hash builder for the internal HashMap
328    #[allow(dead_code)] // Used by concurrent module when feature is enabled
329    pub(crate) fn init(config: GdsfCacheConfig, hasher: S) -> Self {
330        let map_capacity = config.capacity.get().next_power_of_two();
331        GdsfSegment {
332            global_age: config.initial_age,
333            min_priority: 0.0,
334            map: HashMap::with_capacity_and_hasher(map_capacity, hasher),
335            priority_lists: BTreeMap::new(),
336            metrics: GdsfCacheMetrics::new(config.max_size),
337            current_size: 0,
338            config,
339        }
340    }
341
342    #[inline]
343    pub(crate) fn cap(&self) -> NonZeroUsize {
344        self.config.capacity
345    }
346
347    #[inline]
348    pub(crate) fn len(&self) -> usize {
349        self.map.len()
350    }
351
352    #[inline]
353    pub(crate) fn is_empty(&self) -> bool {
354        self.map.is_empty()
355    }
356
357    #[inline]
358    pub(crate) fn global_age(&self) -> f64 {
359        self.global_age
360    }
361
362    /// Returns the current total size of cached content.
363    #[inline]
364    pub(crate) fn current_size(&self) -> u64 {
365        self.current_size
366    }
367
368    /// Returns the maximum content size the cache can hold.
369    #[inline]
370    pub(crate) fn max_size(&self) -> u64 {
371        self.config.max_size
372    }
373
374    #[inline]
375    pub(crate) fn metrics(&self) -> &GdsfCacheMetrics {
376        &self.metrics
377    }
378
379    #[inline]
380    pub(crate) fn record_miss(&mut self, object_size: u64) {
381        self.metrics.core.record_miss(object_size);
382    }
383
384    fn calculate_priority(&self, frequency: u64, size: u64) -> f64 {
385        if size == 0 {
386            return f64::INFINITY;
387        }
388        (frequency as f64 / size as f64) + self.global_age
389    }
390
391    unsafe fn update_priority_by_node(
392        &mut self,
393        node: *mut ListEntry<CacheEntry<K, V, GdsfMeta>>,
394    ) -> *mut ListEntry<CacheEntry<K, V, GdsfMeta>>
395    where
396        K: Clone + Hash + Eq,
397    {
398        // SAFETY: node is guaranteed valid by caller's contract
399        let entry = (*node).get_value_mut();
400        let key_cloned = entry.key.clone();
401        let size = entry.metadata.size;
402        let meta = &mut entry.metadata.algorithm;
403        let old_priority = meta.priority;
404
405        meta.increment();
406
407        let global_age = self.global_age;
408        let new_priority = meta.calculate_priority(size, global_age);
409
410        let old_priority_key = (old_priority * 1000.0) as u64;
411        let new_priority_key = (new_priority * 1000.0) as u64;
412
413        if old_priority_key == new_priority_key {
414            self.priority_lists
415                .get_mut(&new_priority_key)
416                .unwrap()
417                .move_to_front(node);
418            return node;
419        }
420
421        let boxed_entry = self
422            .priority_lists
423            .get_mut(&old_priority_key)
424            .unwrap()
425            .remove(node)
426            .unwrap();
427
428        if self
429            .priority_lists
430            .get(&old_priority_key)
431            .unwrap()
432            .is_empty()
433        {
434            self.priority_lists.remove(&old_priority_key);
435        }
436
437        let entry_ptr = Box::into_raw(boxed_entry);
438
439        let capacity = self.config.capacity;
440        self.priority_lists
441            .entry(new_priority_key)
442            .or_insert_with(|| List::new(capacity));
443
444        self.priority_lists
445            .get_mut(&new_priority_key)
446            .unwrap()
447            .attach_from_other_list(entry_ptr);
448
449        // Update map with new node pointer
450        self.map.insert(key_cloned, entry_ptr);
451        entry_ptr
452    }
453
454    pub(crate) fn get<Q>(&mut self, key: &Q) -> Option<V>
455    where
456        K: Borrow<Q> + Clone,
457        Q: ?Sized + Hash + Eq,
458    {
459        if let Some(&node) = self.map.get(key) {
460            unsafe {
461                // SAFETY: node comes from our map
462                let entry = (*node).get_value();
463                let entry_size = entry.metadata.size;
464                let meta = &entry.metadata.algorithm;
465                self.metrics.core.record_hit(entry_size);
466                self.metrics
467                    .record_item_access(meta.frequency, entry.metadata.size, meta.priority);
468
469                let new_node = self.update_priority_by_node(node);
470                let value = (*new_node).get_value().value.clone();
471                Some(value)
472            }
473        } else {
474            None
475        }
476    }
477
478    pub(crate) fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V>
479    where
480        K: Borrow<Q> + Clone,
481        Q: ?Sized + Hash + Eq,
482    {
483        if let Some(&node) = self.map.get(key) {
484            unsafe {
485                // SAFETY: node comes from our map
486                let entry = (*node).get_value();
487                let entry_size = entry.metadata.size;
488                let meta = &entry.metadata.algorithm;
489                self.metrics.core.record_hit(entry_size);
490                self.metrics
491                    .record_item_access(meta.frequency, entry.metadata.size, meta.priority);
492
493                let new_node = self.update_priority_by_node(node);
494                let entry_mut = (*new_node).get_value_mut();
495                Some(&mut entry_mut.value)
496            }
497        } else {
498            None
499        }
500    }
501
502    /// Inserts a key-value pair with size tracking.
503    ///
504    /// Returns evicted entries, or `None` if no entries were evicted.
505    /// Note: Replacing an existing key does not return the old value.
506    pub(crate) fn put(&mut self, key: K, val: V, size: u64) -> Option<Vec<(K, V)>>
507    where
508        K: Clone,
509    {
510        if size == 0 {
511            return None;
512        }
513
514        // Check if key exists - update existing entry
515        if let Some(&node) = self.map.get(&key) {
516            unsafe {
517                // SAFETY: node comes from our map
518                let entry = (*node).get_value_mut();
519                let old_size = entry.metadata.size;
520                let meta = &mut entry.metadata.algorithm;
521                let old_priority_key = (meta.priority * 1000.0) as u64;
522                let frequency = meta.frequency;
523
524                // Remove from old priority list
525                let list = self.priority_lists.get_mut(&old_priority_key).unwrap();
526                let boxed_entry = list.remove(node).unwrap();
527
528                if list.is_empty() {
529                    self.priority_lists.remove(&old_priority_key);
530                }
531
532                let entry_ptr = Box::into_raw(boxed_entry);
533                let cache_entry = (*entry_ptr).take_value();
534                // Discard old key/value since replacement is not eviction
535                let _ = (cache_entry.key, cache_entry.value);
536                let _ = Box::from_raw(entry_ptr);
537
538                // Update size tracking
539                self.current_size = self.current_size.saturating_sub(old_size);
540                self.current_size += size;
541
542                // Create new entry with updated values but preserved frequency
543                let new_priority = self.calculate_priority(frequency, size);
544                let new_priority_key = (new_priority * 1000.0) as u64;
545
546                let new_entry = CacheEntry::with_algorithm_metadata(
547                    key.clone(),
548                    val,
549                    size,
550                    GdsfMeta::new(frequency, new_priority),
551                );
552
553                let capacity = self.cap();
554                let list = self
555                    .priority_lists
556                    .entry(new_priority_key)
557                    .or_insert_with(|| List::new(capacity));
558
559                if let Some(new_node) = list.add(new_entry) {
560                    self.map.insert(key, new_node);
561                    self.metrics.core.record_size_change(old_size, size);
562                    self.metrics.core.bytes_written_to_cache += size;
563                    // Replacement is not eviction - don't return the old value
564                    return None;
565                } else {
566                    self.map.remove(&key);
567                    return None;
568                }
569            }
570        }
571
572        // New entry - check capacity and size limits
573        let capacity = self.config.capacity.get();
574        let max_size = self.config.max_size;
575
576        let mut evicted = Vec::new();
577        while self.len() >= capacity
578            || (self.current_size + size > max_size && !self.map.is_empty())
579        {
580            if let Some(entry) = self.evict() {
581                self.metrics.core.evictions += 1;
582                evicted.push(entry);
583            } else {
584                break;
585            }
586        }
587
588        let priority = self.calculate_priority(1, size);
589        let priority_key = (priority * 1000.0) as u64;
590
591        let cap = self.config.capacity;
592        let list = self
593            .priority_lists
594            .entry(priority_key)
595            .or_insert_with(|| List::new(cap));
596
597        let cache_entry =
598            CacheEntry::with_algorithm_metadata(key.clone(), val, size, GdsfMeta::new(1, priority));
599
600        if let Some(node) = list.add(cache_entry) {
601            self.map.insert(key, node);
602            self.current_size += size;
603
604            if self.len() == 1 || priority < self.min_priority {
605                self.min_priority = priority;
606            }
607
608            self.metrics.core.record_insertion(size);
609            self.metrics
610                .record_item_cached(size, self.metrics.average_item_size());
611            self.metrics.record_item_access(1, size, priority);
612        }
613
614        if evicted.is_empty() {
615            None
616        } else {
617            Some(evicted)
618        }
619    }
620
621    /// Removes and returns the eviction candidate (lowest priority entry).
622    ///
623    /// Also updates the global age to the evicted item's priority (GDSF aging).
624    ///
625    /// This method does **not** increment the eviction counter in metrics.
626    /// Eviction metrics are only recorded when the cache internally evicts
627    /// entries to make room during `put()` operations.
628    ///
629    /// Returns `None` if the cache is empty.
630    fn evict(&mut self) -> Option<(K, V)> {
631        if self.is_empty() {
632            return None;
633        }
634
635        let min_priority_key = *self.priority_lists.keys().next()?;
636        let list = self.priority_lists.get_mut(&min_priority_key)?;
637        let entry = list.remove_last()?;
638
639        unsafe {
640            // SAFETY: take_value moves the CacheEntry out by value.
641            // Box::from_raw frees memory (MaybeUninit won't double-drop).
642            let entry_ptr = Box::into_raw(entry);
643            let cache_entry = (*entry_ptr).take_value();
644            let evicted_size = cache_entry.metadata.size;
645            let priority_to_update = cache_entry.metadata.algorithm.priority;
646
647            // Update global age to the evicted item's priority (GDSF aging)
648            self.global_age = priority_to_update;
649            self.metrics.core.record_removal(evicted_size);
650            self.metrics.record_size_based_eviction();
651            self.metrics.record_aging_event(priority_to_update);
652
653            self.map.remove(&cache_entry.key);
654            self.current_size = self.current_size.saturating_sub(evicted_size);
655
656            // Remove empty priority list
657            if list.is_empty() {
658                self.priority_lists.remove(&min_priority_key);
659            }
660
661            let _ = Box::from_raw(entry_ptr);
662            Some((cache_entry.key, cache_entry.value))
663        }
664    }
665
666    pub(crate) fn remove<Q>(&mut self, key: &Q) -> Option<V>
667    where
668        K: Borrow<Q>,
669        Q: ?Sized + Hash + Eq,
670    {
671        if let Some(node) = self.map.remove(key) {
672            unsafe {
673                // SAFETY: node comes from our map
674                // Read priority before removal — needed to find the correct priority list
675                let priority = (*node).get_value().metadata.algorithm.priority;
676                let priority_key = (priority * 1000.0) as u64;
677
678                let list = self.priority_lists.get_mut(&priority_key).unwrap();
679                let boxed_entry = list.remove(node).unwrap();
680
681                if list.is_empty() {
682                    self.priority_lists.remove(&priority_key);
683                }
684
685                let entry_ptr = Box::into_raw(boxed_entry);
686                let cache_entry = (*entry_ptr).take_value();
687                let removed_size = cache_entry.metadata.size;
688                self.current_size = self.current_size.saturating_sub(removed_size);
689                self.metrics.core.record_removal(removed_size);
690                let _ = Box::from_raw(entry_ptr);
691
692                Some(cache_entry.value)
693            }
694        } else {
695            None
696        }
697    }
698
699    pub(crate) fn clear(&mut self) {
700        self.map.clear();
701        self.priority_lists.clear();
702        self.global_age = 0.0;
703        self.min_priority = 0.0;
704        self.current_size = 0;
705    }
706
707    /// Check if key exists without updating its priority or access metadata.
708    #[inline]
709    pub(crate) fn contains<Q>(&self, key: &Q) -> bool
710    where
711        K: Borrow<Q>,
712        Q: ?Sized + Hash + Eq,
713    {
714        self.map.contains_key(key)
715    }
716
717    /// Returns a reference to the value without updating priority or access metadata.
718    ///
719    /// Unlike `get()`, this method does NOT recalculate the entry's GDSF priority
720    /// or change its position in the priority lists.
721    pub(crate) fn peek<Q>(&self, key: &Q) -> Option<&V>
722    where
723        K: Borrow<Q>,
724        Q: ?Sized + Hash + Eq,
725    {
726        let &node = self.map.get(key)?;
727        unsafe {
728            // SAFETY: node comes from our map, so it's a valid pointer
729            let entry = (*node).get_value();
730            Some(&entry.value)
731        }
732    }
733}
734
735impl<K, V, S> core::fmt::Debug for GdsfSegment<K, V, S> {
736    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
737        f.debug_struct("GdsfSegment")
738            .field("capacity", &self.config.capacity)
739            .field("len", &self.map.len())
740            .field("global_age", &self.global_age)
741            .finish()
742    }
743}
744
745/// An implementation of a Greedy Dual-Size Frequency (GDSF) cache.
746#[derive(Debug)]
747pub struct GdsfCache<K, V, S = DefaultHashBuilder> {
748    segment: GdsfSegment<K, V, S>,
749}
750
751impl<K: Hash + Eq, V: Clone, S: BuildHasher> GdsfCache<K, V, S> {
752    #[inline]
753    pub fn cap(&self) -> NonZeroUsize {
754        self.segment.cap()
755    }
756
757    #[inline]
758    pub fn len(&self) -> usize {
759        self.segment.len()
760    }
761
762    #[inline]
763    pub fn is_empty(&self) -> bool {
764        self.segment.is_empty()
765    }
766
767    /// Returns the current total size of cached content.
768    #[inline]
769    pub fn current_size(&self) -> u64 {
770        self.segment.current_size()
771    }
772
773    /// Returns the maximum content size the cache can hold.
774    #[inline]
775    pub fn max_size(&self) -> u64 {
776        self.segment.max_size()
777    }
778
779    #[inline]
780    pub fn global_age(&self) -> f64 {
781        self.segment.global_age()
782    }
783
784    #[inline]
785    pub fn record_miss(&mut self, object_size: u64) {
786        self.segment.record_miss(object_size);
787    }
788
789    #[inline]
790    pub fn get<Q>(&mut self, key: &Q) -> Option<V>
791    where
792        K: Borrow<Q> + Clone,
793        Q: ?Sized + Hash + Eq,
794    {
795        self.segment.get(key)
796    }
797
798    #[inline]
799    pub fn get_mut<Q>(&mut self, key: &Q) -> Option<&mut V>
800    where
801        K: Borrow<Q> + Clone,
802        Q: ?Sized + Hash + Eq,
803    {
804        self.segment.get_mut(key)
805    }
806
807    /// Inserts a key-value pair with explicit size into the cache.
808    ///
809    /// GDSF is inherently size-aware: the `size` parameter affects priority
810    /// calculation (`priority = global_age + frequency / size`), favoring
811    /// small, frequently-accessed items.
812    ///
813    /// # Arguments
814    ///
815    /// * `key` - The key to insert
816    /// * `val` - The value to insert
817    /// * `size` - Optional size in bytes. Use `SIZE_UNIT` (1) for count-based caching.
818    ///
819    /// # Returns
820    ///
821    /// - `Some(vec)` containing evicted entries (not replaced entries)
822    /// - `None` if no entries were evicted (zero allocation)
823    #[inline]
824    pub fn put(&mut self, key: K, val: V, size: u64) -> Option<Vec<(K, V)>>
825    where
826        K: Clone,
827    {
828        self.segment.put(key, val, size)
829    }
830
831    #[inline]
832    pub fn clear(&mut self) {
833        self.segment.clear()
834    }
835
836    /// Check if key exists without updating its priority or access metadata.
837    ///
838    /// Unlike `get()`, this method does NOT update the entry's frequency
839    /// or access metadata.
840    ///
841    /// # Example
842    ///
843    /// ```
844    /// use cache_rs::GdsfCache;
845    /// use cache_rs::config::GdsfCacheConfig;
846    /// use core::num::NonZeroUsize;
847    ///
848    /// let config = GdsfCacheConfig {
849    ///     capacity: NonZeroUsize::new(2).unwrap(),
850    ///     initial_age: 0.0,
851    ///     max_size: u64::MAX,
852    /// };
853    /// let mut cache = GdsfCache::init(config, None);
854    /// cache.put("a", 1, 10);
855    /// cache.put("b", 2, 10);
856    ///
857    /// // contains() does NOT update priority
858    /// assert!(cache.contains(&"a"));
859    /// ```
860    #[inline]
861    pub fn contains<Q>(&self, key: &Q) -> bool
862    where
863        K: Borrow<Q>,
864        Q: ?Sized + Hash + Eq,
865    {
866        self.segment.contains(key)
867    }
868
869    /// Returns a reference to the value without updating priority or access metadata.
870    ///
871    /// Unlike [`get()`](Self::get), this does NOT recalculate the entry's GDSF
872    /// priority or change its position in the priority lists.
873    ///
874    /// # Example
875    ///
876    /// ```
877    /// use cache_rs::GdsfCache;
878    /// use cache_rs::config::GdsfCacheConfig;
879    /// use core::num::NonZeroUsize;
880    ///
881    /// let config = GdsfCacheConfig {
882    ///     capacity: NonZeroUsize::new(3).unwrap(),
883    ///     initial_age: 0.0,
884    ///     max_size: u64::MAX,
885    /// };
886    /// let mut cache = GdsfCache::init(config, None);
887    /// cache.put("a", 1, 1);
888    ///
889    /// // peek does not change priority
890    /// assert_eq!(cache.peek(&"a"), Some(&1));
891    /// assert_eq!(cache.peek(&"missing"), None);
892    /// ```
893    #[inline]
894    pub fn peek<Q>(&self, key: &Q) -> Option<&V>
895    where
896        K: Borrow<Q>,
897        Q: ?Sized + Hash + Eq,
898    {
899        self.segment.peek(key)
900    }
901
902    /// Removes a key from the cache, returning the value if the key was present.
903    #[inline]
904    pub fn remove<Q>(&mut self, key: &Q) -> Option<V>
905    where
906        K: Borrow<Q>,
907        Q: ?Sized + Hash + Eq,
908    {
909        self.segment.remove(key)
910    }
911}
912
913impl<K: Hash + Eq, V: Clone, S: BuildHasher> CacheMetrics for GdsfCache<K, V, S> {
914    fn metrics(&self) -> BTreeMap<String, f64> {
915        self.segment.metrics().metrics()
916    }
917
918    fn algorithm_name(&self) -> &'static str {
919        self.segment.metrics().algorithm_name()
920    }
921}
922
923impl<K: Hash + Eq, V: Clone> GdsfCache<K, V, DefaultHashBuilder> {
924    /// Creates a new GDSF cache from a configuration.
925    ///
926    /// This is the **recommended** way to create a GDSF cache. All configuration
927    /// is specified through the [`GdsfCacheConfig`] struct.
928    ///
929    /// # Arguments
930    ///
931    /// * `config` - Configuration specifying capacity and optional size limit/initial age
932    /// * `hasher` - Optional custom hash builder. Pass `None` to use the default.
933    ///
934    /// # Example
935    ///
936    /// ```
937    /// use cache_rs::GdsfCache;
938    /// use cache_rs::config::GdsfCacheConfig;
939    /// use core::num::NonZeroUsize;
940    ///
941    /// // Simple capacity-only cache
942    /// let config = GdsfCacheConfig {
943    ///     capacity: NonZeroUsize::new(100).unwrap(),
944    ///     initial_age: 0.0,
945    ///     max_size: u64::MAX,
946    /// };
947    /// let mut cache: GdsfCache<&str, i32> = GdsfCache::init(config, None);
948    /// cache.put("key", 42, 1);
949    ///
950    /// // Cache with size limit (recommended for GDSF)
951    /// let config = GdsfCacheConfig {
952    ///     capacity: NonZeroUsize::new(1000).unwrap(),
953    ///     initial_age: 0.0,
954    ///     max_size: 10 * 1024 * 1024,  // 10MB
955    /// };
956    /// let cache: GdsfCache<String, Vec<u8>> = GdsfCache::init(config, None);
957    /// ```
958    pub fn init(config: GdsfCacheConfig, hasher: Option<DefaultHashBuilder>) -> Self {
959        GdsfCache {
960            segment: GdsfSegment::init(config, hasher.unwrap_or_default()),
961        }
962    }
963}
964
965#[cfg(test)]
966mod tests {
967    use super::*;
968    use crate::config::GdsfCacheConfig;
969    use core::num::NonZeroUsize;
970
971    /// Helper to create a GdsfCache with the given capacity
972    fn make_cache<K: Hash + Eq + Clone, V: Clone>(cap: usize) -> GdsfCache<K, V> {
973        let config = GdsfCacheConfig {
974            capacity: NonZeroUsize::new(cap).unwrap(),
975            initial_age: 0.0,
976            max_size: u64::MAX,
977        };
978        GdsfCache::init(config, None)
979    }
980
981    #[test]
982    fn test_gdsf_basic_operations() {
983        let mut cache = make_cache(3);
984
985        assert_eq!(cache.put("a", 1, 1), None);
986        assert_eq!(cache.put("b", 2, 2), None);
987        assert_eq!(cache.put("c", 3, 1), None);
988        assert_eq!(cache.len(), 3);
989
990        assert_eq!(cache.get(&"a"), Some(1));
991        assert_eq!(cache.get(&"b"), Some(2));
992        assert_eq!(cache.get(&"c"), Some(3));
993
994        assert!(cache.contains(&"a"));
995        assert!(!cache.contains(&"d"));
996    }
997
998    #[test]
999    fn test_gdsf_frequency_priority() {
1000        let mut cache = make_cache(2);
1001
1002        cache.put("a", 1, 1);
1003        cache.put("b", 2, 1);
1004
1005        cache.get(&"a");
1006        cache.get(&"a");
1007
1008        cache.put("c", 3, 1);
1009
1010        assert!(cache.contains(&"a"));
1011        assert!(!cache.contains(&"b"));
1012        assert!(cache.contains(&"c"));
1013    }
1014
1015    #[test]
1016    fn test_gdsf_size_consideration() {
1017        let mut cache = make_cache(2);
1018
1019        cache.put("small", 1, 1);
1020        cache.put("large", 2, 10);
1021
1022        cache.put("medium", 3, 5);
1023
1024        assert!(cache.contains(&"small"));
1025        assert!(!cache.contains(&"large"));
1026        assert!(cache.contains(&"medium"));
1027    }
1028
1029    #[test]
1030    fn test_gdsf_update_existing() {
1031        let mut cache = make_cache(2);
1032
1033        cache.put("key", 1, 1);
1034        assert_eq!(cache.get(&"key"), Some(1));
1035
1036        // Replacement returns None (not eviction)
1037        assert!(cache.put("key", 2, 2).is_none());
1038        assert_eq!(cache.get(&"key"), Some(2));
1039        assert_eq!(cache.len(), 1);
1040    }
1041
1042    #[test]
1043    fn test_gdsf_zero_size_rejection() {
1044        let mut cache = make_cache(2);
1045
1046        assert_eq!(cache.put("key", 1, 0), None);
1047        assert_eq!(cache.len(), 0);
1048        assert!(!cache.contains(&"key"));
1049    }
1050
1051    #[test]
1052    fn test_gdsf_remove_by_key_legacy() {
1053        let mut cache = make_cache(2);
1054
1055        cache.put("a", 1, 1);
1056        cache.put("b", 2, 1);
1057
1058        assert_eq!(cache.remove(&"a"), Some(1));
1059        assert_eq!(cache.len(), 1);
1060        assert!(!cache.contains(&"a"));
1061        assert!(cache.contains(&"b"));
1062
1063        assert_eq!(cache.remove(&"nonexistent"), None);
1064    }
1065
1066    #[test]
1067    fn test_gdsf_clear() {
1068        let mut cache = make_cache(2);
1069
1070        cache.put("a", 1, 1);
1071        cache.put("b", 2, 1);
1072        assert_eq!(cache.len(), 2);
1073
1074        cache.clear();
1075        assert_eq!(cache.len(), 0);
1076        assert!(cache.is_empty());
1077        assert!(!cache.contains(&"a"));
1078        assert!(!cache.contains(&"b"));
1079    }
1080
1081    #[test]
1082    fn test_gdsf_global_aging() {
1083        let mut cache = make_cache(2);
1084
1085        cache.put("a", 1, 1);
1086        cache.put("b", 2, 1);
1087
1088        let initial_age = cache.global_age();
1089
1090        cache.put("c", 3, 1);
1091
1092        assert!(cache.global_age() > initial_age);
1093    }
1094
1095    #[test]
1096    fn test_miri_stacked_borrows_fix() {
1097        let mut cache = make_cache(10);
1098
1099        cache.put("a", 1, 10);
1100        cache.put("b", 2, 20);
1101        cache.put("c", 3, 15);
1102
1103        for _ in 0..3 {
1104            assert_eq!(cache.get(&"a"), Some(1));
1105            assert_eq!(cache.get(&"b"), Some(2));
1106            assert_eq!(cache.get(&"c"), Some(3));
1107        }
1108
1109        assert_eq!(cache.len(), 3);
1110
1111        if let Some(val) = cache.get_mut(&"a") {
1112            *val += 10;
1113        }
1114        assert_eq!(cache.get(&"a"), Some(11));
1115    }
1116
1117    #[test]
1118    fn test_gdsf_segment_directly() {
1119        let config = GdsfCacheConfig {
1120            capacity: NonZeroUsize::new(2).unwrap(),
1121            initial_age: 0.0,
1122            max_size: u64::MAX,
1123        };
1124        let mut segment: GdsfSegment<&str, i32, DefaultHashBuilder> =
1125            GdsfSegment::init(config, DefaultHashBuilder::default());
1126        assert_eq!(segment.len(), 0);
1127        assert!(segment.is_empty());
1128        assert_eq!(segment.cap().get(), 2);
1129        segment.put("a", 1, 1);
1130        segment.put("b", 2, 2);
1131        assert_eq!(segment.len(), 2);
1132        assert_eq!(segment.get(&"a"), Some(1));
1133        assert_eq!(segment.get(&"b"), Some(2));
1134    }
1135
1136    #[test]
1137    fn test_gdsf_concurrent_access() {
1138        extern crate std;
1139        use std::sync::{Arc, Mutex};
1140        use std::thread;
1141        use std::vec::Vec;
1142
1143        let cache = Arc::new(Mutex::new(make_cache::<String, i32>(100)));
1144        let num_threads = 4;
1145        let ops_per_thread = 100;
1146
1147        let mut handles: Vec<std::thread::JoinHandle<()>> = Vec::new();
1148
1149        for t in 0..num_threads {
1150            let cache = Arc::clone(&cache);
1151            handles.push(thread::spawn(move || {
1152                for i in 0..ops_per_thread {
1153                    let key = std::format!("key_{}_{}", t, i);
1154                    let size = ((i % 10) + 1) as u64; // Varying sizes 1-10
1155                    let mut guard = cache.lock().unwrap();
1156                    guard.put(key.clone(), i, size);
1157                    let _ = guard.get(&key);
1158                }
1159            }));
1160        }
1161
1162        for handle in handles {
1163            handle.join().unwrap();
1164        }
1165
1166        let mut guard = cache.lock().unwrap();
1167        assert!(guard.len() <= 100);
1168        guard.clear(); // Clean up for MIRI
1169    }
1170
1171    #[test]
1172    fn test_gdsf_size_aware_tracking() {
1173        let mut cache = make_cache(10);
1174
1175        assert_eq!(cache.current_size(), 0);
1176        assert_eq!(cache.max_size(), u64::MAX);
1177
1178        // GDSF already requires size in put()
1179        cache.put("a", 1, 100);
1180        cache.put("b", 2, 200);
1181        cache.put("c", 3, 150);
1182
1183        assert_eq!(cache.current_size(), 450);
1184        assert_eq!(cache.len(), 3);
1185
1186        // GDSF doesn't have remove method, test clear instead
1187        cache.clear();
1188        assert_eq!(cache.current_size(), 0);
1189    }
1190
1191    #[test]
1192    fn test_gdsf_init_constructor() {
1193        let config = GdsfCacheConfig {
1194            capacity: NonZeroUsize::new(1000).unwrap(),
1195            initial_age: 0.0,
1196            max_size: 1024 * 1024,
1197        };
1198        let cache: GdsfCache<String, i32> = GdsfCache::init(config, None);
1199
1200        assert_eq!(cache.current_size(), 0);
1201        assert_eq!(cache.max_size(), 1024 * 1024);
1202    }
1203
1204    #[test]
1205    fn test_gdsf_with_limits_constructor() {
1206        let config = GdsfCacheConfig {
1207            capacity: NonZeroUsize::new(100).unwrap(),
1208            initial_age: 0.0,
1209            max_size: 1024 * 1024,
1210        };
1211        let cache: GdsfCache<String, String> = GdsfCache::init(config, None);
1212
1213        assert_eq!(cache.current_size(), 0);
1214        assert_eq!(cache.max_size(), 1024 * 1024);
1215        assert_eq!(cache.cap().get(), 100);
1216    }
1217
1218    #[test]
1219    fn test_gdsf_contains_non_promoting() {
1220        let mut cache = make_cache(2);
1221        cache.put("a", 1, 10);
1222        cache.put("b", 2, 10);
1223
1224        // contains() should return true for existing keys
1225        assert!(cache.contains(&"a"));
1226        assert!(cache.contains(&"b"));
1227        assert!(!cache.contains(&"c"));
1228
1229        // Access "b" to increase its priority
1230        cache.get(&"b");
1231
1232        // contains() should NOT increase priority of "a"
1233        assert!(cache.contains(&"a"));
1234
1235        // Adding "c" should evict "a" (lowest priority), not "b"
1236        cache.put("c", 3, 10);
1237        assert!(!cache.contains(&"a")); // "a" was evicted
1238        assert!(cache.contains(&"b")); // "b" still exists
1239        assert!(cache.contains(&"c")); // "c" was just added
1240    }
1241
1242    #[test]
1243    fn test_put_returns_none_when_no_eviction() {
1244        let mut cache = make_cache(10);
1245        assert!(cache.put("a", 1, 10).is_none());
1246        assert!(cache.put("b", 2, 10).is_none());
1247    }
1248
1249    #[test]
1250    fn test_put_returns_single_eviction() {
1251        let mut cache = make_cache(2);
1252        cache.put("a", 1, 10);
1253        cache.put("b", 2, 10);
1254        let result = cache.put("c", 3, 10);
1255        assert!(result.is_some());
1256        let evicted = result.unwrap();
1257        assert_eq!(evicted.len(), 1);
1258    }
1259
1260    #[test]
1261    fn test_put_replacement_returns_none() {
1262        let mut cache = make_cache(10);
1263        cache.put("a", 1, 10);
1264        // Replacement is not eviction - returns None
1265        let result = cache.put("a", 2, 10);
1266        assert!(result.is_none());
1267        // Value should be updated
1268        assert_eq!(cache.get(&"a"), Some(2));
1269    }
1270
1271    #[test]
1272    fn test_put_returns_multiple_evictions_size_based() {
1273        let config = GdsfCacheConfig {
1274            capacity: NonZeroUsize::new(10).unwrap(),
1275            initial_age: 0.0,
1276            max_size: 100,
1277        };
1278        let mut cache = GdsfCache::init(config, None);
1279        for i in 0..10 {
1280            cache.put(i, i, 10);
1281        }
1282        let result = cache.put(99, 99, 50);
1283        let evicted = result.unwrap();
1284        assert_eq!(evicted.len(), 5);
1285    }
1286
1287    #[test]
1288    fn test_gdsf_remove_by_key() {
1289        let mut cache = make_cache(2);
1290        cache.put("a", 1, 10);
1291        cache.put("b", 2, 10);
1292
1293        // remove() works
1294        assert_eq!(cache.remove(&"a"), Some(1));
1295        assert_eq!(cache.len(), 1);
1296        assert!(!cache.contains(&"a"));
1297        assert!(cache.contains(&"b"));
1298
1299        assert_eq!(cache.remove(&"nonexistent"), None);
1300    }
1301}