In-memory Key/Value store with LRU expire and concurrent access
Description
Items are stored in N sharded/bucketized HashMaps to improve concurrency. Every item is always behind a RwLock. Querying an item returns a guard associated with this lock. Items that are not locked are kept in a list to implement a least-recently-used expire policy. Locked items are removed from that LRU list and put back onto it when they become unlocked. Locked items will not block the hosting HashMap.
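A much-simplified, std-only sketch of that layout is given below; the type names and the fixed bucket count are illustrative only and not the crate's API, and the real implementation adds the LRU list and the hand-over-hand locking on top of this.

```rust
// Simplified sketch: N bucket HashMaps, each value stored in a Box and guarded
// by its own RwLock. (Illustrative only, not the crate's actual types.)
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::{Mutex, RwLock};

const N: usize = 8; // illustrative bucket count

struct Shards<K, V> {
    buckets: [Mutex<HashMap<K, Box<RwLock<V>>>>; N],
}

impl<K: Hash + Eq, V> Shards<K, V> {
    fn new() -> Self {
        Self {
            buckets: std::array::from_fn(|_| Mutex::new(HashMap::new())),
        }
    }

    // The key's hash selects the bucket, so unrelated keys rarely contend
    // on the same HashMap mutex.
    fn bucket(&self, key: &K) -> &Mutex<HashMap<K, Box<RwLock<V>>>> {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        &self.buckets[h.finish() as usize % N]
    }
}
```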
LRU List and expire configuration
Items that are not in use are pushed onto the tail of a least-recently-used list. Whenever a CacheDb decides to expire items, these are taken from the head of the LRU list and dropped.
There are some configuration options to tune the caching behavior. These configurations are all per bucket, not global; thus one has to take the number of buckets into account when configuring the CacheDb. When exact accounting (for example for ‘highwater’) is needed, one must use a single-bucket CacheDb.
It is also important to know that min/max capacity configurations account against the capacity of the underlying containers, not the number of used entries. This allows for better memory utilization after a container got resized, but should be respected so that caching won’t make the containers grow overly large.
The default configuration is chosen to create a rather generic cache that may grow pretty huge and is conservative with expiring entries. With this it should be usable as-is for many use cases. If required, the max_capacity_limit or highwater settings are the first knobs to tune.
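To make the per-bucket accounting concrete, the tiny sketch below only illustrates the arithmetic; the bucket count and limit are hypothetical numbers, not defaults of the crate.

```rust
// Illustrative arithmetic only: because limits are per bucket, the effective
// global limit scales with the number of buckets.
const BUCKETS: usize = 16; // hypothetical bucket count
const HIGHWATER_PER_BUCKET: usize = 1000; // hypothetical per-bucket highwater

fn main() {
    let approx_global_highwater = BUCKETS * HIGHWATER_PER_BUCKET;
    assert_eq!(approx_global_highwater, 16_000);
}
```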
Implementation Discussion
The HashMap stores the items in boxed entries. Entries protect the actual item by a RwLock. The API allows access to items only through these locks, returning wrapped guards thereof. Since concurrent access to the entries shall not block the HashMap, some ‘unsafe’ code is required to implement hand-over-hand locking, which is hidden behind a safe API.
New items are constructed in an atomic way by passing a closure producing the item to the respective lookup function. While an item is constructed it holds a write lock, which ensures that on concurrent construction/queries only one constructor wins and all others will acquire the newly constructed item.
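The following std-only sketch shows the construct-under-a-write-lock idea in isolation: the slot is found or created under a short-lived map lock, construction happens under the slot's write lock, so concurrent lookups for the same key block until the item is ready and only one constructor runs. Arc stands in for the crate's detached guards; none of these names are the crate's API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

fn get_or_construct(
    map: &Mutex<HashMap<String, Arc<RwLock<Option<String>>>>>,
    key: &str,
    construct: impl FnOnce() -> String,
) -> Arc<RwLock<Option<String>>> {
    // Hold the map lock only long enough to find or insert the slot.
    let slot = map
        .lock()
        .unwrap()
        .entry(key.to_string())
        .or_insert_with(|| Arc::new(RwLock::new(None)))
        .clone();

    // Construct outside the map lock; only the first writer finds the slot empty,
    // later writers and readers see the already constructed value.
    let mut item = slot.write().unwrap();
    if item.is_none() {
        *item = Some(construct());
    }
    drop(item);
    slot
}
```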
Proof that no lifetime guarantees are violated
This is actually simple: the returned guard has a Rust lifetime bound to the CacheDb object. Thus no access can outlive the hosting collection.
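The same argument can be seen with plain std types; in the analogy below the RwLock stands in for the CacheDb and the borrow checker rejects dropping it while a guard is alive.

```rust
// Analogy with std::sync::RwLock: a read guard borrows the lock, so the lock
// (here standing in for the CacheDb) cannot be dropped or moved while the
// guard is alive.
use std::sync::RwLock;

fn main() {
    let lock = RwLock::new(42);
    let guard = lock.read().unwrap();
    // drop(lock); // error[E0505]: cannot move out of `lock` because it is borrowed
    println!("{}", *guard);
}
```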
Proof that no data races exist
In most parts the Mutex and RwLock ensure that no data races can happen; this is validated by Rust.
The unsafe part of the implementation detaches a LockGuard from its hosting collection to free the mutex on the HashMap. This could lead to potential UB if the HashMap dropped a value that is still in use/locked. However, this can never happen because there is no way to drop entries in an uncontrolled way. The guard lifetimes are tied to the hosting HashMap and thus can not outlive it. Dropping items from the HashMap is only done from the LRU list, which will never contain locked (and thus in-use) entries. The ‘remove(key)’ member function checks explicitly that an entry is not in use, or delays the removal until all locks on the item are released.
While the HashMap may reallocate its tables and thus move the Boxes containing the entries around, this is not a problem since the lock guards hold references to the heap-allocated entries inside the Boxes, which do not move, not to the table slots holding the Boxes.
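This stability of the boxed allocation can be demonstrated with std types alone (expressing it for the detached guards is what still needs ‘unsafe’ in the crate):

```rust
// Std-only demonstration that a boxed value keeps its heap address even when
// the HashMap grows and reallocates its table.
use std::collections::HashMap;

fn main() {
    let mut map: HashMap<u32, Box<u32>> = HashMap::new();
    map.insert(0, Box::new(0));
    let before = &*map[&0] as *const u32;

    // Force the table to grow and reallocate several times.
    for i in 1..10_000 {
        map.insert(i, Box::new(i));
    }

    let after = &*map[&0] as *const u32;
    assert_eq!(before, after, "the boxed value did not move");
}
```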
Proof that locking is deadlock free
Locks acquired in the same order can never deadlock. Deadlocks happen only when two or more threads wait on a resource while already holding a resource another thread is trying to obtain.
On lookup the HashMap will be locked. When the element is found, the LRU list is locked and the element may be removed from it (when it was not in use). Once done with the LRU list, its lock is released.
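A simplified sketch of that order (not the crate's code): every path takes the bucket's map lock first and the LRU lock second, and releases the LRU lock again before returning, so no cycle between the two locks can form.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

struct Bucket {
    map: Mutex<HashMap<u64, String>>,
    lru: Mutex<VecDeque<u64>>,
}

impl Bucket {
    fn lookup(&self, key: u64) -> Option<String> {
        let map = self.map.lock().unwrap(); // 1st: map lock
        let found = map.get(&key).cloned();
        if found.is_some() {
            let mut lru = self.lru.lock().unwrap(); // 2nd: LRU lock
            lru.retain(|k| *k != key); // unlink the now-in-use element
        } // LRU lock released here
        found
    }
}
```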
It is worth mentioning that code using the CacheDb can still deadlock when it acquires locks in an ill order. The simplest advice is to hold only a single exclusive lock at a time per thread. When this is impractical, one needs to carefully consider the locking order or employ other tactics to counter deadlocks.
TESTS
The ‘test::multithreaded_stress’ test can be controlled by environment variables
- ‘STRESS_THREADS’ sets the number of threads to spawn. Defaults to 10.
- ‘STRESS_WAIT’ threads randomly wait up to this many milliseconds to fake some work. Defaults to 5.
- ‘STRESS_ITERATIONS’ how many iterations each thread shall do. Defaults to 100.
- ‘STRESS_RANGE’ how many unique keys the test uses. Defaults to 1000.
The default values are rather small to make the test suite complete in a short time. For dedicated stress testing, at least STRESS_ITERATIONS and STRESS_THREADS have to be increased significantly. Try ‘STRESS_ITERATIONS=10000 STRESS_RANGE=10000 STRESS_THREADS=10000’ for a harder test.
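The helper below is a hypothetical illustration of how such variables are typically read with defaults; the real test's internals are not reproduced here.

```rust
use std::env;

// Read an environment variable as u64, falling back to a default.
fn env_or(name: &str, default: u64) -> u64 {
    env::var(name)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    let threads = env_or("STRESS_THREADS", 10);
    let wait_ms = env_or("STRESS_WAIT", 5);
    let iterations = env_or("STRESS_ITERATIONS", 100);
    let range = env_or("STRESS_RANGE", 1000);
    println!("{threads} threads x {iterations} iterations, keys 0..{range}, wait up to {wait_ms}ms");
}
```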
Structs
Duration
A Duration type to represent a span of time, typically used for system timeouts.