Crate moka

Moka is a fast, concurrent cache library for Rust. Moka is inspired by the Caffeine library for Java.

Moka provides in-memory concurrent cache implementations on top of hash maps. They support full concurrency of retrievals and a high expected concurrency for updates. They utilize a lock-free concurrent hash table as the central key-value storage.

Moka also provides an in-memory, non-thread-safe cache implementation for single-threaded applications.

All cache implementations perform a best-effort bounding of the map using an entry replacement algorithm to determine which entries to evict when the capacity is exceeded.


  • Thread-safe, highly concurrent in-memory cache implementations:
    • Synchronous caches that can be shared across OS threads.
    • An asynchronous (futures aware) cache that can be accessed inside and outside of asynchronous contexts.
  • A cache can be bounded by one of the following:
    • The maximum number of entries.
    • The total weighted size of entries (size-aware eviction).
  • Maintains a good hit rate by using entry replacement algorithms inspired by Caffeine:
    • Admission to a cache is controlled by the Least Frequently Used (LFU) policy.
    • Eviction from a cache is controlled by the Least Recently Used (LRU) policy.
  • Supports expiration policies:
    • Time to live
    • Time to idle
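
The two kinds of capacity bound listed above can be illustrated with a small std-only sketch. It does not use Moka's API: the `Bound` enum, `is_over_capacity`, and the `weigh_by_len` weigher are hypothetical names introduced only to show how an entry-count bound differs from a weighted-size bound.

```rust
/// Hypothetical capacity bound (illustration only, not a Moka type).
enum Bound {
    /// Limit on the number of entries.
    MaxEntries(u64),
    /// Limit on the sum of per-entry weights.
    MaxWeightedSize(u64),
}

/// Example weigher (an assumption): an entry weighs as much as its value's length.
fn weigh_by_len(_key: &str, value: &String) -> u32 {
    value.len() as u32
}

/// Returns true when the entries exceed the bound, i.e. when an
/// entry replacement algorithm would have to pick something to evict.
fn is_over_capacity<V>(
    entries: &[(String, V)],
    bound: &Bound,
    weigher: impl Fn(&str, &V) -> u32,
) -> bool {
    match bound {
        Bound::MaxEntries(max) => (entries.len() as u64) > *max,
        Bound::MaxWeightedSize(max) => {
            let total: u64 = entries
                .iter()
                .map(|(k, v)| u64::from(weigher(k.as_str(), v)))
                .sum();
            total > *max
        }
    }
}
```

With a weighted bound, two entries can exceed the capacity even though an entry-count bound of two would accept them.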


See the following document:

Minimum Supported Rust Versions

This crate’s minimum supported Rust versions (MSRV) are as follows:

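| Feature    | Enabled by default? | MSRV        |
|------------|---------------------|-------------|
| no feature |                     | Rust 1.51.0 |
| atomic64   | yes                 | Rust 1.51.0 |
| quanta     | yes                 | Rust 1.51.0 |
| future     |                     | Rust 1.51.0 |
| dash       |                     | Rust 1.51.0 |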

If only the default features are enabled, MSRV will be updated conservatively. When using other features, like future, MSRV might be updated more frequently, up to the latest stable. In both cases, increasing MSRV is not considered a semver-breaking change.

Implementation Details


In a concurrent cache (sync or future cache), the entry replacement algorithms are kept eventually consistent with the map. While updates to the cache are immediately applied to the map, recording of reads and writes may not be immediately reflected on the cache policy’s data structures.

These structures are guarded by a lock and operations are applied in batches to avoid lock contention. There are bounded inter-thread channels to hold these operations. These channels are drained at the first opportunity when:

  • The number of read/write recordings reaches the configured amount.
  • Or, a certain amount of time has passed since the last draining.
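
The two drain conditions above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Moka's internals: `RecordingBuffer`, its fields, and the threshold values are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```rust
use std::time::{Duration, Instant};

/// Hypothetical buffer of pending read/write recordings.
struct RecordingBuffer {
    pending: Vec<u64>,      // recorded key hashes, oldest first
    max_pending: usize,     // drain once this many recordings are queued
    max_interval: Duration, // or once this much time has passed
    last_drain: Instant,
}

impl RecordingBuffer {
    fn new(max_pending: usize, max_interval: Duration, now: Instant) -> Self {
        Self { pending: Vec::new(), max_pending, max_interval, last_drain: now }
    }

    /// Queues one recording; the policy structures are not touched yet.
    fn record(&mut self, key_hash: u64) {
        self.pending.push(key_hash);
    }

    /// True when either drain condition holds.
    fn should_drain(&self, now: Instant) -> bool {
        self.pending.len() >= self.max_pending
            || now.saturating_duration_since(self.last_drain) >= self.max_interval
    }

    /// Hands the whole batch to the caller, which would apply it to the
    /// policy structures while holding their lock once.
    fn drain(&mut self, now: Instant) -> Vec<u64> {
        self.last_drain = now;
        std::mem::take(&mut self.pending)
    }
}
```

Applying recordings in one batch means the lock on the policy structures is taken once per drain rather than once per read or write.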

In a Cache, this draining and batch application is handled by a single worker thread. So under heavy concurrent operations from clients, draining may not be able to catch up and the bounded channels can become full.

When the read or write channel becomes full, one of the following will occur:

  • For the read channel, recordings of new reads will be discarded, so that retrievals will never be blocked. This behavior may have some impact on the hit rate of the cache.
  • For the write channel, updates from clients to the cache will be blocked until the draining task catches up.
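
The difference between the two full-channel behaviors can be shown with `std::sync::mpsc::sync_channel`, a bounded channel from the standard library. Whether Moka uses this particular channel type is not stated here; the function names and the capacity of 2 are assumptions made for the sketch.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::thread;

/// Records a read without ever blocking. Returns false when the
/// recording was discarded because the channel was full (or closed).
fn record_read(tx: &SyncSender<u64>, key_hash: u64) -> bool {
    match tx.try_send(key_hash) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) => false,
        Err(TrySendError::Disconnected(_)) => false,
    }
}

/// Records a write. `send` blocks while the channel is full, so the
/// caller waits until the draining side catches up.
fn record_write(tx: &SyncSender<u64>, key_hash: u64) -> bool {
    tx.send(key_hash).is_ok()
}

/// Sends 5 reads into a channel of capacity 2 that nobody drains and
/// returns how many recordings were kept.
fn demo_reads() -> usize {
    let (tx, _rx) = sync_channel::<u64>(2);
    (0..5).filter(|k| record_read(&tx, *k)).count()
}

/// Sends 5 writes into a channel of capacity 2 while a separate thread
/// drains it; every write is eventually delivered, in order.
fn demo_writes() -> Vec<u64> {
    let (tx, rx) = sync_channel::<u64>(2);
    let drainer = thread::spawn(move || rx.iter().collect::<Vec<u64>>());
    for k in 0..5 {
        record_write(&tx, k);
    }
    drop(tx); // closing the channel lets the drainer finish
    drainer.join().unwrap()
}
```

Dropping read recordings costs only some accuracy in the policy data, while dropping write recordings would leave the policy structures out of step with the map, which is presumably why writes wait instead.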

Cache does its best to avoid blocking updates by adjusting the draining interval, but since it has only one worker thread, it cannot always avoid blocking. If this happens very often in your cache (a future release will let you check this through the cache statistics), you may want to switch to SegmentedCache. It has multiple internal cache segments, and each segment has a dedicated draining thread.

Admission and Eviction

Every time a client tries to retrieve an item from the cache, that activity is retained in a historic popularity estimator. This estimator has a tiny memory footprint as it uses hashing to probabilistically estimate an item’s frequency.
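
One common structure that matches this description is a count-min sketch: a few rows of small counters indexed by different hashes of the key. The following is a minimal sketch of that idea, not Moka's actual estimator; `FrequencySketch`, its dimensions, and the use of `DefaultHasher` are assumptions, and refinements such as counter aging are left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical count-min sketch: `depth` rows of `width` saturating counters.
struct FrequencySketch {
    rows: Vec<Vec<u8>>,
    width: usize,
}

impl FrequencySketch {
    fn new(depth: usize, width: usize) -> Self {
        assert!(depth > 0 && width > 0);
        Self { rows: vec![vec![0u8; width]; depth], width }
    }

    /// Picks the counter for `key` in the given row; mixing the row
    /// number into the hash gives each row a different hash function.
    fn index<K: Hash>(&self, key: &K, row: usize) -> usize {
        let mut hasher = DefaultHasher::new();
        row.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.width as u64) as usize
    }

    /// Records one access by incrementing one counter per row.
    fn record<K: Hash>(&mut self, key: &K) {
        for row in 0..self.rows.len() {
            let i = self.index(key, row);
            self.rows[row][i] = self.rows[row][i].saturating_add(1);
        }
    }

    /// Estimates the frequency as the minimum counter across rows.
    /// Collisions can only inflate a counter, so (until a counter
    /// saturates) the estimate never falls below the true count.
    fn estimate<K: Hash>(&self, key: &K) -> u8 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.index(key, row)])
            .min()
            .unwrap_or(0)
    }
}
```

The memory cost is fixed at `depth * width` counters regardless of how many distinct keys are seen, which is where the tiny footprint comes from.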

All caches employ TinyLFU (Least Frequently Used) as the admission policy. When a new entry is inserted into the cache, it is temporarily admitted, and a recording of the insertion is added to the write queue. When the write queue is drained and the main space of the cache is already full, the historic popularity estimator decides which of the following entries to evict:

  • The temporarily admitted entry.
  • Or, an entry that is selected from the main cache space by the LRU (Least Recently Used) eviction policy.
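
The decision between these two entries can be sketched as a comparison of estimated frequencies. This is an illustration only: `choose_eviction` is a hypothetical function, an exact `HashMap` counter stands in for the popularity estimator, and keeping the resident entry on a tie is an assumption, since the tie-breaking rule is not stated here.

```rust
use std::collections::{HashMap, VecDeque};

/// Returns the key to evict once the main space is full, or `None` if
/// the main space holds no entries.
///
/// `main_space_lru` is ordered from least to most recently used, so the
/// front element is the LRU victim.
fn choose_eviction<'a>(
    popularity: &HashMap<&str, u32>,
    main_space_lru: &VecDeque<&'a str>,
    candidate: &'a str,
) -> Option<&'a str> {
    let lru_victim = *main_space_lru.front()?;
    let candidate_freq = popularity.get(candidate).copied().unwrap_or(0);
    let victim_freq = popularity.get(lru_victim).copied().unwrap_or(0);
    if candidate_freq > victim_freq {
        // The new entry is more popular: evict the LRU victim.
        Some(lru_victim)
    } else {
        // Assumption: on a tie, the resident entry stays.
        Some(candidate)
    }
}
```

Under this rule, a burst of never-before-seen keys cannot push out entries that have a history of being read.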

In a future release of this crate, the TinyLFU admission policy will be replaced by the Window TinyLFU (W-TinyLFU) policy. W-TinyLFU has an admission window in front of the main space. A new entry starts in the admission window and remains there as long as it has high temporal locality (recency). Eventually an entry will slip out of the window, and then TinyLFU comes into play to determine whether or not to admit the entry to the main space based on its popularity (frequency).


The current release supports the following cache expiration policies:

  • The time-to-live policy
  • The time-to-idle policy

A future release will support the following:

  • The variable expiration (which allows setting a different expiration for each cached entry)

These policies are provided with O(1) time complexity:

  • The time-to-live policy uses a write-order queue.
  • The time-to-idle policy uses an access-order queue.
  • The variable expiration will use a hierarchical timer wheel (*1).

*1: If you get 404 page not found when you click on the link to the hierarchical timer wheel paper, try to change the URL from https: to http:.
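
The reason a write-order queue gives constant-time expiration for the time-to-live policy is that, with a single TTL, entries expire in the order they were written, so only the front of the queue ever needs checking. The following is a minimal single-threaded sketch of that idea, not Moka's implementation; `TtlCache` and its methods are hypothetical, and timestamps are passed in explicitly.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical time-to-live cache backed by a write-order queue.
struct TtlCache {
    ttl: Duration,
    map: HashMap<String, (String, Instant)>,
    write_order: VecDeque<(String, Instant)>, // oldest write at the front
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, map: HashMap::new(), write_order: VecDeque::new() }
    }

    fn insert(&mut self, key: &str, value: &str, now: Instant) {
        self.map.insert(key.to_string(), (value.to_string(), now));
        self.write_order.push_back((key.to_string(), now));
    }

    /// Returns the value unless it is missing or already past its TTL.
    fn get(&self, key: &str, now: Instant) -> Option<&str> {
        self.map
            .get(key)
            .filter(|entry| now.saturating_duration_since(entry.1) < self.ttl)
            .map(|entry| entry.0.as_str())
    }

    /// Pops expired writes from the front of the queue and returns how
    /// many entries were removed. Each queue element is visited once,
    /// so the cost per entry is amortized O(1).
    fn evict_expired(&mut self, now: Instant) -> usize {
        let mut evicted = 0;
        while let Some((_, written_at)) = self.write_order.front() {
            if now.saturating_duration_since(*written_at) < self.ttl {
                break; // everything behind the front is newer
            }
            let (key, written_at) = self.write_order.pop_front().expect("front exists");
            // Skip stale queue items left behind by a later rewrite of the key.
            if self.map.get(&key).map(|entry| entry.1) == Some(written_at) {
                self.map.remove(&key);
                evicted += 1;
            }
        }
        evicted
    }
}
```

A time-to-idle policy follows the same pattern with a queue ordered by last access instead of last write, which additionally requires moving an entry to the back of the queue on each read.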



Experimental: Provides a thread-safe, concurrent cache implementation built upon dashmap::DashMap.


Provides a thread-safe, concurrent asynchronous (futures aware) cache implementation.

Provides thread-safe, concurrent cache implementations.

Provides a non-thread-safe cache implementation built upon std::collections::HashMap.


The policy of a cache.


The error type for the functionality around the Cache#invalidate_entries_if method.