// wb_cache/lib.rs

//! # wb-cache
//!
//! Generic write-behind caching solution for key-indexed, record-based storages.
//!
//! Think of it as an L1 cache for backend storage.
//!
//! # What's In It For Me?
//!
//! Perhaps the table below is the best answer:
//!
//! | DB | Plain, sec | Cached, sec | Ratio |
//! | -- | ---------- | ----------- | ----- |
//! | PostgreSQL | 27073.36 | 277.95 | **97.44x** |
//! | SQLite | 137.93 | 14.35 | **9.61x** |
//!
//! The benchmark is implemented by the [`test::simulation`] module. It simulates an e-commerce startup using a
//! semi-realistic model. The outcomes are expected to vary under different conditions. Here is what was used for these
//! two runs:
//!
//! - PostgreSQL 17
//!   * QNAP TVS-h874X, 12th Gen Intel® Core™ i9-12900E, 64GB RAM, Docker container on Samsung SSD 990 EVO 2TB
//!   * Local network latency: 3ms on average
//! - SQLite
//!   * Apple Mac Studio M2 Ultra, 128GB RAM, SSD AP2048Z
//!
//! # The Basics
//!
//! The `wb-cache` crate is designed for the following use case:
//!
//! - Key-indexed, record-based storage; e.g., database tables, NoSQL databases, and even external in-memory caches.
//! - Unsatisfactory latency in data exchange operations.
//! - Batching write operations to storage is beneficial for performance.
//!
//! The cache operates on the following principles:
//!
//! - It is backend-agnostic.
//! - It is key- and value-agnostic.
//! - Keys are classified as "primary" and "secondary," and each record can have zero, one, or many secondary keys.
//! - It is implemented as a controller over the [moka](https://crates.io/crates/moka) cache.
//! - It is fully async.
//! - As an "L1" cache, it doesn't support distributed caching.
//!
//! The cache's principal model is as follows:
//!
//! ![](https://raw.githubusercontent.com/vrurg/wb-cache/bbec97b39af581a4b31ad44e2a90ea3e010a5a45/docs/operations.svg)
//!
//! For simplicity, the cache controller is often referred to simply as "cache," unless it is necessary
//! to distinguish it from the inner cache object.
//!
//! # Data Records And Keys
//!
//! The cache controller operates on data records identified by keys. There must be one primary key and zero or more
//! secondary keys. The inner cache object may contain multiple entries for the same data record: one entry is the
//! primary key entry, which holds the record itself, while the secondary entries serve as references to the primary key
//! entry.
//!
//! This means that the cache's maximum capacity doesn't necessarily equal the maximum number of records that can be
//! held in the cache.
//!
//! # Cache Controller Components
//!
//! ![](https://raw.githubusercontent.com/vrurg/wb-cache/8164e6cbc9da18d95f20b07c1fe1e5e147eecaf4/docs/cache_controller.svg)
//!
//! The diagram above illustrates the main components of the cache controller. Here is the purpose of each of them in a few words:
//!
//! - Data Controller – see the section below for more details.
//! - Updates Pool – the data controller produces update records that are stored in the pool until they are flushed to the backend.
//! - Cache – the actual [moka](https://crates.io/crates/moka) cache object.
//! - Monitoring Task – a background task that monitors the cache and auto-flushes updates to the backend.
//! - Observers – a list of user-registered objects that listen to cache events.
//!
//! ## The Cache Object
//!
//! There is not much to say about the cache object itself. The only two parameters of the cache controller that are
//! used to configure the inner cache are the maximum capacity and the cache name. The object is set to use the TinyLFU
//! eviction policy.
//!
//! ## The Updates Pool
//!
//! The cache controller is totally agnostic about the content of the updates pool, as its purpose is to store records
//! produced by the data controller. The pool is always indexed by the primary key of the record, meaning that the
//! maximum pool capacity is the actual maximum number of record updates that can be held in the cache.
//!
//! The pool helps to minimize the number of individual writes to the backend by batching them together.
//!
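//! The batching idea can be sketched with plain `std` collections. This is not the crate's actual pool type, only an
//! illustration of how indexing pending updates by primary key coalesces repeated writes:
//!
//! ```
//! use std::collections::HashMap;
//!
//! // Pending updates indexed by primary key: a later update to the same key
//! // replaces the earlier one instead of queuing a second backend write.
//! let mut pool: HashMap<u64, String> = HashMap::new();
//! pool.insert(1, "first version".to_string());
//! pool.insert(1, "second version".to_string()); // coalesced: still one pending write
//! pool.insert(2, "another record".to_string());
//!
//! // A flush submits one write per key, not one per insert/update call.
//! assert_eq!(pool.len(), 2);
//! ```
//!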
//! ## The Monitor Task
//!
//! The monitor task runs in the background and monitors the cache state. If the maximum update pool capacity is
//! reached, if the flush timeout expires, or if the cache is being shut down, the task initiates a batch flush
//! operation.
//!
//! The task is defined by two timing parameters: `monitor_tick_duration` and `flush_interval`. The first specifies the
//! time interval between consecutive checks of the cache state. The second specifies the time after which the task will
//! initiate a flush operation if there is at least one update in the pool. Setting `flush_interval` to `Duration::MAX`
//! effectively disables the timed flush operation, meaning that it will only be triggered by either exceeding the maximum
//! update pool capacity or by the cache shutdown.
//!
//! Note that a manually initiated batch flush resets the flush timeout.
//!
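//! Assuming the builder exposes setters named after these two parameters (a sketch only; check [`cache::CacheBuilder`]
//! for the actual method names), the timing configuration might look like this:
//!
//! ```ignore
//! use std::time::Duration;
//! use wb_cache::Cache;
//!
//! let cache = Cache::builder()
//!     .data_controller(MyDataController)
//!     // Check the cache state every 100ms.
//!     .monitor_tick_duration(Duration::from_millis(100))
//!     // Flush pending updates at most 5 seconds after the first one appears in the pool.
//!     .flush_interval(Duration::from_secs(5))
//!     .build();
//! ```
//!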
//! ## Observers
//!
//! User-registered objects that implement the [`Observer`] trait and listen to cache events.
//!
//! # Data Controller
//!
//! The diagram reflects an important design decision: the primary component of the cache is not the cache itself, but
//! the data controller. This documentation will delve into the technical details of the controller later; for now, it
//! is important to understand that the controller is not only a layer between the backend and the cache but also the
//! type that, through [`DataController`] trait-associated types, defines the following types: `Key`, `Value`,
//! `CacheUpdate`, and `Error`.
//!
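//! As a rough sketch, an implementation starts by picking these associated types (the type names below are
//! illustrative, and the trait's required methods are elided; see the [`DataController`] documentation for the full
//! list):
//!
//! ```ignore
//! use wb_cache::DataController;
//!
//! struct MyDataController;
//!
//! impl DataController for MyDataController {
//!     type Key = u64;              // primary key type
//!     type Value = MyRecord;       // the cached record type
//!     type CacheUpdate = MyUpdate; // what accumulates in the updates pool
//!     type Error = MyError;
//!
//!     // ... the trait's required methods go here ...
//! }
//! ```
//!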
//! The data controller is also responsible for producing update records that can later be used for efficient backend
//! updates. For example, the simulation included with this crate uses SeaORM's ActiveModel to detect differences
//! between a record's previous and updated content. These differences are merged with the existing update state to
//! create a final update record, which will eventually be submitted to the underlying database table — unless the
//! original record has been deleted from the cache, in which case the update record is simply discarded.
//!
//! There is one more role that the data controller plays: it tells the cache what to do with a new or updated
//! data record received from the user. The record can be ignored, inserted into the cache, revoked, or revoked with its
//! corresponding update record dropped as well, which means a full delete. The simulation uses this feature to
//! optimize caching by immediately inserting records when it is known that their content will not change after
//! being sent to the backend. This allows the cache to avoid unnecessary roundtrips to the backend.[^all_in_mem]
//!
//! [^all_in_mem]: In the most extreme case, there is a chance that a newly inserted record may never be written to the
//! backend. This can occur when the record is inserted, then possibly updated, and subsequently deleted quickly enough
//! to avoid auto-flushing.
//!
//! Here are two diagrams illustrating the data flow between the cache and the data controller. The first one covers the
//! simplest case of a new record being inserted into the cache:
//!
//! <a id="on_new"></a>
//! ![](https://raw.githubusercontent.com/vrurg/wb-cache/bbec97b39af581a4b31ad44e2a90ea3e010a5a45/docs/cache_dc_interact_new.svg)
//!
//! Below is the second diagram illustrating a more complex case of processing an update. Note that the diagram assumes that
//! the record requested by the user is already present in the cache. If it is not, the cache would ask the data
//! controller to fetch it from the backend, which is omitted here for simplicity. The scenario where the record is not
//! found in the backend essentially reduces to the case of a new record, so it is not shown here.
//!
//! ![](https://raw.githubusercontent.com/vrurg/wb-cache/bbec97b39af581a4b31ad44e2a90ea3e010a5a45/docs/cache_dc_interact_update.svg)
//!
//! # How Do I Use It?
//!
//! Implement a data controller that conforms to the [`DataController`] trait. Refer to the trait's documentation for
//! complete details on the expected behavior.
//!
//! _The [`test::simulation`] module is provided as an example of using caching in a semi-realistic environment. In
//! particular, the [`test::simulation::db::entity`] module includes SeaORM-based examples that can guide your
//! implementation._
//!
//! Finally, pass an instance of your controller to the [`Cache`
//! builder](cache::CacheBuilder). For instance, if your controller type is `MyDataController`, you can integrate it
//! like this:
//!
//! ```ignore
//! use wb_cache::{Cache, prelude::*};
//! use my_crate::MyDataController;
//!
//! let cache = Cache::builder()
//!     .data_controller(MyDataController)
//!     // ... more parameters if needed ...
//!     .build();
//! ```
//!
//! One may also consider using the [observers](#observers) in application code. This is primarily a
//! front-end rather than a back-end feature, although that is not a strict rule.
pub mod cache;
pub mod entry;
pub mod entry_selector;
pub mod test;
pub mod traits;
pub mod types;
pub mod update_iterator;
pub(crate) mod update_state;

#[doc(inline)]
pub use cache::Cache;
#[doc(inline)]
pub use traits::DataController;
#[doc(inline)]
pub use traits::Observer;

pub mod prelude {
    pub use crate::cache::Cache;
    pub use crate::entry::Entry;
    pub use crate::traits::DataController;
    pub use crate::types::*;
    pub use moka::ops::compute::Op;
}

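/// Builds a [`types::DataControllerResponse`] from an `op` and an `update` expression.
///
/// A hedged usage sketch; the concrete values shown are illustrative, not taken from a real data controller:
///
/// ```ignore
/// use wb_cache::{prelude::*, wbdc_response};
///
/// // Equivalent to constructing `DataControllerResponse { op, update }` by hand.
/// let response = wbdc_response!(Op::Nop, None);
/// ```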
#[macro_export]
macro_rules! wbdc_response {
    ($op:expr, $update:expr) => {
        $crate::types::DataControllerResponse {
            op:     $op,
            update: $update,
        }
    };
}