Expand description
A K.I.S.S. implementation of log-structured merge trees (LSM-trees/LSMTs).
This crate exports a persistent Tree
that supports a subset of the BTreeMap
API.
LSM-trees are an alternative to B-trees to persist a sorted list of items (e.g. a database table)
on disk and perform fast lookup queries.
Instead of updating a disk-based data structure in-place,
deltas (inserts and deletes) are added into an in-memory write buffer (MemTable
).
To provide durability, the MemTable
is backed by a disk-based journal.
When the MemTable
exceeds a size threshold, it is flushed to a sorted file
on disk (“Segment” a.k.a. “SSTable
”).
Items can then be retrieved by checking and merging results from the MemTable
and Segments.
Amassing many segments on disk will degrade read performance and waste disk space usage, so segments are periodically merged into larger segments in a process called “Compaction”. Different compaction strategies have different advantages and drawbacks, depending on your use case.
Because maintaining an efficient structure is deferred to the compaction process, writing to an LSMT is very fast (O(1) complexity).
Example usage
use lsm_tree::{Tree, Config};
// A tree is a single physical keyspace/index/...
// and supports a BTreeMap-like API
// but all data is persisted to disk.
let tree = Config::new(folder).open()?;
// Note compared to the BTreeMap API, operations return a Result<T>
// So you can handle I/O errors if they occur
tree.insert("my_key", "my_value")?;
let item = tree.get("my_key")?;
assert_eq!(Some("my_value".as_bytes().into()), item);
// Flush to definitely make sure data is persisted
tree.flush()?;
// Search by prefix
for item in &tree.prefix("prefix") {
// ...
}
// Search by range
for item in &tree.range("a"..="z") {
// ...
}
// Iterators implement DoubleEndedIterator, so you can search backwards, too!
for item in tree.prefix("prefix").into_iter().rev() {
// ...
}
Modules
- Contains compaction strategies
Structs
- An atomic write batch
- Block cache, in which blocks are cached in-memory after being retrieved from disk. This speeds up consecutive queries to nearby data, improving read performance for hot data.
- Tree configuration
- A snapshot captures a read-only point-in-time view of the tree at the time the snapshot was created.
- A log-structured merge tree (LSM-tree/LSMT)
Enums
- Error during deserialization
- Represents a single entry in the tree, which may or may not exist.
- Represents errors that can occur in the LSM-tree
- Errors that can occur during journal recovery
- Error during serialization
Type Aliases
- Tree result