coordinode-lsm-tree 4.0.0

A K.I.S.S. implementation of log-structured merge trees (LSM-trees/LSMTs) — CoordiNode fork

This is a fork of fjall-rs/lsm-tree, maintained by the Structured World Foundation for the CoordiNode database engine. We contribute patches upstream and maintain additional features needed for CoordiNode (zstd compression, custom sequence number generators, batch get, intra-L0 compaction, security hardening).

[!IMPORTANT] This fork introduces its own disk format version, V4, which acts as a compatibility boundary. V4 is a breaking on-disk change relative to V3 because the fork persists new semantics such as range tombstones and merge operands. Newer code may continue to read supported V3 databases, but databases written with these V4 semantics must not be opened by older V3 binaries.

A K.I.S.S. implementation of log-structured merge trees (LSM-trees/LSMTs) in Rust.

[!NOTE] This crate only provides a primitive LSM-tree, not a full storage engine. For example, it does not ship with a write-ahead log. You probably want to use https://github.com/fjall-rs/fjall instead.

About

This is the most feature-rich LSM-tree implementation in Rust! It features:

  • Thread-safe BTreeMap-like API
  • Mostly safe & 100% stable Rust
  • Block-based tables with compression support & prefix truncation
    • Optional block hash indexes in data blocks for faster point lookups [3]
    • Per-level filter/index block pinning configuration
  • Range & prefix searching with forward and reverse iteration
  • Block caching to keep hot data in memory
  • File descriptor caching with an upper bound to reduce file-open syscalls
  • AMQ filters (currently Bloom filters) to improve point lookup performance
  • Multi-versioning of KVs, enabling snapshot reads
  • Optionally partitioned block index & filters for better cache efficiency [1]
  • Leveled and FIFO compaction
  • Optional key-value separation for large value workloads [2], with automatic garbage collection
  • Single deletion tombstones ("weak" deletion)
  • Optional compaction filters to run custom logic during compactions
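
To make the multi-versioning bullet above concrete, here is a minimal, std-only sketch of how snapshot reads can work over versioned entries. It is not this crate's API or internal layout: `VersionedMap`, `SeqNo`, and the rule that a snapshot sees every entry whose sequence number is less than or equal to its own are all assumptions made for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BTreeMap;

/// Sequence number; a higher value means a newer write (hypothetical alias).
type SeqNo = u64;

/// A stored version: `Some(value)` for a write, `None` for a tombstone.
type Version = Option<Vec<u8>>;

/// Toy multi-version map. Entries sort by key, then by descending seqno,
/// so the newest version of a key comes first.
#[derive(Default)]
struct VersionedMap {
    items: BTreeMap<(Vec<u8>, Reverse<SeqNo>), Version>,
}

impl VersionedMap {
    /// Records a new version of `key` instead of overwriting the old one.
    fn insert(&mut self, key: &[u8], value: &[u8], seqno: SeqNo) {
        self.items
            .insert((key.to_vec(), Reverse(seqno)), Some(value.to_vec()));
    }

    /// Records a tombstone; older versions stay readable by older snapshots.
    fn remove(&mut self, key: &[u8], seqno: SeqNo) {
        self.items.insert((key.to_vec(), Reverse(seqno)), None);
    }

    /// Returns the newest version of `key` with seqno <= `snapshot`
    /// (assumed visibility rule), or `None` if it is absent or deleted.
    fn get(&self, key: &[u8], snapshot: SeqNo) -> Option<&[u8]> {
        let newest_visible = (key.to_vec(), Reverse(snapshot));
        let oldest_possible = (key.to_vec(), Reverse(0));
        self.items
            .range(newest_visible..=oldest_possible)
            .next()
            .and_then(|(_, version)| version.as_deref())
    }
}

fn main() {
    let mut map = VersionedMap::default();
    map.insert(b"a", b"v1", 1);
    map.insert(b"a", b"v2", 3);
    map.remove(b"a", 5);

    // Each snapshot observes the state as of its own sequence number.
    assert_eq!(map.get(b"a", 0), None);
    assert_eq!(map.get(b"a", 2), Some(&b"v1"[..]));
    assert_eq!(map.get(b"a", 4), Some(&b"v2"[..]));
    assert_eq!(map.get(b"a", 5), None);
}
```

Because old versions are kept rather than overwritten, a reader holding an older snapshot is unaffected by later writes and deletions. In a real LSM-tree, compaction eventually drops versions that no live snapshot can see.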

Keys are limited to 65536 bytes, and values are limited to 2^32 bytes. As with any storage engine, larger keys and values have a bigger performance impact.
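
An application may want to validate sizes before handing data to the tree. The following is a rough sketch of such a pre-check using only the limits stated above. The names (`check_lens`, `check_item`, `SizeError`) are hypothetical and not part of this crate, and treating the stated limits as inclusive upper bounds is an assumption.

```rust
/// Limits as stated in this README. Treating them as inclusive
/// upper bounds is an assumption of this sketch.
const MAX_KEY_LEN: u64 = 65_536;
const MAX_VALUE_LEN: u64 = 1 << 32;

/// Hypothetical error type carrying the offending length in bytes.
#[derive(Debug, PartialEq)]
enum SizeError {
    KeyTooLarge(u64),
    ValueTooLarge(u64),
}

/// Checks raw lengths, so large sizes can be tested without allocating.
fn check_lens(key_len: u64, value_len: u64) -> Result<(), SizeError> {
    if key_len > MAX_KEY_LEN {
        return Err(SizeError::KeyTooLarge(key_len));
    }
    if value_len > MAX_VALUE_LEN {
        return Err(SizeError::ValueTooLarge(value_len));
    }
    Ok(())
}

/// Convenience wrapper for an actual key-value pair.
fn check_item(key: &[u8], value: &[u8]) -> Result<(), SizeError> {
    check_lens(key.len() as u64, value.len() as u64)
}

fn main() {
    assert_eq!(check_item(b"user:1", b"alice"), Ok(()));
    assert_eq!(
        check_lens(70_000, 10),
        Err(SizeError::KeyTooLarge(70_000))
    );
}
```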

Feature flags

lz4

Allows using LZ4 compression, powered by lz4_flex.

Disabled by default.

bytes

Uses bytes as the underlying Slice type.

Disabled by default.

Run unit benchmarks

cargo bench --features lz4

Support the Project

USDT (TRC-20): TFDsezHa1cBkoeZT5q2T49Wp66K8t2DmdA

License

All source code is licensed under MIT OR Apache-2.0.

All contributions are to be licensed as MIT OR Apache-2.0.

Original project by fjall-rs. This fork is maintained by Structured World Foundation.

Footnotes

[1] https://rocksdb.org/blog/2017/05/12/partitioned-index-filter.html

[2] https://github.com/facebook/rocksdb/wiki/BlobDB

[3] https://rocksdb.org/blog/2018/08/23/data-block-hash-index.html