# RoughDB
Embedded key-value storage written in Rust — a port of LevelDB.
RoughDB is an LSM-tree key-value store with a LevelDB-compatible on-disk format. It supports persistent (WAL + MANIFEST + SSTables) and in-memory operation, multi-level compaction, and bidirectional iteration.
> [!WARNING]
> I started this project "manually", many years ago, and it remains an experiment of sorts. While I'm relatively serious about it, I have more recently also used it to experiment with different things, including generative AI. I'm reviewing all changes manually and carefully.
## Getting started
### Opening a database
```rust
use roughdb::{Db, Options};

let mut opts = Options::default();
opts.create_if_missing = true;
let db = Db::open("/path/to/db", opts)?;
```
Use `Db::default()` for a lightweight in-memory database (no WAL, no flush):

```rust
let db = roughdb::Db::default();
```
### Basic reads and writes
```rust
// Write
db.put(b"key", b"value")?;
db.delete(b"stale-key")?;

// Read
match db.get(b"key")? {
    Some(value) => println!("key = {:?}", value),
    None => println!("key not found"),
}
```
### Atomic batch writes
```rust
use roughdb::WriteBatch;

let mut batch = WriteBatch::new();
batch.put(b"a", b"1");
batch.put(b"b", b"2");
batch.delete(b"c");
db.write(batch)?;
```
### Iterating over keys
Iterators start unpositioned — call `seek_to_first`, `seek_to_last`, or
`seek` before reading.
```rust
use roughdb::ReadOptions;

let mut it = db.new_iterator(ReadOptions::default())?;

// Forward
it.seek_to_first();
while it.valid() {
    println!("{:?} => {:?}", it.key(), it.value());
    it.next();
}

// Backward
it.seek_to_last();
while it.valid() {
    it.prev();
}

// Range scan from "b" onward
it.seek(b"b");
while it.valid() {
    it.next();
}
```
### Snapshots
A snapshot pins the database to a specific point in time; reads through it see only writes that preceded the snapshot.
```rust
use roughdb::ReadOptions;

let snap = db.get_snapshot();

// Reads with this snapshot see the state at the moment it was taken.
let opts = ReadOptions { snapshot: Some(&snap), ..ReadOptions::default() };
let value = db.get_with_options(b"key", &opts)?;
let mut it = db.new_iterator(&opts)?;

// `snap` releases automatically when it goes out of scope.
```
### Manual compaction
```rust
// Compact everything
db.compact_range(None, None)?;

// Compact a specific key range
db.compact_range(Some(b"a"), Some(b"c"))?;
```
## Building and testing
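Assuming a standard Cargo layout (this README states no unusual build requirements), the usual commands should apply:

```shell
cargo build            # compile the library
cargo test             # run the test suite
cargo build --release  # optimised build
```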
## Status
RoughDB is in active development. The on-disk format is LevelDB-compatible.
Implemented:
- WAL + MANIFEST + SSTable flush and crash-safe recovery
- L0→L1 compaction with snapshot-aware pruning and tombstone elision; manual `compact_range`
- Bidirectional iteration (`seek_to_first`, `seek_to_last`, `seek`, `next`, `prev`)
- Snapshots (`get_snapshot`/`release_snapshot`)
- Bloom filter support (`Options::filter_policy`)
- Block compression: Snappy (default) and Zstd (`Options::compression`)
- `get_property` — `leveldb.num-files-at-level<N>`, `leveldb.stats`, `leveldb.sstables`, `leveldb.approximate-memory-usage`
- `get_approximate_sizes` — byte-range estimation via index-block seeks
- `destroy` — safely removes a database directory
- `LOCK` file — prevents concurrent opens by multiple processes
- Table cache — LRU open-file-handle cache bounded by `Options::max_open_files`
- Block cache — LRU byte-capacity cache with per-table IDs; `ReadOptions::fill_cache`
- Batch-grouped writes — concurrent writers are grouped under one WAL record, amortising fsync cost
Partial / known limitations:
- Compaction is L0→L1 only. LevelDB uses a level-score system (L1 target 10 MB, 10× per level) to drive multi-level compaction. RoughDB only compacts when L0 reaches 4 files; L1–L6 can grow unbounded after L0 pressure subsides. `compact_range` works across all levels, but automatic background compaction does not.
- Compaction runs synchronously on the writing thread. LevelDB runs a background thread so writes are never blocked waiting for compaction I/O. Large compactions will cause P99 write-latency spikes.
- No seek-based compaction. LevelDB tracks per-file seek counts and triggers compaction on hot files regardless of size; this read-sampling path (`RecordReadSample`) is not implemented.
- No trivial-move optimisation. When a single file at level N has no overlap with level N+1 (and limited overlap with level N+2), LevelDB moves it down a level without merging. RoughDB always merges all selected input files.
- No grandparent-overlap limiting. LevelDB stops growing a compaction output file early when it would overlap too many level-(N+2) files, reducing future compaction amplification. RoughDB sizes output files by `max_file_size` only.
- All memtable flushes go to L0. LevelDB's `PickLevelForMemTableOutput` can place a flush directly into L1 or L2 when there is no overlap, reducing L0 pressure. Not implemented.
- `Options::reuse_logs` is accepted but ignored.
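For contrast with the L0-file-count trigger described above, LevelDB's level-score rule can be sketched as a small standalone function (names and constants follow LevelDB's defaults; RoughDB implements none of this):

```rust
// Sketch of LevelDB's compaction scoring. L0 is scored by file count
// against a trigger of 4 files; levels >= 1 are scored by total bytes
// against a target of 10 MB for L1, growing 10x per level. The level
// with the highest score >= 1.0 is compacted next.
fn max_bytes_for_level(level: usize) -> f64 {
    let mut result = 10.0 * 1048576.0; // 10 MB target for L1
    for _ in 1..level {
        result *= 10.0; // L2 = 100 MB, L3 = 1 GB, ...
    }
    result
}

fn compaction_score(level: usize, file_count: usize, level_bytes: u64) -> f64 {
    if level == 0 {
        file_count as f64 / 4.0 // kL0_CompactionTrigger
    } else {
        level_bytes as f64 / max_bytes_for_level(level)
    }
}

fn main() {
    // Four L0 files reach the trigger (score >= 1.0).
    assert!(compaction_score(0, 4, 0) >= 1.0);
    // A 25 MB L1 (score 2.5) outranks a 150 MB L2 (score 1.5).
    assert!(compaction_score(1, 0, 25 * 1048576) > compaction_score(2, 0, 150 * 1048576));
}
```

Under this rule every level is bounded, which is exactly what RoughDB's single L0 trigger does not provide.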
Not yet implemented:
- Custom comparators — keys are always compared bytewise; `Options::comparator` does not exist. Applications that require a custom sort order (e.g. reverse, integer, prefix) cannot use RoughDB.
- `Env` abstraction — file I/O is hardcoded to the local POSIX filesystem. There is no way to inject a custom storage backend (in-memory, encrypted, cloud, etc.).
- `RepairDB` — recovery from a corrupt or partial MANIFEST by scanning surviving SSTables.
- Info logging — LevelDB writes compaction progress, recovery details, and errors to an `info_log`. RoughDB produces no log output.
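To illustrate the missing comparator hook, here is a hypothetical LevelDB-style trait (the names are illustrative only, not RoughDB API):

```rust
use std::cmp::Ordering;

// Hypothetical comparator abstraction — RoughDB has no equivalent of
// Options::comparator; keys are always compared bytewise.
trait Comparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

// The default order (what RoughDB always uses).
struct BytewiseComparator;
impl Comparator for BytewiseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        a.cmp(b)
    }
}

// A sort order RoughDB currently cannot express: reverse bytewise.
struct ReverseComparator;
impl Comparator for ReverseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        b.cmp(a)
    }
}

fn main() {
    assert_eq!(BytewiseComparator.compare(b"a", b"b"), Ordering::Less);
    assert_eq!(ReverseComparator.compare(b"a", b"b"), Ordering::Greater);
}
```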
## License
Apache License 2.0 — see LICENSE.