§emdb
A high-performance embedded key-value database for Rust.
§Architecture
emdb is an fsys-journal-backed append-only KV with a sharded
in-memory hash index. Writes go through fsys::JournalHandle’s
lock-free LSN reservation + group-commit fsync; reads slice
directly into a kernel-managed memory map of the same file
(zero-copy). Crash safety is delegated to fsys’s CRC-32C frame
validation and five-state tail-truncation taxonomy.
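The frame validation mentioned above can be pictured with a small stdlib-only sketch. The frame layout (`[len][crc][payload]`) and the names `encode_frame` and `valid_prefix_len` are assumptions made for illustration; they are not fsys's actual on-disk format or API, and the five-state taxonomy is not modelled here. The sketch only shows the general idea: a recovery scan accepts frames while their CRC-32C matches and stops at the first torn or corrupt one.

```rust
/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

fn read_u32_le(buf: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([buf[at], buf[at + 1], buf[at + 2], buf[at + 3]])
}

/// Hypothetical frame layout: [len: u32 LE][crc: u32 LE][payload].
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(8 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(&crc32c(payload).to_le_bytes());
    frame.extend_from_slice(payload);
    frame
}

/// Walks the log from the start and returns the byte length of the valid
/// prefix. Everything after that offset would be truncated on recovery.
fn valid_prefix_len(log: &[u8]) -> usize {
    let mut pos = 0;
    while log.len() - pos >= 8 {
        let len = read_u32_le(log, pos) as usize;
        let crc = read_u32_le(log, pos + 4);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(end) if end <= log.len() => end,
            _ => break, // torn write: payload runs past the end of the log
        };
        if crc32c(&log[start..end]) != crc {
            break; // corrupt frame: checksum mismatch
        }
        pos = end;
    }
    pos
}
```

A caller would truncate the file to `valid_prefix_len(&bytes)` before accepting new appends, so a half-written tail never becomes visible to readers.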
This is the Bitcask family of storage engines (one append-only
log + an in-memory index), built on top of fsys for the
filesystem substrate. fsys handles platform-specific durability
(NVMe passthrough flush on Linux + Windows, io_uring on Linux,
WRITE_THROUGH where appropriate); emdb handles the
engine-level concerns (per-namespace sharded indices,
encryption, range scans, TTL).
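As a rough illustration of the Bitcask shape described above, the following sketch keeps an append-only byte buffer plus an in-memory map from key to the location of the latest value. `MiniBitcask` is a hypothetical toy type, not emdb's implementation: a `Vec<u8>` stands in for the journal file, and there is no framing, sharding, or durability.

```rust
use std::collections::HashMap;

/// Toy Bitcask-style store: an append-only byte log plus an in-memory index
/// from key to the (offset, length) of the latest value.
struct MiniBitcask {
    log: Vec<u8>, // stands in for the on-disk file
    index: HashMap<Vec<u8>, (usize, usize)>,
}

impl MiniBitcask {
    fn new() -> Self {
        Self { log: Vec::new(), index: HashMap::new() }
    }

    /// Appends the value and repoints the index; old bytes stay in the log.
    fn insert(&mut self, key: &[u8], value: &[u8]) {
        let offset = self.log.len();
        self.log.extend_from_slice(value);
        self.index.insert(key.to_vec(), (offset, value.len()));
    }

    /// Returns a slice borrowed from the log, without copying the value.
    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        let &(offset, len) = self.index.get(key)?;
        Some(&self.log[offset..offset + len])
    }
}
```

Overwriting a key never rewrites earlier bytes; it only moves the index entry, which is why this family of engines needs a separate compaction step to reclaim space.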
Reads are lock-free — the 64-shard primary index plus the
Arc<Mmap> zero-copy read path scale to many millions of
operations per second on a single open handle. Writes are
lock-free via fsys’s atomic LSN reservation; no writer mutex
on the hot append path. Producers can still batch through
Emdb::insert_many or Emdb::transaction when group
semantics matter.
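The idea behind atomic LSN reservation can be sketched with a single atomic counter. `LsnAllocator` and `reserve` are hypothetical names used only for this illustration; fsys's real reservation path is not shown in this document and is likely more involved.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hands out disjoint byte ranges of the log without a mutex.
struct LsnAllocator {
    next: AtomicU64,
}

impl LsnAllocator {
    fn new() -> Self {
        Self { next: AtomicU64::new(0) }
    }

    /// Reserves `len` bytes and returns the start offset of the reservation.
    /// `fetch_add` is a single atomic read-modify-write, so two callers can
    /// never receive overlapping ranges.
    fn reserve(&self, len: u64) -> u64 {
        self.next.fetch_add(len, Ordering::Relaxed)
    }
}
```

Each writer can then copy its record into the range it reserved while other writers do the same in their own ranges, which is what removes the need for a writer mutex on the append path.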
§Quick start
use emdb::Emdb;
let db = Emdb::open_in_memory();
db.insert("name", "emdb")?;
assert_eq!(db.get("name")?, Some(b"emdb".to_vec()));
Persistent file-backed:
use emdb::Emdb;
let path = std::env::temp_dir().join("emdb-doc-example.emdb");
{
let db = Emdb::open(&path)?;
db.insert("name", "emdb")?;
db.flush()?; // make record bytes durable
db.checkpoint()?; // persist tail_hint for fast reopen
}
let db = Emdb::open(&path)?;
assert_eq!(db.get("name")?, Some(b"emdb".to_vec()));
TTL:
use std::time::Duration;
use emdb::{Emdb, Ttl};
let path = std::env::temp_dir().join("emdb-doc-ttl.emdb");
let db = Emdb::builder()
.path(&path)
.default_ttl(Duration::from_secs(60))
.build()?;
db.insert_with_ttl("session", "token", Ttl::Default)?;
assert!(db.ttl("session")?.is_some());
§Zero-copy reads
Emdb::get_zerocopy returns a ValueRef that points directly
into the kernel-managed mmap region — no allocation, no copy.
Encrypted databases fall back to an owned plaintext buffer inside
the same ValueRef type.
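One way to picture a single handle type with two backings is an enum that dereferences to `[u8]` either way. `ValueView` is a hypothetical stand-in, not the real `ValueRef` definition, and an `Arc<[u8]>` stands in for the mapped region.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical stand-in for a value handle with two backings.
enum ValueView {
    /// Shares the mapped region and remembers where the value lives in it.
    Mapped { region: Arc<[u8]>, start: usize, len: usize },
    /// Owns a decrypted plaintext buffer.
    Owned(Vec<u8>),
}

impl Deref for ValueView {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            ValueView::Mapped { region, start, len } => &region[*start..*start + *len],
            ValueView::Owned(buf) => buf.as_slice(),
        }
    }
}
```

Callers see a byte slice in both cases, so code written against the handle does not need to know whether the database is encrypted.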
use emdb::Emdb;
let db = Emdb::open_in_memory();
db.insert("k", "v")?;
if let Some(v) = db.get_zerocopy("k")? {
let want: &[u8] = b"v";
assert!(v == want);
}
§Streaming iteration
Emdb::iter / Emdb::keys yield records lazily, decoding one
record per next() call from a snapshot of offsets captured at
construction time. Memory use scales with the offset count, not
the total value size.
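The snapshot-of-offsets approach can be sketched as an iterator that holds only `(offset, len)` pairs and slices one record out of the log per `next()` call. `SnapshotIter` is a hypothetical type for illustration; the real iterators also decode keys and handle framing.

```rust
/// Lazy iterator over a log: holds only (offset, len) pairs captured up front
/// and slices one record out per `next()` call.
struct SnapshotIter<'a> {
    log: &'a [u8],
    offsets: std::vec::IntoIter<(usize, usize)>,
}

impl<'a> SnapshotIter<'a> {
    fn new(log: &'a [u8], offsets: Vec<(usize, usize)>) -> Self {
        Self { log, offsets: offsets.into_iter() }
    }
}

impl<'a> Iterator for SnapshotIter<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        let (offset, len) = self.offsets.next()?;
        let log: &'a [u8] = self.log;
        Some(&log[offset..offset + len])
    }
}
```

The iterator's own footprint is two `usize` values per record regardless of how large the values are, which is the property the paragraph above describes.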
Range queries are opt-in via
EmdbBuilder::enable_range_scans; once enabled,
Emdb::range_iter / Emdb::range_prefix_iter return streaming
iterators backed by a lock-free crossbeam_skiplist::SkipMap
secondary index — inserts and range scans run concurrently
without a global lock.
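A prefix scan over an ordered secondary index can be sketched with the standard library's `BTreeMap`. This is a single-threaded stand-in chosen so the example has no dependencies; it does not reproduce the concurrent behaviour of `crossbeam_skiplist::SkipMap`, and `prefix_keys` is a hypothetical helper, not an emdb function.

```rust
use std::collections::BTreeMap;

/// Returns the keys that start with `prefix`, in sorted order.
/// The map value is a placeholder for a record offset.
fn prefix_keys<'a>(
    index: &'a BTreeMap<Vec<u8>, u64>,
    prefix: &'a [u8],
) -> impl Iterator<Item = &'a [u8]> + 'a {
    index
        .range(prefix.to_vec()..) // seek to the first key >= prefix
        .map(|(key, _)| key.as_slice())
        .take_while(move |key| key.starts_with(prefix)) // stop at the first miss
}
```

Because keys are ordered, every match is contiguous: the scan seeks once and stops at the first key that no longer carries the prefix, without visiting the rest of the index.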
§Group-commit durability
Per-record flush() workloads with concurrent writers can opt
into the group-commit pipeline so multiple in-flight flush()
calls share a single fdatasync:
use emdb::{Emdb, FlushPolicy};
let db = Emdb::builder()
.flush_policy(FlushPolicy::Group)
.build()?;
The default policy is FlushPolicy::OnEachFlush, which performs one
fdatasync per call. That is the right choice when there is only one
writer thread or when durability is already batched at the
application layer.
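The saving from group commit can be illustrated with a toy tracker in which one simulated sync covers every record written so far, so later flush calls for already-covered records do nothing. `GroupCommit` and its methods are hypothetical names; the sketch holds a mutex for simplicity and does not reflect how emdb or fsys schedule the real fdatasync.

```rust
use std::sync::Mutex;

struct State {
    written: u64, // highest LSN appended so far
    durable: u64, // highest LSN covered by a completed sync
    syncs: u64,   // number of simulated fdatasync calls
}

/// Toy group-commit tracker.
struct GroupCommit {
    state: Mutex<State>,
}

impl GroupCommit {
    fn new() -> Self {
        Self { state: Mutex::new(State { written: 0, durable: 0, syncs: 0 }) }
    }

    /// Records an append and returns its LSN.
    fn append(&self) -> u64 {
        let mut s = self.state.lock().unwrap();
        s.written += 1;
        s.written
    }

    /// Makes `lsn` durable. If an earlier sync already covered it, this is free.
    fn flush(&self, lsn: u64) {
        let mut s = self.state.lock().unwrap();
        if s.durable >= lsn {
            return; // piggybacked on a previous sync
        }
        // One simulated fdatasync covers everything written so far.
        s.durable = s.written;
        s.syncs += 1;
    }

    fn syncs(&self) -> u64 {
        self.state.lock().unwrap().syncs
    }
}
```

With three appends followed by three flush calls, this tracker performs one simulated sync, whereas a one-sync-per-call policy would perform three.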
§Storage path resolution
emdb does not pick a default path for you. You either pass an explicit path, or opt into OS-aware resolution via the builder.
use emdb::Emdb;
// Resolves to:
// Linux: $XDG_DATA_HOME/hivedb-kv/sessions.emdb
// macOS: ~/Library/Application Support/hivedb-kv/sessions.emdb
// Windows: %LOCALAPPDATA%\hivedb-kv\sessions.emdb
let db = Emdb::builder()
.app_name("hivedb-kv")
.database_name("sessions.emdb")
.build()?;
§Operational APIs
- Emdb::stats - point-in-time database introspection (record counts, file size, namespace count). Cheap to call from a per-second health-check loop.
- Emdb::backup_to - atomic snapshot to a sibling file. The result is a normal openable database, not a dump format.
- Emdb::lock_holder / Emdb::break_lock - diagnose and recover from stuck advisory lockfiles when a holder dies without releasing.
- Emdb::checkpoint - explicit fast-reopen checkpoint that persists the file header’s tail_hint.
§Async surface
Opt-in via the async feature. Wraps the sync API in
tokio::task::spawn_blocking so blocking I/O never stalls the
async-task scheduler. Exposes AsyncEmdb and AsyncNamespace,
plus EmdbBuilder::build_async for the builder path.
use emdb::{AsyncEmdb, Emdb};
// Open via the simple constructor.
let db = AsyncEmdb::open("/tmp/users.emdb").await?;
db.insert("alice", "active").await?;
let value = db.get("alice").await?;
// Or build with explicit configuration.
let configured = Emdb::builder()
.path("/tmp/configured.emdb")
.enable_range_scans(true)
.build_async()
.await?;
Every async method clones the underlying Arc<Emdb> into a
spawn_blocking closure; cheap, but each call allocates owned
Vec<u8> copies for key/value bytes so the closure can take
them by value. For latency-sensitive workloads where the
spawn dispatch overhead exceeds the sync cost (e.g. tight
get loops on a hot in-memory key), reach for the sync
handle via AsyncEmdb::sync_handle and batch via
insert_many / range.
Large iterations come in two flavours. iter / keys / range
/ range_prefix / iter_from / iter_after materialise the
full result into an owned Vec before resolving — convenient
for small queries. The *_stream variants
(iter_stream, keys_stream, range_stream,
range_prefix_stream, iter_from_stream, iter_after_stream)
return a tokio_stream::wrappers::ReceiverStream backed by a
bounded mpsc channel: records arrive incrementally, the
blocking pump task respects the consumer’s backpressure, and
memory in flight is bounded by the channel depth rather than
the namespace size.
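The backpressure behaviour of a bounded channel can be sketched with the standard library alone. Here `std::sync::mpsc::sync_channel` and a plain thread stand in for tokio's bounded mpsc channel and the blocking pump task; `stream_records` is a hypothetical helper, not part of emdb's async surface.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Spawns a pump thread that sends records through a bounded channel.
/// `send` blocks once `depth` records are queued, so the pump can never run
/// more than `depth` records ahead of the consumer.
fn stream_records(records: Vec<String>, depth: usize) -> Receiver<String> {
    let (tx, rx) = sync_channel(depth);
    thread::spawn(move || {
        for record in records {
            // A send error means the consumer hung up; stop pumping.
            if tx.send(record).is_err() {
                break;
            }
        }
    });
    rx
}
```

A consumer that iterates the receiver slowly holds at most `depth` queued records in memory, and dropping the receiver early makes the pump thread exit at its next send.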
§Cargo features
- ttl (default) - per-record expiration and default_ttl.
- nested - dotted-prefix group operations and Focus handles.
- encrypt - AES-256-GCM + ChaCha20-Poly1305 at-rest encryption with raw-key or Argon2id-derived passphrase.
- async - AsyncEmdb / AsyncNamespace wrappers via tokio’s spawn_blocking, plus streaming-iterator variants backed by tokio_stream::wrappers::ReceiverStream. Pulls in tokio (rt + rt-multi-thread + macros + sync) and tokio-stream.
- bench-compare, bench-rocksdb, bench-redis - comparative bench peers (dev-only, never required by application builds).
Structs§
- AsyncEmdb (feature: async) - Cheap-clone async handle to an Emdb. Every method routes through tokio::task::spawn_blocking so emdb’s blocking I/O never stalls the async-task scheduler.
- AsyncNamespace (feature: async) - Cheap-clone async handle scoped to one named namespace inside an AsyncEmdb. The sync crate::Namespace is already Clone with Arc-shared internals; this type simply wraps it and threads every call through spawn_blocking.
- Emdb - The primary embedded database handle.
- EmdbBuilder - Builder for constructing an Emdb.
- EmdbIter - Iterator over (key, value) pairs from Emdb::iter.
- EmdbKeyIter - Iterator over keys from Emdb::keys.
- EmdbRangeIter - Streaming range iterator returned by Emdb::range_iter and Emdb::range_prefix_iter.
- EmdbStats - Point-in-time database statistics.
- Focus (feature: nested) - Scoped database view that prefixes keys with a dotted path segment.
- LockHolder - Holder metadata read out of an existing lockfile body. Returned by crate::Emdb::lock_holder for the crate::Emdb::break_lock admin path.
- Namespace - Cheap-clone handle scoped to one named namespace inside a single crate::Emdb.
- NamespaceIter - Iterator over (key, value) pairs from Namespace::iter.
- NamespaceKeyIter - Iterator over keys from Namespace::keys.
- NamespaceRangeIter - Streaming range iterator returned by Namespace::range_iter / Namespace::range_prefix_iter.
- Transaction - Closure-scoped transaction.
- ValueRef - Zero-copy reference to a value stored in the database.
Enums§
- Cipher (feature: encrypt) - Selectable AEAD cipher. Both options use the same 32-byte key, 96-bit nonce, and 128-bit tag, so the on-disk envelope is byte-identical across choices; only the cipher-id bit in the page-store flags differs.
- EncryptionInput (feature: encrypt) - User-supplied keying material for the offline admin operations crate::Emdb::enable_encryption / crate::Emdb::disable_encryption / crate::Emdb::rotate_encryption_key and for the CLI tool.
- Error - The top-level error type returned by every fallible operation in emdb.
- FlushPolicy - How db.flush() interacts with concurrent flush requests and how each db.insert() interacts with durability.
- Ttl - Time-to-live specification for a record.