Crate emdb

§emdb

A high-performance embedded key-value database for Rust.

§Architecture

emdb is an fsys-journal-backed append-only KV with a sharded in-memory hash index. Writes go through fsys::JournalHandle’s lock-free LSN reservation + group-commit fsync; reads slice directly into a kernel-managed memory map of the same file (zero-copy). Crash safety is delegated to fsys’s CRC-32C frame validation and five-state tail-truncation taxonomy.
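To make the CRC-based crash-safety idea concrete, here is a minimal sketch of checksum-validated frames and tail truncation. The frame layout (`[len][payload][crc]`) and the function names are assumptions made for illustration; they are not fsys's actual on-disk format, and the five-state taxonomy is not modelled.

```rust
/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Read a little-endian u32 at `pos`.
fn read_u32(buf: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]])
}

/// Hypothetical frame layout: [len: u32 LE][payload][crc32c(payload): u32 LE].
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 8);
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out
}

/// Length of the valid prefix of `log`. Scanning stops at the first frame
/// that is incomplete (torn write) or fails its checksum (corruption).
fn valid_prefix_len(log: &[u8]) -> usize {
    let mut pos = 0;
    // Each iteration needs at least a 4-byte length header.
    while log.len() - pos >= 4 {
        let len = read_u32(log, pos) as usize;
        let end = pos + 4 + len + 4;
        if end > log.len() {
            break; // torn frame: the header promises more bytes than exist
        }
        let payload = &log[pos + 4..pos + 4 + len];
        if read_u32(log, end - 4) != crc32c(payload) {
            break; // checksum mismatch
        }
        pos = end;
    }
    pos
}
```

On reopen, a store following this scheme would truncate the file to `valid_prefix_len` and rebuild its index from the surviving frames.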

emdb belongs to the Bitcask family of storage engines (one append-only log plus an in-memory index) and is built on fsys as its filesystem substrate. fsys handles platform-specific durability (NVMe passthrough flush on Linux and Windows, io_uring on Linux, WRITE_THROUGH where appropriate); emdb handles the engine-level concerns (per-namespace sharded indices, encryption, range scans, TTL).
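The Bitcask shape can be sketched in a few lines. This is an illustrative model only: `MiniLog`, its record layout, and the single unsharded `HashMap` are assumptions, with a `Vec<u8>` standing in for the data file.

```rust
use std::collections::HashMap;

/// Minimal Bitcask-style store: an append-only byte log plus an in-memory
/// index mapping each key to the location of its latest value.
#[derive(Default)]
struct MiniLog {
    /// Stands in for the on-disk data file.
    log: Vec<u8>,
    /// key -> (value offset, value length) of the most recent write.
    index: HashMap<Vec<u8>, (usize, usize)>,
}

impl MiniLog {
    fn insert(&mut self, key: &[u8], value: &[u8]) {
        // Hypothetical record layout: [klen u32 LE][vlen u32 LE][key][value].
        self.log.extend_from_slice(&(key.len() as u32).to_le_bytes());
        self.log.extend_from_slice(&(value.len() as u32).to_le_bytes());
        self.log.extend_from_slice(key);
        let value_offset = self.log.len();
        self.log.extend_from_slice(value);
        // Overwrites only repoint the index; old bytes stay in the log.
        self.index.insert(key.to_vec(), (value_offset, value.len()));
    }

    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        let &(offset, len) = self.index.get(key)?;
        // The returned slice borrows straight from the log: no copy.
        Some(&self.log[offset..offset + len])
    }
}
```

A lookup costs one hash probe plus one slice into the log, which is why this family of engines pairs naturally with a memory-mapped read path.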

Reads are lock-free — the 64-shard primary index plus the Arc<Mmap> zero-copy read path scale to many millions of operations per second on a single open handle. Writes are lock-free via fsys’s atomic LSN reservation; no writer mutex on the hot append path. Producers can still batch through Emdb::insert_many or Emdb::transaction when group semantics matter.
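Atomic LSN reservation can be illustrated with a single `fetch_add`. The sketch below assumes an LSN is simply a byte offset into the log; `LsnAllocator` and `reserve_concurrently` are hypothetical names, not part of fsys or emdb.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hands out disjoint byte ranges of the log without taking a lock.
struct LsnAllocator {
    next: AtomicU64,
}

impl LsnAllocator {
    fn new() -> Self {
        LsnAllocator { next: AtomicU64::new(0) }
    }

    /// Reserve `len` bytes and return the start offset of the reservation.
    /// `fetch_add` is a single atomic read-modify-write, so two writers can
    /// never receive overlapping ranges.
    fn reserve(&self, len: u64) -> u64 {
        self.next.fetch_add(len, Ordering::Relaxed)
    }
}

/// Spawn `writers` threads that each reserve `per_writer` slots of `len`
/// bytes, then return every reserved start offset in sorted order.
fn reserve_concurrently(writers: usize, per_writer: usize, len: u64) -> Vec<u64> {
    let alloc = Arc::new(LsnAllocator::new());
    let handles: Vec<_> = (0..writers)
        .map(|_| {
            let alloc = Arc::clone(&alloc);
            thread::spawn(move || {
                (0..per_writer).map(|_| alloc.reserve(len)).collect::<Vec<u64>>()
            })
        })
        .collect();
    let mut all: Vec<u64> = handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect();
    all.sort_unstable();
    all
}
```

Once a writer owns its range it can copy bytes into place independently of other writers; only the durability step needs coordination.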

§Quick start

use emdb::Emdb;

let db = Emdb::open_in_memory();
db.insert("name", "emdb")?;
assert_eq!(db.get("name")?, Some(b"emdb".to_vec()));

Persistent, file-backed:

use emdb::Emdb;

let path = std::env::temp_dir().join("emdb-doc-example.emdb");
{
    let db = Emdb::open(&path)?;
    db.insert("name", "emdb")?;
    db.flush()?;        // make record bytes durable
    db.checkpoint()?;   // persist tail_hint for fast reopen
}
let db = Emdb::open(&path)?;
assert_eq!(db.get("name")?, Some(b"emdb".to_vec()));

TTL:

use std::time::Duration;

use emdb::{Emdb, Ttl};

let path = std::env::temp_dir().join("emdb-doc-ttl.emdb");
let db = Emdb::builder()
    .path(&path)
    .default_ttl(Duration::from_secs(60))
    .build()?;
db.insert_with_ttl("session", "token", Ttl::Default)?;
assert!(db.ttl("session")?.is_some());
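One plausible way a default TTL could resolve into a per-record deadline is sketched below. `TtlSpec` and its variants other than `Default` are hypothetical stand-ins; only `Ttl::Default` appears in the example above, and emdb's real bookkeeping may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a TTL specification.
enum TtlSpec {
    /// Use the database-wide default, if one was configured.
    Default,
    /// Expire after an explicit duration.
    After(Duration),
    /// Never expire.
    Never,
}

/// Resolve a spec into an absolute deadline at insert time.
fn deadline(spec: &TtlSpec, default_ttl: Option<Duration>, now: Instant) -> Option<Instant> {
    match spec {
        TtlSpec::Default => default_ttl.map(|ttl| now + ttl),
        TtlSpec::After(ttl) => Some(now + *ttl),
        TtlSpec::Never => None,
    }
}

/// Remaining lifetime of a record, or `None` if it never expires.
fn remaining(deadline: Option<Instant>, now: Instant) -> Option<Duration> {
    deadline.map(|at| at.saturating_duration_since(now))
}

/// A record is expired once the current time reaches its deadline.
fn is_expired(deadline: Option<Instant>, now: Instant) -> bool {
    matches!(deadline, Some(at) if now >= at)
}
```

Passing `now` explicitly keeps the logic deterministic and easy to test; a real engine would read the clock at the call site.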

§Zero-copy reads

Emdb::get_zerocopy returns a ValueRef that points directly into the kernel-managed mmap region — no allocation, no copy. Encrypted databases fall back to an owned plaintext buffer inside the same ValueRef type.

use emdb::Emdb;

let db = Emdb::open_in_memory();
db.insert("k", "v")?;
if let Some(v) = db.get_zerocopy("k")? {
    let want: &[u8] = b"v";
    assert!(v == want);
}
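The "borrowed when possible, owned when necessary" behaviour described above can be modelled with a two-variant enum. `ValueView` and `read_value` are hypothetical names used for illustration; this is not the definition of emdb's `ValueRef`.

```rust
use std::ops::Deref;

/// Either a slice borrowed from the mapped file or an owned buffer.
enum ValueView<'a> {
    /// Points directly into the mapped region: no allocation, no copy.
    Mapped(&'a [u8]),
    /// Owned plaintext, e.g. the result of decrypting the stored bytes.
    Owned(Vec<u8>),
}

impl<'a> Deref for ValueView<'a> {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            ValueView::Mapped(bytes) => *bytes,
            ValueView::Owned(buf) => buf.as_slice(),
        }
    }
}

/// Read `len` bytes at `offset` from `map`. With no `decrypt` function the
/// bytes are borrowed in place; with one, the transformed bytes are owned.
fn read_value<'a>(
    map: &'a [u8],
    offset: usize,
    len: usize,
    decrypt: Option<&dyn Fn(&[u8]) -> Vec<u8>>,
) -> ValueView<'a> {
    let stored = &map[offset..offset + len];
    match decrypt {
        None => ValueView::Mapped(stored),
        Some(decrypt) => ValueView::Owned(decrypt(stored)),
    }
}
```

Because both variants dereference to `[u8]`, callers can treat the result uniformly and only pay for an allocation on the encrypted path.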

§Streaming iteration

Emdb::iter / Emdb::keys yield records lazily, decoding one record per next() call from a snapshot of offsets captured at construction time. Memory use scales with the offset count, not the total value size.
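A snapshot-of-offsets iterator can be sketched as follows. The record layout and the names `SnapshotIter` and `append` are assumptions for illustration; the point is that the iterator holds only offsets and decodes one record per `next()` call.

```rust
/// Read a little-endian u32 at `pos`.
fn read_u32(buf: &[u8], pos: usize) -> usize {
    u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]]) as usize
}

/// Append one record and return the offset where it starts.
/// Hypothetical layout: [klen u32 LE][vlen u32 LE][key][value].
fn append(log: &mut Vec<u8>, key: &[u8], value: &[u8]) -> usize {
    let start = log.len();
    log.extend_from_slice(&(key.len() as u32).to_le_bytes());
    log.extend_from_slice(&(value.len() as u32).to_le_bytes());
    log.extend_from_slice(key);
    log.extend_from_slice(value);
    start
}

/// Lazy iterator over a snapshot of record offsets captured up front.
struct SnapshotIter<'a> {
    log: &'a [u8],
    offsets: std::vec::IntoIter<usize>,
}

impl<'a> Iterator for SnapshotIter<'a> {
    type Item = (&'a [u8], &'a [u8]);

    fn next(&mut self) -> Option<Self::Item> {
        // Decode exactly one record per call; nothing is materialised early.
        let pos = self.offsets.next()?;
        let log = self.log;
        let klen = read_u32(log, pos);
        let vlen = read_u32(log, pos + 4);
        let key_start = pos + 8;
        let value_start = key_start + klen;
        Some((&log[key_start..value_start], &log[value_start..value_start + vlen]))
    }
}
```

The iterator's footprint is one `usize` per record regardless of value size, which matches the memory behaviour described above.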

Range queries are opt-in via EmdbBuilder::enable_range_scans; once enabled, Emdb::range_iter / Emdb::range_prefix_iter return streaming iterators backed by a lock-free crossbeam_skiplist::SkipMap secondary index — inserts and range scans run concurrently without a global lock.
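The role of an ordered secondary index can be shown with the standard library. This sketch swaps in `BTreeMap` for the concurrent `SkipMap` named above, so it illustrates only the ordering logic, not the lock-free concurrency; `RangeIndex`, `range_offsets`, and `prefix_offsets` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Ordered secondary index: key -> record offset.
/// (BTreeMap is a single-threaded stand-in for a concurrent skip list.)
type RangeIndex = BTreeMap<Vec<u8>, u64>;

/// Offsets of all records whose key lies in `[start, end)`.
/// Note: `BTreeMap::range` panics if `start > end`.
fn range_offsets(index: &RangeIndex, start: &[u8], end: &[u8]) -> Vec<u64> {
    index
        .range::<[u8], _>((Bound::Included(start), Bound::Excluded(end)))
        .map(|(_, &offset)| offset)
        .collect()
}

/// Offsets of all records whose key starts with `prefix`.
fn prefix_offsets(index: &RangeIndex, prefix: &[u8]) -> Vec<u64> {
    index
        // Keys sharing a prefix are contiguous in sorted order, so seek to
        // the prefix and stop at the first key that no longer matches.
        .range::<[u8], _>((Bound::Included(prefix), Bound::Unbounded))
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(_, &offset)| offset)
        .collect()
}
```

A hash index cannot answer either query without a full scan, which is why ordered iteration requires a separate, opt-in structure.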

§Group-commit durability

Per-record flush() workloads with concurrent writers can opt into the group-commit pipeline so multiple in-flight flush() calls share a single fdatasync:

use emdb::{Emdb, FlushPolicy};

let db = Emdb::builder()
    .flush_policy(FlushPolicy::Group)
    .build()?;

Default policy is FlushPolicy::OnEachFlush, which performs one fdatasync per call — the right choice when there is only one writer thread or when durability is already batched at the application layer.
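The core idea of group commit is that one sync covers every record written before it, so later flush requests for those records need no sync of their own. The model below is a deliberately simplified sketch: `GroupCommitLog` is a hypothetical type, the sync is simulated by a counter, and the mutex is held across the simulated sync, which a real pipeline would avoid.

```rust
use std::sync::Mutex;

struct State {
    /// Highest LSN appended so far.
    written_lsn: u64,
    /// Highest LSN known to be durable.
    durable_lsn: u64,
    /// Number of (simulated) fdatasync calls performed.
    sync_calls: u64,
}

struct GroupCommitLog {
    state: Mutex<State>,
}

impl GroupCommitLog {
    fn new() -> Self {
        GroupCommitLog {
            state: Mutex::new(State { written_lsn: 0, durable_lsn: 0, sync_calls: 0 }),
        }
    }

    /// Append one record and return its LSN.
    fn append(&self) -> u64 {
        let mut state = self.state.lock().unwrap();
        state.written_lsn += 1;
        state.written_lsn
    }

    /// Make the record at `lsn` durable, syncing only if necessary.
    fn flush(&self, lsn: u64) {
        let mut state = self.state.lock().unwrap();
        if state.durable_lsn >= lsn {
            return; // an earlier sync already covered this record
        }
        // Stand-in for fdatasync: everything written so far becomes durable.
        state.durable_lsn = state.written_lsn;
        state.sync_calls += 1;
    }

    fn sync_calls(&self) -> u64 {
        self.state.lock().unwrap().sync_calls
    }
}
```

With three appended records followed by three flush calls, only the first flush pays for a sync; the other two return immediately because their LSNs are already durable.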

§Storage path resolution

emdb does not pick a default path for you. You either pass an explicit path, or opt into OS-aware resolution via the builder.

use emdb::Emdb;

// Resolves to:
//   Linux:   $XDG_DATA_HOME/hivedb-kv/sessions.emdb
//   macOS:   ~/Library/Application Support/hivedb-kv/sessions.emdb
//   Windows: %LOCALAPPDATA%\hivedb-kv\sessions.emdb
let db = Emdb::builder()
    .app_name("hivedb-kv")
    .database_name("sessions.emdb")
    .build()?;
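The resolution rules listed in the comments above can be sketched as a pure function. `Platform` and `resolve_db_path` are hypothetical, and the `~/.local/share` fallback for an unset `XDG_DATA_HOME` follows the XDG Base Directory convention; whether emdb applies the same fallback is an assumption.

```rust
use std::path::PathBuf;

#[derive(Clone, Copy)]
enum Platform {
    Linux,
    MacOs,
    Windows,
}

/// Resolve `<data dir>/<app_name>/<database_name>` for the given platform.
/// `env` looks up an environment variable, which keeps the function testable.
/// Returns `None` when a required variable is missing.
fn resolve_db_path(
    platform: Platform,
    env: &dyn Fn(&str) -> Option<String>,
    app_name: &str,
    database_name: &str,
) -> Option<PathBuf> {
    let base = match platform {
        Platform::Linux => match env("XDG_DATA_HOME").filter(|dir| !dir.is_empty()) {
            Some(dir) => PathBuf::from(dir),
            // XDG fallback when the variable is unset or empty (assumed here).
            None => PathBuf::from(env("HOME")?).join(".local").join("share"),
        },
        Platform::MacOs => PathBuf::from(env("HOME")?)
            .join("Library")
            .join("Application Support"),
        Platform::Windows => PathBuf::from(env("LOCALAPPDATA")?),
    };
    Some(base.join(app_name).join(database_name))
}
```

In application code the lookup closure would typically be `&|name| std::env::var(name).ok()`.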

§Operational APIs

  • Emdb::stats — point-in-time database introspection (record counts, file size, namespace count). Cheap to call from a per-second health-check loop.
  • Emdb::backup_to — atomic snapshot to a sibling file. The result is a normal openable database, not a dump format.
  • Emdb::lock_holder / Emdb::break_lock — diagnose and recover from stuck advisory lockfiles when a holder dies without releasing.
  • Emdb::checkpoint — explicit fast-reopen checkpoint that persists the file header’s tail_hint.

§Async surface

Opt-in via the async feature. Wraps the sync API in tokio::task::spawn_blocking so blocking I/O never stalls the async-task scheduler. Exposes AsyncEmdb and AsyncNamespace, plus EmdbBuilder::build_async for the builder path.

use emdb::{AsyncEmdb, Emdb};

// Open via the simple constructor.
let db = AsyncEmdb::open("/tmp/users.emdb").await?;
db.insert("alice", "active").await?;
let value = db.get("alice").await?;

// Or build with explicit configuration.
let configured = Emdb::builder()
    .path("/tmp/configured.emdb")
    .enable_range_scans(true)
    .build_async()
    .await?;

Every async method clones the underlying Arc<Emdb> into a spawn_blocking closure. The clone is cheap, but each call also allocates owned Vec<u8> copies of the key and value bytes so the closure can take them by value. For latency-sensitive workloads where the spawn dispatch overhead exceeds the cost of the sync call itself (e.g. tight get loops on a hot in-memory key), reach for the sync handle via AsyncEmdb::sync_handle and batch via insert_many / range.

Large iterations come in two flavours. iter / keys / range / range_prefix / iter_from / iter_after materialise the full result into an owned Vec before resolving — convenient for small queries. The *_stream variants (iter_stream, keys_stream, range_stream, range_prefix_stream, iter_from_stream, iter_after_stream) return a tokio_stream::wrappers::ReceiverStream backed by a bounded mpsc channel: records arrive incrementally, the blocking pump task respects the consumer’s backpressure, and memory in flight is bounded by the channel depth rather than the namespace size.
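The backpressure behaviour of the streaming variants can be demonstrated with the standard library's bounded channel. This sketch uses `std::sync::mpsc::sync_channel` and a plain thread as stand-ins for tokio's mpsc channel and blocking task; `stream_records` is a hypothetical helper, not an emdb API.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Pump `records` through a bounded channel of capacity `depth`.
/// At most `depth` records are buffered at any moment, no matter how many
/// records the source holds.
fn stream_records(records: Vec<(String, String)>, depth: usize) -> Receiver<(String, String)> {
    let (tx, rx) = sync_channel(depth);
    thread::spawn(move || {
        for record in records {
            // `send` blocks while the channel is full, so a slow consumer
            // throttles the producer. An error means the receiver is gone.
            if tx.send(record).is_err() {
                break;
            }
        }
        // Dropping `tx` here closes the channel and ends the consumer's loop.
    });
    rx
}
```

A consumer simply iterates the receiver (`for (key, value) in rx { ... }`); dropping the receiver early makes the pump thread exit on its next send.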

§Cargo features

  • ttl (default) — per-record expiration and default_ttl.
  • nested — dotted-prefix group operations and Focus handles.
  • encrypt — AES-256-GCM + ChaCha20-Poly1305 at-rest encryption with raw-key or Argon2id-derived passphrase.
  • async — AsyncEmdb / AsyncNamespace wrappers via tokio’s spawn_blocking, plus streaming-iterator variants backed by tokio_stream::wrappers::ReceiverStream. Pulls in tokio (rt + rt-multi-thread + macros + sync) and tokio-stream.
  • bench-compare, bench-rocksdb, bench-redis — comparative bench peers (dev-only, never required by application builds).

Structs§

AsyncEmdb (async)
Cheap-clone async handle to an Emdb. Every method routes through tokio::task::spawn_blocking so emdb’s blocking I/O never stalls the async-task scheduler.
AsyncNamespace (async)
Cheap-clone async handle scoped to one named namespace inside an AsyncEmdb. The sync crate::Namespace is already Clone with Arc-shared internals; this type simply wraps it and threads every call through spawn_blocking.
Emdb
The primary embedded database handle.
EmdbBuilder
Builder for constructing an Emdb.
EmdbIter
Iterator over (key, value) pairs from Emdb::iter.
EmdbKeyIter
Iterator over keys from Emdb::keys.
EmdbRangeIter
Streaming range iterator returned by Emdb::range_iter and Emdb::range_prefix_iter.
EmdbStats
Point-in-time database statistics.
Focus (nested)
Scoped database view that prefixes keys with a dotted path segment.
LockHolder
Holder metadata read out of an existing lockfile body. Returned by crate::Emdb::lock_holder for the crate::Emdb::break_lock admin path.
Namespace
Cheap-clone handle scoped to one named namespace inside a single crate::Emdb.
NamespaceIter
Iterator over (key, value) pairs from Namespace::iter.
NamespaceKeyIter
Iterator over keys from Namespace::keys.
NamespaceRangeIter
Streaming range iterator returned by Namespace::range_iter / Namespace::range_prefix_iter.
Transaction
Closure-scoped transaction.
ValueRef
Zero-copy reference to a value stored in the database.

Enums§

Cipher (encrypt)
Selectable AEAD cipher. Both options use the same 32-byte key, 96-bit nonce, and 128-bit tag, so the on-disk envelope is byte-identical across choices — only the cipher-id bit in the page-store flags differs.
EncryptionInput (encrypt)
User-supplied keying material for the offline admin operations crate::Emdb::enable_encryption / crate::Emdb::disable_encryption / crate::Emdb::rotate_encryption_key and for the CLI tool.
Error
The top-level error type returned by every fallible operation in emdb.
FlushPolicy
How db.flush() interacts with concurrent flush requests and how each db.insert() interacts with durability.
Ttl
Time-to-live specification for a record.

Type Aliases§

Result
Convenient Result alias where the error type is fixed to Error.