Crate armdb

Embedded key-value storage engine optimized for NVMe.

Architecture: Sharded Bitcask (log-structured, append-only). Single process, multi-threaded. Sync read/write API.

Each tree/map owns its storage — one tree = one database directory.

§Tree types

| Type      | Index    | Values                        | Ordered |
|-----------|----------|-------------------------------|---------|
| ConstTree | SkipList | inline [u8; V]                | yes     |
| VarTree   | SkipList | ByteView (disk + cache)       | yes     |
| TypedTree | SkipList | typed T (in-memory, TypedRef) | yes     |
| ZeroTree  | SkipList | zerocopy T (in-memory)        | yes     |
| TypedMap  | HashMap  | typed T (in-memory, TypedRef) | no      |
| ZeroMap   | HashMap  | zerocopy T (in-memory)        | no      |
| ConstMap  | HashMap  | inline [u8; V]                | no      |
| VarMap    | HashMap  | ByteView (disk + cache)       | no      |

§Usage

let tree = ConstTree::<[u8; 16], 64>::open("data/users", Config::default())?;

tree.put(&key, &value)?;
let val = tree.get(&key);

tree.close()?;

§Durability

Writes are buffered in memory (write_buffer_size per shard, default 1 MB). put() returns as soon as the entry is copied into the buffer and the in-memory index is updated — no disk I/O on the write path.

This means unflushed data is lost on crash. To control durability:

  • flush_buffers() — flush write buffers to disk (no fsync)
  • close() — flush + fsync + write hint files
  • config.enable_fsync = true — fsync on every buffer flush
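The strictest option in code form. A minimal config fragment, assuming the field names quoted in this section (the exact Config layout may differ; the "data/durable" path is illustrative):

```rust
// Assumes the Config fields named in this section; adjust to the real API.
let mut config = Config::default();
config.write_buffer_size = 1 << 20; // 1 MB per shard (the stated default)
config.enable_fsync = true;         // fsync on every buffer flush
let tree = ConstTree::<[u8; 16], 64>::open("data/durable", config)?;
```

With enable_fsync set, every buffer flush pays a sync, so throughput drops in exchange for crash durability.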

For periodic flushing, combine with Compactor:

let tree = Arc::new(ConstTree::<[u8; 16], 64>::open("data/tree", config)?);
let t = tree.clone();
let _flusher = Compactor::start(
    move || { t.flush_buffers()?; Ok(0) },
    Duration::from_millis(1),
);

§Thread safety

All tree/map types are Send + Sync. Share via Arc for concurrent access:

let tree = Arc::new(ConstTree::<[u8; 8], 8>::open("data/tree", config)?);
let t = tree.clone();
std::thread::spawn(move || { t.put(&key, &value).unwrap(); });

  • ConstTree / ConstMap / ZeroTree / ZeroMap — lock-free reads (values inline in index)
  • TypedTree / TypedMap — lock-free reads (seize RCU guard, no mutex)
  • VarTree / VarMap — lock-free reads on cache hit, brief shard lock on miss
  • Writes acquire a per-shard mutex — different shards never contend

§Encryption at rest (feature encryption)

Page-level AES-256-GCM encryption. Key from env or direct config:

let key = PageCipher::key_from_env("ARMDB_KEY")?;
let mut config = Config::default();
config.encryption_key = Some(key);
let tree = ConstTree::<[u8; 16], 64>::open("data/encrypted", config)?;
// all data is transparently encrypted on disk

§Write hooks (secondary indexes)

Generic WriteHook<K> / TypedWriteHook<K, T> parameter for synchronous write and init notifications. Zero overhead when unused (NoHook default).

  • on_write — fires on every put/insert/delete/cas/update. Does not fire inside atomic() blocks.
  • on_init — fires once per live entry during collection open (after recovery). Enable via NEEDS_INIT = true. See [armdb/docs/hooks.md] for details.
  • NEEDS_OLD_VALUE — only affects VarTree/VarMap (skips disk I/O for old value when false). In-memory collections (Const/Typed/Zero) always provide the old value at zero cost.

impl WriteHook<[u8; 16]> for MyIndex {
    const NEEDS_OLD_VALUE: bool = true;
    const NEEDS_INIT: bool = true;

    fn on_write(&self, key: &[u8; 16], old: Option<&[u8]>, new: Option<&[u8]>) {
        // update secondary index incrementally
    }

    fn on_init(&self, key: &[u8; 16], value: &[u8]) {
        // populate secondary index at startup
    }
}

let tree = ConstTree::<[u8; 16], 64, MyIndex>::open_hooked("data/indexed", config, my_index)?;
tree.migrate(|_, _| MigrateAction::Keep)?; // triggers on_init for all entries

§Compaction

Compaction is not automatic. Dead bytes accumulate as entries are overwritten or deleted. Use Compactor to run it in the background:

use std::sync::Arc;
use std::time::Duration;

let tree = Arc::new(ConstTree::<[u8; 16], 64>::open("data/tree", config)?);
let t = tree.clone();
let _compactor = Compactor::start(move || t.compact(), Duration::from_secs(60));

Or call tree.compact() manually when needed.

§Iteration

All Tree types provide iter(), range(), range_bounds(), and prefix_iter() methods. Map types (HashMap index) do not support iteration.

Each method returns a dedicated iterator implementing Iterator + DoubleEndedIterator:

| Tree      | Iterator  | Item                                  |
|-----------|-----------|---------------------------------------|
| ConstTree | ConstIter | (K, [u8; V]) — copy                   |
| VarTree   | VarIter   | (K, ByteView) — RC, possible disk I/O |
| TypedTree | TypedIter | (K, &T) — reference, zero I/O         |
| ZeroTree  | ZeroIter  | (K, T) — copy, zero I/O               |

| Method                   | Description                                                  |
|--------------------------|--------------------------------------------------------------|
| iter()                   | All entries in index order                                   |
| range(start, end)        | Entries in [start, end) — start inclusive, end exclusive     |
| range_bounds(start, end) | Entries with custom Bound — Included, Excluded, or Unbounded |
| prefix_iter(prefix)      | Entries whose key starts with prefix                         |

for (key, value) in tree.iter() { /* ... */ }

let latest = tree.prefix_iter(&user_id).take(20).collect::<Vec<_>>();

for (key, value) in tree.range(&start_key, &end_key) { /* ... */ }

// Custom bounds: (5, 10] — exclude 5, include 10
use std::ops::Bound;
let entries: Vec<_> = tree.range_bounds(
    Bound::Excluded(&5u64.to_be_bytes()),
    Bound::Included(&10u64.to_be_bytes()),
).collect();

// DoubleEndedIterator — .rev() or .next_back()
let oldest = tree.prefix_iter(&user_id).rev().take(10);
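The range examples above encode integer keys big-endian. Assuming keys are compared byte-wise (the usual contract for ordered byte-key stores), only big-endian encoding makes byte order agree with numeric order; a std-only demonstration:

```rust
// Big-endian bytes sort (lexicographically) in the same order as the
// integers they encode; little-endian bytes do not.
fn main() {
    let ordered: Vec<[u8; 8]> = (0..1000u64).map(u64::to_be_bytes).collect();
    // Byte-wise order matches numeric order for every adjacent pair.
    assert!(ordered.windows(2).all(|w| w[0] < w[1]));
    // Little-endian breaks it: 256 sorts before 1 byte-wise.
    assert!(256u64.to_le_bytes() < 1u64.to_le_bytes());
}
```

The same reasoning applies to prefix_iter: a fixed-width big-endian prefix groups numerically adjacent keys together.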

§Complexity

| Operation                              | Complexity | Notes                                           |
|----------------------------------------|------------|-------------------------------------------------|
| next()                                 | O(1)       | follows SkipList level-0 forward pointer        |
| next_back()                            | O(log n)   | calls find_last_lt() — SkipList search from top |
| iter() / range() / prefix_iter() setup | O(log n)   | initial SkipList search                         |

VarIter: both next() and next_back() may additionally perform a pread on block-cache miss. Use warmup() to pre-populate the cache.

§Weakly-consistent semantics

Iterators do not create a snapshot. They are weakly-consistent:

  • Concurrent inserts/updates may be visible during iteration
  • Deleted entries (marked nodes) are automatically skipped
  • The seize guard prevents memory reclamation for the lifetime of the iterator — no use-after-free, but the index is not frozen

§Ordering: Config::reversed

Config::reversed controls the SkipList comparator direction.

| reversed       | iter() / prefix_iter() | .rev() / next_back() |
|----------------|------------------------|----------------------|
| true (default) | DESC (newest first)    | ASC (oldest first)   |
| false          | ASC (oldest first)     | DESC (newest first)  |

// reversed=true (default) — DESC: ideal for "newest first" pagination
let tree = ConstTree::<[u8; 16], 64>::open("data/posts", Config::default())?;
let latest = tree.prefix_iter(&user_id).take(20);    // newest 20
let oldest = tree.prefix_iter(&user_id).rev().take(5); // oldest 5

// reversed=false — ASC: natural key order
let mut config = Config::default();
config.reversed = false;
let tree = ConstTree::<[u8; 16], 64>::open("data/logs", config)?;
for (key, value) in tree.iter() { /* ascending order */ }

Keys are stored on disk as-is. reversed can be changed between restarts without migration — it only affects in-memory index ordering.

§Prefix sharding

let mut config = Config::default();
config.shard_prefix_bits = 32;
let tree = ConstTree::<[u8; 16], 64>::open("data/users", config)?;
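The snippet above only sets the knob. The idea behind prefix sharding is that keys sharing their leading bits land in the same shard; armdb's actual shard selection is internal, so the following is only a hypothetical std-only sketch of that routing idea:

```rust
// Hypothetical illustration of prefix-based shard routing: take the top
// `prefix_bits` bits of the key and map them onto the shard count, so any
// two keys that share those leading bits always hit the same shard.
fn shard_for(key: &[u8; 16], prefix_bits: u32, num_shards: u64) -> u64 {
    let hi = u64::from_be_bytes(key[..8].try_into().unwrap());
    (hi >> (64 - prefix_bits)) % num_shards
}

fn main() {
    let a = [0xAA, 0xBB, 0xCC, 0xDD, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0, 0];
    let b = [0xAA, 0xBB, 0xCC, 0xDD, 9, 9, 9, 9, 0, 0, 0, 0, 0, 0, 0, 0];
    // Same 32-bit prefix, so both keys route to the same shard and a
    // prefix scan over 0xAABBCCDD stays shard-local.
    assert_eq!(shard_for(&a, 32, 16), shard_for(&b, 32, 16));
}
```

Under this scheme a prefix_iter over a 4-byte prefix with shard_prefix_bits = 32 never crosses shard boundaries, which is presumably the point of the setting.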

§Caveats

  • Iterators are weakly-consistent, not snapshot. Concurrent inserts and updates may be visible during iteration. Deleted entries are skipped. The seize guard prevents use-after-free but does not freeze the index.
  • CAS/update holds shard lock during possible disk I/O. VarTree::cas, VarTree::update, VarMap::cas, and VarMap::update read the current value under the shard mutex. On a block-cache miss this issues a pread — blocking all writes to that shard until the read completes. Pre-warm the cache or size it to cover the working set.
  • migrate() on HashMap trees allocates O(keys/shards) memory. ConstMap::migrate and VarMap::migrate collect all shard keys into a Vec before iterating. For very large shards this causes a transient memory spike. SkipList trees (ConstTree, VarTree) are not affected.

§Shutdown

// 1. Stop background tasks first
compactor.stop();
// 2. Close the tree (writes hint files, flushes, fsyncs)
Arc::try_unwrap(tree).expect("no other references").close()?;

If close() is not called (e.g. the tree is dropped via Arc::drop), Shard::Drop still flushes write buffers, fsyncs, and writes hint files automatically — no data loss and no slow recovery on next startup.

§Features

  • typed-tree — enables TypedTree, TypedMap, Codec and codec implementations
  • rapira-codec — RapiraCodec for rapira serialization (implies typed-tree)
  • bytemuck-codec — BytemuckCodec / BytemuckSliceCodec (implies typed-tree)
  • bitcode-codec — BitcodeCodec (implies typed-tree)
  • encryption — AES-256-GCM page-level encryption at rest
  • replication — leader/follower log-shipping replication
  • armour — integration with armour ecosystem: Db, schema-versioned migrations, binary RPC server (TCP/UDS). See armour module docs.
  • hot-path-tracing — per-operation tracing::trace! calls

Re-exports§

pub use codec::Codec;
pub use codec::RapiraCodec;
pub use codec::ZerocopyCodec;
pub use codec::BytemuckCodec;
pub use codec::BytemuckSliceCodec;
pub use codec::BytemuckVec;
pub use compaction::Compactor;
pub use armour_core as core;

Modules§

armour
Integration with the armour ecosystem (feature armour).
codec
compaction
docs
In-depth documentation for armdb.
replication

Macros§

const_map
Expands to ConstMap<<V as CollectionMeta>::SelfId, { size_of::<V>() } [, H]>.
const_tree
Expands to ConstTree<<V as CollectionMeta>::SelfId, { size_of::<V>() } [, H]>.
impl_key_bytemuck
Implement Key for a type that derives bytemuck::{Pod, Zeroable}.
impl_key_zerocopy
Implement Key for a type that derives zerocopy::{FromBytes, IntoBytes, Immutable}.
typed_map
Expands to TypedMap<<V as CollectionMeta>::SelfId, V, C [, H]>.
typed_tree
Expands to TypedTree<<V as CollectionMeta>::SelfId, V, C [, H]>.
var_map
Expands to VarMap<<V as CollectionMeta>::SelfId [, H]>.
var_tree
Expands to VarTree<<V as CollectionMeta>::SelfId [, H]>.
zero_map
Expands to ZeroMap<<V as CollectionMeta>::SelfId, { size_of::<V>() }, V [, H]>.
zero_tree
Expands to ZeroTree<<V as CollectionMeta>::SelfId, { size_of::<V>() }, V [, H]>.

Structs§

Config
Database configuration.
ConstIter
Iterator over entries in a ConstTree. Returned by iter(), range(), and prefix_iter().
ConstMap
A map with fixed-size keys and values. All values are stored inline in a per-shard HashMap. Reads never touch disk — zero I/O reads. O(1) lookup instead of O(log n) SkipList. No ordered iteration — use ConstTree if you need prefix/range scans.
ConstMapShard
Handle for atomic multi-key operations within a single shard. Obtained via ConstMap::atomic. The shard + index locks are held for the lifetime of this struct — keep the closure short.
ConstShard
Handle for atomic multi-key operations within a single shard. Obtained via ConstTree::atomic. The shard lock is held for the lifetime of this struct — keep the closure short.
ConstTree
A tree with fixed-size keys and values. All values are stored inline in SkipList nodes. Reads never touch disk — zero I/O reads.
EntryHeader
On-disk entry header. 16 bytes, 8-byte aligned, no padding.
HintEntry
A parsed hint entry.
NoHook
Default no-op hook. All branches are eliminated at compile time.
TreeMeta
Metadata about a named tree/map collection.
TypedIter
Iterator over entries in a TypedTree. Returned by iter(), range(), and prefix_iter().
TypedMap
A map with fixed-size keys and typed values T. Values are encoded via a Codec for disk persistence but stored as T in memory — reads never touch disk and return TypedRef<T> (guard-protected reference).
TypedMapShard
Handle for atomic multi-key operations within a single shard. Obtained via TypedMap::atomic. The shard + index locks are held for the lifetime of this struct — keep the closure short.
TypedRef
Guard-protected reference to a typed value inside a TypedTree.
TypedShard
Handle for atomic multi-key operations within a single shard. Obtained via TypedTree::atomic. The shard lock is held for the lifetime of this struct — keep the closure short.
TypedTree
A tree with fixed-size keys and typed values T. Values are encoded via a Codec for disk persistence but stored as T in memory — reads never touch disk and return TypedRef<T> (guard-protected reference).
ZeroIter
Iterator over entries in a ZeroTree. Wraps ConstIter and converts [u8; V] values to T via zerocopy.
ZeroMap
A map with fixed-size keys and zerocopy-compatible typed values.
ZeroMapShard
Handle for atomic multi-key operations within a single shard. Obtained via ZeroMap::atomic.
ZeroShard
Handle for atomic multi-key operations within a single shard. Obtained via ZeroTree::atomic.
ZeroTree
A tree with fixed-size keys and zerocopy-compatible values.

Enums§

DbError
MigrateAction
Action returned by the migrate() callback for each entry.

Constants§

TOMBSTONE_BIT

Traits§

CollectionMeta
Trait for types that carry enough metadata to describe an armdb collection.
Key
Key metadata trait — describes key encoding for armdb collections.
TypedWriteHook
Typed write hook for TypedTree.
WriteHook
Trait for receiving write notifications from tree/map operations.

Functions§

compute_crc32
Compute CRC32 over gsn || value_len || key || value.
entry_size
Compute the total on-disk size of an entry including padding to 8-byte alignment.
hint_entry_size
Size of a single hint entry: GSN(8) + Key(key_len) + Offset(8) + Len(4).
parse_hint_entries
Parse hint entries from raw hint file bytes.
serialize_entry
Serialize a complete entry (header + key + value + padding) into a Vec<u8>.

Type Aliases§

DbResult