Cuendillar

Cuendillar is an embedded, persistent key–value storage engine written in Rust.
It is designed to preserve application state safely and predictably across restarts and crashes, without requiring an external database.

Inspired by cuendillar (heartstone) — a material that cannot be broken or degraded — the project focuses on durability, immutability, and crash safety.

Motivation

Many applications need reliable local state:

Checkpoints and offsets
Persistent caches
Offline-first or embedded applications

Cuendillar targets these use cases by providing a lightweight, embeddable storage engine with a simple API.

Design Overview

Cuendillar follows an LSM-tree–based architecture optimized for fast writes and durable storage.

Key components include:

Memtable — in-memory structure for recent writes
Write-Ahead Log (WAL) — append-only durability layer
SSTables — immutable sorted files on disk
Compaction — background merge process
Crash Recovery — deterministic rebuild from WAL + SSTables
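
To make the roles of these components concrete, here is a minimal in-memory sketch of the general LSM write and read path. It is not Cuendillar's implementation: ToyLsm, its fields, and the entry-count flush threshold are hypothetical names chosen for illustration, and the "WAL" and "SSTables" are plain vectors rather than files.

```rust
use std::collections::BTreeMap;

/// Toy LSM store (hypothetical): WAL, memtable, and sorted runs, all in memory.
struct ToyLsm {
    wal: Vec<(Vec<u8>, Vec<u8>)>,           // append-only log of unflushed writes
    memtable: BTreeMap<Vec<u8>, Vec<u8>>,   // recent writes, sorted by key
    sstables: Vec<Vec<(Vec<u8>, Vec<u8>)>>, // immutable sorted runs, oldest first
    memtable_limit: usize,                  // flush threshold, counted in entries
}

impl ToyLsm {
    fn new(memtable_limit: usize) -> Self {
        ToyLsm { wal: Vec::new(), memtable: BTreeMap::new(), sstables: Vec::new(), memtable_limit }
    }

    fn put(&mut self, key: &[u8], value: &[u8]) {
        // 1. Log first, so the write could be replayed after a crash.
        self.wal.push((key.to_vec(), value.to_vec()));
        // 2. Apply the write to the memtable.
        self.memtable.insert(key.to_vec(), value.to_vec());
        // 3. Flush a full memtable into a new immutable sorted run.
        if self.memtable.len() >= self.memtable_limit {
            let run: Vec<_> = std::mem::take(&mut self.memtable).into_iter().collect();
            self.sstables.push(run);
            self.wal.clear(); // flushed entries no longer need the log
        }
    }

    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        // Newest data wins: memtable first, then runs from newest to oldest.
        if let Some(v) = self.memtable.get(key) {
            return Some(v.as_slice());
        }
        for run in self.sstables.iter().rev() {
            if let Ok(i) = run.binary_search_by(|(k, _)| k.as_slice().cmp(key)) {
                return Some(run[i].1.as_slice());
            }
        }
        None
    }

    /// Compaction: merge all runs into one, keeping the newest value per key.
    fn compact(&mut self) {
        let mut merged = BTreeMap::new();
        for run in self.sstables.drain(..) {
            for (k, v) in run {
                merged.insert(k, v); // later (newer) runs overwrite earlier ones
            }
        }
        self.sstables.push(merged.into_iter().collect());
    }
}

fn main() {
    let mut db = ToyLsm::new(2);
    db.put(b"a", b"1");
    db.put(b"b", b"2"); // second entry reaches the limit and triggers a flush
    db.put(b"a", b"3"); // newer version of "a" lives in the memtable
    db.put(b"c", b"4"); // triggers a second flush
    db.compact();
    println!("runs after compaction: {}", db.sstables.len());
    println!("a = {:?}", db.get(b"a"));
}
```

A real engine would also persist the log and the runs, track sequence numbers, and record tombstones; the sketch only shows why reads consult the newest data first and why compaction can drop overwritten versions.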

Features

Durable writes with configurable WAL sync modes
Pluggable memtable implementations (btree, vector, hash)
Bloom filters for read optimization
Sorted iteration and range scans
Background compaction and cleaning
Configurable LSM-tree layout

Quick Start

use cuendillar::{Database, DbConfig};
use std::sync::Arc;

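// EngineError implements Debug but not std::error::Error (see the EngineError section),
// so each error is converted to a String before `?` boxes it.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = DbConfig::get_config().map_err(|e| format!("{:?}", e))?;
    let db = Database::new(config).map_err(|e| format!("{:?}", e))?;

    // put returns a sequence number, which is not needed here.
    db.put(b"key", b"value").map_err(|e| format!("{:?}", e))?;

    // get returns Option<OwnedEntry>; OwnedEntry implements Debug.
    if let Some(entry) = db.get(b"key").map_err(|e| format!("{:?}", e))? {
        println!("{:?}", entry);
    }

    Ok(())
}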

Cuendillar library guide (third-party entry points)

This section describes the stable surface exposed from lib.rs for applications and bindings.

For engine tuning, see docs/CONFIG_TUNING.md. For benchmarks, see BENCHMARK.md.

Add as a dependency

Path (local development)

[dependencies]
cuendillar = { path = "../cuendillar" }

Crates.io — use the published name and version when the crate is released.

Crate root re-exports

The following names are re-exported at the crate root (cuendillar::):

Name	Role
`Database`	Main handle: open, get, put, delete, range iterator.
`DbConfig`	Full engine configuration (paths, WAL, memtable, bloom, index, compaction, cleaner, version manager).
`EngineError`	Error type returned by `Database` operations and `Database::new`.
`OwnedEntry`	Owned key–value or tombstone returned by `get` and iterators.
`DatabaseIterator`	Trait implemented by the boxed iterator from `Database::iter`.
`config`	Module re-export; same as `cuendillar::config` for nested config types (`wal_config`, `memtable_config`, …).

Submodules such as database::db_engine remain crate-private; depend only on the items above unless you fork the crate.

Configuration

File — By default, DbConfig::get_config() reads ./default_config.toml. Override with the CONFIG_PATH environment variable.
Programmatic defaults — DbConfig::get_dynamic_defaults(root_dir, sstable_root_dir) fills in path-dependent defaults; merge with your own Figment / serde layer if you do not use a TOML file.
Validation — Call config.validate() before use, or rely on get_config() which validates after merge.

use cuendillar::{Database, DbConfig};
use std::sync::Arc;

let config = DbConfig::get_config()?;
let db = Database::new(config)?;
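
The lookup order described for the configuration file (the CONFIG_PATH environment variable first, then ./default_config.toml) can be sketched with the standard library alone. resolve_config_path is a hypothetical helper, not part of the crate, and treating an empty variable as unset is an assumption of this sketch.

```rust
use std::env;
use std::path::PathBuf;

/// Default file name taken from the Configuration section.
const DEFAULT_CONFIG: &str = "./default_config.toml";

/// Hypothetical helper: an explicit override wins, otherwise use the default.
/// Assumption: an empty override is treated the same as no override.
fn resolve_config_path(override_path: Option<String>) -> PathBuf {
    match override_path {
        Some(p) if !p.is_empty() => PathBuf::from(p),
        _ => PathBuf::from(DEFAULT_CONFIG),
    }
}

fn main() {
    // env::var returns Err when the variable is unset or not valid Unicode.
    let path = resolve_config_path(env::var("CONFIG_PATH").ok());
    println!("would load configuration from {}", path.display());
}
```

Logging the resolved path at startup in this way can help confirm which file the engine is about to read.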

Database

Database is Clone; clones share the same underlying engine (Arc + RwLock).
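
The clone-shares-state behavior follows from the Arc + RwLock pattern. The sketch below illustrates that pattern with a stand-in type; Handle and its BTreeMap "engine" are hypothetical and only mirror the shape described here, not the crate's internals.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Stand-in for a cloneable database handle (hypothetical type).
#[derive(Clone)]
struct Handle {
    engine: Arc<RwLock<BTreeMap<Vec<u8>, Vec<u8>>>>,
}

impl Handle {
    fn new() -> Self {
        Handle { engine: Arc::new(RwLock::new(BTreeMap::new())) }
    }

    fn put(&self, key: &[u8], value: &[u8]) {
        // Writers take the exclusive lock.
        self.engine.write().unwrap().insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        // Readers take the shared lock, so several reads can proceed at once.
        self.engine.read().unwrap().get(key).cloned()
    }
}

fn main() {
    let db = Handle::new();
    let writer = db.clone(); // cheap: only the Arc reference count changes

    // A write made through the clone on another thread...
    thread::spawn(move || writer.put(b"k", b"v")).join().unwrap();

    // ...is visible through the original handle.
    assert_eq!(db.get(b"k"), Some(b"v".to_vec()));
}
```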

Opening

Database::new(config: Arc<DbConfig>) -> Result<Self, EngineError>
Opens or creates storage under the configured directories, replays the WAL, and starts background workers as implemented by the engine.

Reads and writes

Method	Signature (simplified)	Notes
`get`	`fn get(&self, key: &[u8]) -> Result<Option<OwnedEntry>, EngineError>`	Shared read lock on the engine.
`put`	`fn put(&self, key: &[u8], value: &[u8]) -> Result<u64, EngineError>`	WAL + memtable; returns a sequence number. Empty `value` is a tombstone (logical delete).
`delete`	`fn delete(&self, key: &[u8]) -> Result<u64, EngineError>`	Writes a tombstone (same as `put` with empty value).
`iter`	`fn iter(&self, start: Option<&[u8]>, end: Option<&[u8]>) -> Result<Box<dyn DatabaseIterator>, EngineError>`	Inclusive start, exclusive end. Full range: `iter(None, None)`. If both bounds are `Some` and `start > end`, returns `EngineError::InvalidRange`. The read lock is held only while building the iterator.
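
The bound semantics of iter (inclusive start, exclusive end, an error when start > end) can be modeled on a standard BTreeMap. This is a sketch of the documented contract only; scan_keys and the InvalidRange struct are hypothetical stand-ins, not the crate's types.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Stand-in for the InvalidRange error variant described above.
#[derive(Debug, PartialEq)]
struct InvalidRange;

/// Returns the keys in [start, end): start inclusive, end exclusive.
/// `None` leaves that side of the range unbounded.
fn scan_keys(
    map: &BTreeMap<Vec<u8>, Vec<u8>>,
    start: Option<&[u8]>,
    end: Option<&[u8]>,
) -> Result<Vec<Vec<u8>>, InvalidRange> {
    // Reject reversed bounds up front (BTreeMap::range would panic on them).
    if let (Some(s), Some(e)) = (start, end) {
        if s > e {
            return Err(InvalidRange);
        }
    }
    let lo = start.map_or(Bound::Unbounded, Bound::Included);
    let hi = end.map_or(Bound::Unbounded, Bound::Excluded);
    Ok(map.range::<[u8], _>((lo, hi)).map(|(k, _)| k.clone()).collect())
}

fn main() {
    let mut map = BTreeMap::new();
    for key in ["a", "b", "c"] {
        map.insert(key.as_bytes().to_vec(), b"x".to_vec());
    }
    // "c" is excluded because the end bound is exclusive.
    let keys = scan_keys(&map, Some("a".as_bytes()), Some("c".as_bytes())).unwrap();
    println!("{} keys in range", keys.len());
}
```

Note that equal bounds are accepted and produce an empty result, since only start > end is described as invalid.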

Tombstones and deletes

put(key, &[]) and delete(key) both record deletion markers; physical removal happens during compaction.
get returns Some(OwnedEntry::Tombstone { .. }) when the latest visible version for that key is a tombstone, Some(OwnedEntry::Row { .. }) when the key has a value, and None when the key is absent. Application code usually treats tombstones like a missing key for business logic.
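
Application code that treats tombstones as missing keys can collapse the three outcomes of get into a plain Option. The sketch below uses a stand-in enum that mirrors the variants named in this guide; the field types (u64 and Vec<u8>) and the helper name live_value are assumptions.

```rust
/// Stand-in mirroring the OwnedEntry variants described in this guide.
/// Field types are assumed for illustration.
#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)] // seq_no and key are carried along but not read here
enum Entry {
    Row { seq_no: u64, key: Vec<u8>, value: Vec<u8> },
    Tombstone { seq_no: u64, key: Vec<u8> },
}

/// Collapse "row / tombstone / absent" into "value or nothing".
fn live_value(result: Option<Entry>) -> Option<Vec<u8>> {
    match result {
        Some(Entry::Row { value, .. }) => Some(value),
        // A tombstone means the key was deleted, so report it as missing.
        Some(Entry::Tombstone { .. }) | None => None,
    }
}

fn main() {
    let row = Entry::Row { seq_no: 1, key: b"k".to_vec(), value: b"v".to_vec() };
    let deleted = Entry::Tombstone { seq_no: 2, key: b"k".to_vec() };

    println!("{:?}", live_value(Some(row)));
    println!("{:?}", live_value(Some(deleted)));
    println!("{:?}", live_value(None));
}
```

Code that needs to distinguish "deleted" from "never written" (for example, replication or auditing) would match on the variants directly instead of using such a helper.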

OwnedEntry

Enum of:

Row { seq_no, key, value } — live key–value.
Tombstone { seq_no, key } — deleted key at that sequence.

Helpers include get_key(), get_seq_no(), encode / decode for a binary record layout, and Debug.

DatabaseIterator

Returned as Box<dyn DatabaseIterator>. The trait provides:

peek, next_owned, first_entry, last_entry (see database/iterator for slice vs owned semantics).
as_iterator() — adapter to Iterator<Item = OwnedEntry>.

Box<dyn DatabaseIterator> also implements Iterator<Item = OwnedEntry> (delegating to next_owned), for example:

let mut it = db.iter(Some(b"a"), Some(b"z"))?;
while let Some(entry) = it.next() {
    let _key = entry.get_key();
}

EngineError

General
Internal(String)
PosionError          // RwLock poisoned
IoError(std::io::Error)
InvalidRange         // bad iterator bounds

EngineError implements Debug but not std::error::Error or Display. To interoperate with code that expects those traits, convert it with format!("{:?}", err) or wrap it in your application's error type.
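
One way to wrap a Debug-only error in an application error type is sketched below. The EngineError here is a local stand-in with a subset of the variants listed above, so the example compiles on its own; StorageError, failing_scan, and run are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

/// Local stand-in with a subset of the listed variants (Debug only).
#[derive(Debug)]
#[allow(dead_code)] // Internal is shown for shape but not constructed here
enum EngineError {
    Internal(String),
    InvalidRange,
}

/// Application error that adds Display and Error on top of the Debug-only type.
#[derive(Debug)]
struct StorageError(EngineError);

impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Reuse the Debug output, since that is all the inner type provides.
        write!(f, "storage engine error: {:?}", self.0)
    }
}

impl Error for StorageError {}

impl From<EngineError> for StorageError {
    fn from(e: EngineError) -> Self {
        StorageError(e)
    }
}

/// Stands in for any engine call that fails.
fn failing_scan() -> Result<(), EngineError> {
    Err(EngineError::InvalidRange)
}

fn run() -> Result<(), Box<dyn Error>> {
    // After the conversion, `?` can box the error as usual.
    failing_scan().map_err(StorageError::from)?;
    Ok(())
}

fn main() {
    if let Err(e) = run() {
        println!("{e}");
    }
}
```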

Threading and async

The handle is designed for shared access across threads via Clone and interior mutability on the engine. Individual method contracts (e.g. how much true concurrency you get on writes) follow the current RwLock usage inside the engine. There is no async API in the public crate root; run blocking calls on a thread pool if needed.
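
A minimal way to keep blocking calls off a latency-sensitive thread is to run them on a worker thread and receive the result over a channel. The sketch below uses only the standard library; spawn_blocking is a hypothetical helper defined here (it spawns one thread per call rather than using a pool), and the closure stands in for a database call.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs a blocking closure on its own thread and returns a receiver for the result.
fn spawn_blocking<T, F>(work: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be dropped; ignore the send error in that case.
        let _ = tx.send(work());
    });
    rx
}

fn main() {
    // `blocking_lookup` stands in for a call such as a database read.
    let blocking_lookup = || -> Option<Vec<u8>> { Some(b"value".to_vec()) };

    let rx = spawn_blocking(blocking_lookup);
    // The caller is free to do other work here before collecting the result.
    let result = rx.recv().expect("worker thread exited without sending");
    println!("{:?}", result);
}
```

Inside an async runtime, the runtime's own blocking-task facility is usually the better fit, since it reuses a bounded set of threads.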

Stability

Public types and methods on Database and the re-exports listed above are the intended integration surface. Internal modules may change between versions. For reproducible workloads and CLI-style benchmarks, see the db_bench_rocksdb_compatible bench and benches/doc.md.

Example application (path dependency)

The workspace member examples/cuendillar_example_kv is an interactive kv> shell (and optional one-shot subcommands) that depends on cuendillar like an external crate (path = "../.."). It covers config loading, CRUD and scans. See /examples/cuendillar_example_kv/README.md.

Benchmarks

Cuendillar provides three benchmarking binaries:

Benchmark	Purpose
`db_workload_operation`	Trace replay with latency histograms
`db_workload_operation_summerize`	Lightweight summary report
`db_bench_rocksdb_compatible`	RocksDB-style benchmarks

Example

cargo bench --bench db_bench_rocksdb_compatible --   --benchmarks=fillrandom,readrandom   --num=1000000   --seed=1

Benchmark Snapshot (2026-03-23)

Dataset	Write Throughput	Read Throughput
100M	~296K ops/s	~10K ops/s
50M	~297K ops/s	~12K ops/s
30M	~308K ops/s	~115K ops/s
10M	~307K ops/s	~149K ops/s
1M	~340K ops/s	~557K ops/s

For more details, see FULL_REPORT, BENCHMARK_DETAILS, and ROCKS_DB_BENCHMARK_DETAILS.

Testing

Run integration tests (single-threaded due to shared DB directory):

cargo test -- --test-threads=1

Or run a specific test target:

cargo test --test db_engine -- --test-threads=1

For more details, see INTEGRATION_TEST.

Configuration Highlights

Important rules:

compaction.root_dir must equal cleaning.root_dir
WAL file size ≥ 10× max payload size
Memtable size ≥ 1 MB
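
The three rules above can be expressed as a small pre-flight check. This is a sketch, not the crate's validate() logic: the Settings struct and its field names are hypothetical, and reading "1 MB" as 1024 × 1024 bytes is an assumption.

```rust
/// Assumption: "1 MB" is interpreted as 1024 * 1024 bytes.
const MIB: u64 = 1024 * 1024;

/// Hypothetical flattened view of the settings that the rules mention.
struct Settings {
    compaction_root_dir: String,
    cleaning_root_dir: String,
    wal_file_size: u64,
    max_payload: u64,
    memtable_size: u64,
}

/// Returns one message per violated rule; an empty list means all rules hold.
fn violations(s: &Settings) -> Vec<&'static str> {
    let mut found = Vec::new();
    if s.compaction_root_dir != s.cleaning_root_dir {
        found.push("compaction.root_dir must equal cleaning.root_dir");
    }
    // saturating_mul avoids overflow for very large payload limits.
    if s.wal_file_size < s.max_payload.saturating_mul(10) {
        found.push("WAL file size must be at least 10x the max payload");
    }
    if s.memtable_size < MIB {
        found.push("memtable must be at least 1 MB");
    }
    found
}

fn main() {
    let settings = Settings {
        compaction_root_dir: "data/sst".to_string(),
        cleaning_root_dir: "data/sst".to_string(),
        wal_file_size: 64 * MIB,
        max_payload: 4 * MIB,
        memtable_size: 8 * MIB,
    };
    for problem in violations(&settings) {
        println!("config problem: {problem}");
    }
}
```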

Key tunables:

Area	Impact
WAL sync	durability vs performance
Memtable size	write batching vs memory
Bloom bits	memory vs read amplification

See full guide: docs/CONFIG_TUNING.md

Project Structure

src/                 # Core engine
benches/             # Benchmark implementations
tests/               # Integration tests
docs/                # Documentation
configs/             # Example configs
bench_result/        # Benchmark outputs
examples/            # Demo applications

Use Cases

Embedded systems
Local-first apps
Persistent caches
CLI tools
Background agents

Documentation

Configuration guide → docs/CONFIG_TUNING.md
Benchmark details → docs/BENCHMARK.md
Benchmark Snapshot → BENCHMARK.md
RocksDB Benchmarks → docs/ROCKS_DB_BENCHMARK.md
Integration tests → docs/DB_ENGINE_INTEGERATION_TEST.md

cuendillar 0.1.0