Crate seerdb


seerdb - Research-grade embedded storage engine

A modern LSM-tree based key-value storage engine implementing 2018-2024 research on learned data structures, workload-aware optimization, and efficient key-value separation.

§Features

  • LSM-tree architecture: Write-optimized with efficient compaction
  • Durability: Write-ahead logging with configurable sync policies
  • Concurrency: Lock-free reads with concurrent writes
  • Observability: Built-in metrics, health checks, and structured logging
  • Key-Value Separation: WiscKey-style vLog for large values (reduces write amplification)
  • Background Compaction: Non-blocking async compaction for better write throughput
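The key-value separation idea can be illustrated with a toy model. The sketch below is not seerdb's implementation: `ToyStore`, `Slot`, and the `threshold` parameter are hypothetical names chosen for illustration. It only shows the general WiscKey-style pattern, in which large values are appended to a separate log and the sorted index keeps a small pointer, so that reorganizing the index does not rewrite the large values.

```rust
use std::collections::BTreeMap;

/// Where a value lives: inline in the index, or as a pointer into the value log.
#[derive(Debug, Clone, PartialEq)]
enum Slot {
    Inline(Vec<u8>),
    Pointer { offset: usize, len: usize },
}

/// Toy store: a sorted map stands in for the LSM-tree, a byte vector for the vLog.
struct ToyStore {
    index: BTreeMap<Vec<u8>, Slot>,
    vlog: Vec<u8>,
    threshold: usize, // hypothetical cutoff: values at least this large go to the vLog
}

impl ToyStore {
    fn new(threshold: usize) -> Self {
        ToyStore { index: BTreeMap::new(), vlog: Vec::new(), threshold }
    }

    fn put(&mut self, key: &[u8], value: &[u8]) {
        let slot = if value.len() >= self.threshold {
            // Large value: append to the log and remember where it landed.
            let offset = self.vlog.len();
            self.vlog.extend_from_slice(value);
            Slot::Pointer { offset, len: value.len() }
        } else {
            // Small value: keep it next to the key.
            Slot::Inline(value.to_vec())
        };
        self.index.insert(key.to_vec(), slot);
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        match self.index.get(key)? {
            Slot::Inline(v) => Some(v.clone()),
            Slot::Pointer { offset, len } => Some(self.vlog[*offset..*offset + *len].to_vec()),
        }
    }
}

fn main() {
    let mut store = ToyStore::new(16);
    store.put(b"small", b"tiny");
    store.put(b"large", &[7u8; 64]);
    assert_eq!(store.get(b"small"), Some(b"tiny".to_vec()));
    assert_eq!(store.get(b"large"), Some(vec![7u8; 64]));
    println!("vlog bytes: {}", store.vlog.len());
}
```

A real value log also needs garbage collection for overwritten or deleted values, which this sketch leaves out.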

§Quick Start

use seerdb::DB;

// Open database with default options
let db = DB::open("./my_database")?;

// Write data
db.put(b"hello", b"world")?;

// Read data
let value = db.get(b"hello")?;
assert_eq!(value, Some(bytes::Bytes::from("world")));

// Delete data
db.delete(b"hello")?;

§Configuration

The defaults work well for most cases. Customize only what you need:

use seerdb::DBOptions;

// Customize specific options
let db = DBOptions::default()
    .memtable_capacity(512 * 1024 * 1024)  // 512MB write buffer
    .open("./my_database")?;

// Or use a preset profile
let db = DBOptions::high_throughput()
    .open("./my_database")?;

See DBOptions for all configuration options and profiles.

§Architecture

seerdb uses an LSM-tree architecture with the following components:

  • Memtable: In-memory buffer using concurrent skiplist
  • WAL: Write-ahead log for durability
  • SSTable: Sorted string tables on disk with bloom filters
  • LSM Levels: 7 levels with exponential sizing (10x ratio)
  • VLog: Optional value log for key-value separation (large values)
  • Compaction: Background merge of SSTables to reduce read amplification
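To make the exponential level sizing concrete, the following sketch computes target capacities for 7 levels with a 10x ratio. The 64 MiB base size and the `level_targets` helper are assumptions for illustration only; the source does not state seerdb's actual base size or whether the first level is sized this way.

```rust
/// Target capacity of each level under a simple model: `base * ratio^level`.
fn level_targets(base_bytes: u64, ratio: u64, levels: u32) -> Vec<u64> {
    (0..levels).map(|level| base_bytes * ratio.pow(level)).collect()
}

fn main() {
    let base = 64 * 1024 * 1024; // hypothetical 64 MiB for the first level
    for (level, bytes) in level_targets(base, 10, 7).iter().enumerate() {
        println!("level {level}: {} MiB", bytes / (1024 * 1024));
    }
}
```

Under this model the last level accounts for about 90% of the total capacity, which is why most data in a full tree sits at the bottom.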

§Performance Characteristics

  • Writes: O(log n) in-memory + O(1) WAL append
  • Reads: O(log n) skiplist + O(levels) SSTable lookups with bloom filter optimization
  • Scans: Efficient via merge iteration over memtable + SSTables
  • Space Amplification: ~2x (typical LSM-tree)
  • Write Amplification: 10-30x (reduced with vLog for large values)
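The read path described above (memtable first, then tables, with Bloom filters used to skip tables) can be sketched as a self-contained toy. Everything here is hypothetical and much simpler than a real engine: `Bloom`, `Table`, and `get` are illustrative names, a `BTreeMap` stands in for both the skiplist and the on-disk tables, and the filter size and hash count are arbitrary.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Tiny Bloom filter: several seeded hashes over a fixed bit array.
struct Bloom {
    bits: Vec<bool>,
    hashes: u64,
}

impl Bloom {
    fn new(num_bits: usize, hashes: u64) -> Self {
        Bloom { bits: vec![false; num_bits], hashes }
    }

    fn index(&self, key: &[u8], seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        key.hash(&mut h);
        (h.finish() % self.bits.len() as u64) as usize
    }

    fn insert(&mut self, key: &[u8]) {
        for seed in 0..self.hashes {
            let i = self.index(key, seed);
            self.bits[i] = true;
        }
    }

    /// False means "definitely absent"; true means "possibly present".
    fn may_contain(&self, key: &[u8]) -> bool {
        (0..self.hashes).all(|seed| self.bits[self.index(key, seed)])
    }
}

/// One immutable sorted table plus its filter.
struct Table {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
    bloom: Bloom,
}

impl Table {
    fn from_pairs(pairs: &[(&str, &str)]) -> Self {
        let mut bloom = Bloom::new(1024, 3);
        let mut data = BTreeMap::new();
        for (k, v) in pairs {
            bloom.insert(k.as_bytes());
            data.insert(k.as_bytes().to_vec(), v.as_bytes().to_vec());
        }
        Table { data, bloom }
    }
}

/// Checks the memtable first, then tables from newest to oldest.
/// Returns the value and how many tables were actually searched.
fn get(
    memtable: &BTreeMap<Vec<u8>, Vec<u8>>,
    tables: &[Table],
    key: &[u8],
) -> (Option<Vec<u8>>, usize) {
    if let Some(v) = memtable.get(key) {
        return (Some(v.clone()), 0);
    }
    let mut searched = 0;
    for table in tables {
        if !table.bloom.may_contain(key) {
            continue; // the filter rules this table out without a lookup
        }
        searched += 1;
        if let Some(v) = table.data.get(key) {
            return (Some(v.clone()), searched);
        }
    }
    (None, searched)
}

fn main() {
    let mut memtable = BTreeMap::new();
    memtable.insert(b"hello".to_vec(), b"new".to_vec());
    let tables = vec![
        Table::from_pairs(&[("hello", "old"), ("apple", "1")]),
        Table::from_pairs(&[("zebra", "2")]),
    ];
    for key in ["hello", "zebra", "missing"] {
        let (value, searched) = get(&memtable, &tables, key.as_bytes());
        println!("{key}: {value:?} after searching {searched} table(s)");
    }
}
```

A Bloom filter never reports a stored key as absent, so skipping a table on a negative answer is safe; a false positive only costs one unnecessary table lookup.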

§Durability Guarantees

seerdb provides configurable durability via SyncPolicy:

Policy     Survives     Performance
SyncAll    Power loss   ~4 ms
SyncData   Power loss   ~5 µs Linux, ~4 ms macOS
Barrier    App crash    ~5 µs Linux, ~0.3 ms macOS
None       Nothing      ~4 µs

macOS note: SyncData is slow on macOS due to APFS. Use Barrier for high-throughput writes when power-loss durability isn’t required. See SyncPolicy for details.
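For readers unfamiliar with the distinction behind the SyncAll and SyncData names, the sketch below uses only the Rust standard library to show the two underlying flush calls on a log file. It is an illustration under stated assumptions, not seerdb's WAL code: `ToySync` and `append_record` are hypothetical names, and Barrier is omitted because the source does not say how it is implemented.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical policy enum for this sketch; not seerdb's `SyncPolicy`.
#[derive(Clone, Copy, Debug)]
enum ToySync {
    All,  // flush file content and metadata
    Data, // flush file content; metadata might not be synchronized
    None, // leave flushing to the operating system
}

/// Appends one record to a log file and applies the chosen sync step.
fn append_record(path: &Path, record: &[u8], sync: ToySync) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(record)?;
    match sync {
        ToySync::All => file.sync_all()?,
        ToySync::Data => file.sync_data()?,
        ToySync::None => {}
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("toy_wal_sync_demo.log");
    let _ = fs::remove_file(&path); // start from an empty log
    append_record(&path, b"put hello world\n", ToySync::All)?;
    append_record(&path, b"delete hello\n", ToySync::Data)?;
    append_record(&path, b"put a b\n", ToySync::None)?;
    println!("log size: {} bytes", fs::metadata(&path)?.len());
    fs::remove_file(&path)
}
```

The cost of each flush call depends on the operating system and file system, which is consistent with the platform differences listed in the table above.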

§Observability

Built-in metrics and health checks for production deployment:

// Get current database statistics
let stats = db.stats();
println!("Operations: {} reads, {} writes", stats.total_reads, stats.total_writes);

// Check database health
let health = db.health();
println!("Health: {:?}", health);

Re-exports§

pub use db::DBError;
pub use db::DBOptions;
pub use db::ReadOptions;
pub use db::WriteOptions;
pub use db::DB;
pub use batch::Batch;
pub use scan::Scan;
pub use scan::ScanIterator;
pub use snapshot::Snapshot;
pub use transaction::Transaction;
pub use transaction::TransactionConflict;
pub use merge_operator::MergeOperator;
pub use merge_operator::StringAppendOperator;
pub use health::CheckStatus;
pub use health::HealthCheck;
pub use health::HealthStatus;
pub use metrics::DBStats;
pub use db::BulkLoadOptions;
pub use db::BulkLoadStats;
pub use db::VerifyResult;

Modules§

batch
db
health
merge_operator
metrics
scan
Scan builder for flexible range queries
snapshot
Point-in-time consistent snapshots for seerdb
transaction
Optimistic Concurrency Control (OCC) Transactions

Macros§

fail_point
Failpoint macro - compiles to nothing without the feature

Enums§

CompressionType
Compression algorithm for SSTable blocks
RecoveryMode
Recovery mode for WAL replay during DB::open()
SyncPolicy
WAL durability policy controlling when data is persisted to disk.