GrumpyDB
A disk-based object storage engine written in Rust. GrumpyDB stores schema-less, JSON-like documents using page-based storage, B+Tree indexing, a write-ahead log (WAL) for durability, and single-writer/multiple-reader (SWMR) concurrency.
Features
| Feature | Status |
|---|---|
| Page-based storage (8 KiB pages, slotted layout, overflow) | ✅ Implemented |
| B+Tree index (search, insert, delete, range scan) | ✅ Implemented |
| Document model (JSON-like Value type, binary codec) | ✅ Implemented |
| Storage engine (CRUD API) | ✅ Implemented |
| Write-Ahead Log (crash recovery) | ✅ Implemented |
| Buffer pool (LRU cache) | ✅ Implemented |
| SWMR concurrency | ✅ Implemented |
| Page checksums (CRC32 integrity) | ✅ Implemented |
| Compaction (defrag + index rebuild) | ✅ Implemented |
| Variable-key B+Tree (secondary indexes) | ✅ Implemented |
| Collection abstraction (unit of storage) | ✅ Implemented |
| Secondary indexes (field-level queries) | ✅ Implemented |
| Multi-collection database | ✅ Implemented |
| Document references (cross-collection Ref, resolve, cycle detection) | ✅ Implemented |
| GrumpyShell interactive REPL | ✅ Implemented |
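The page checksum row can be illustrated with a small sketch. This is not GrumpyDB's implementation: only the 8 KiB page size and the use of CRC32 come from the table above. The header layout (checksum in the first 4 bytes, little-endian) and the names `seal_page` and `verify_page` are assumptions made for illustration.

```rust
/// Page size from the feature table (8 KiB).
const PAGE_SIZE: usize = 8192;
/// Assumed layout: the first 4 bytes of a page hold its checksum.
const CHECKSUM_LEN: usize = 4;

/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Compute the checksum over the page body and store it in the header.
fn seal_page(page: &mut [u8; PAGE_SIZE]) {
    let sum = crc32(&page[CHECKSUM_LEN..]);
    page[..CHECKSUM_LEN].copy_from_slice(&sum.to_le_bytes());
}

/// Return true when the stored checksum matches the page body.
fn verify_page(page: &[u8; PAGE_SIZE]) -> bool {
    let mut stored = [0u8; CHECKSUM_LEN];
    stored.copy_from_slice(&page[..CHECKSUM_LEN]);
    u32::from_le_bytes(stored) == crc32(&page[CHECKSUM_LEN..])
}
```

A reader would call `verify_page` after loading a page from disk and treat a mismatch as corruption; a writer would call `seal_page` just before flushing.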
Getting started
Prerequisites
- Rust (edition 2024)
Build
cargo build
Test
cargo test
Lint
cargo clippy
Usage
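// Module paths and call arguments below are assumed; check the crate docs for exact signatures.
use grumpydb::{GrumpyDb, Value};
use std::collections::BTreeMap;
use uuid::Uuid;

// Open (or create) a database at a placeholder path.
let mut db = GrumpyDb::open("./data").unwrap();
// Documents are keyed by UUID.
let key = Uuid::new_v4();
let value = Value::Object(BTreeMap::new());
db.insert(key, value).unwrap();
// Read the document back and check that it exists.
let doc = db.get(&key).unwrap();
assert!(doc.is_some());
db.close().unwrap();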
Note: The full CRUD API (`insert`, `get`, `update`, `delete`, `scan`) is functional with WAL durability, LRU buffer pool caching, SWMR concurrency, page checksums, and compaction. The `Database` API provides multi-collection support with secondary indexes.
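SWMR (single writer, multiple readers) means any number of readers may proceed in parallel while writes are exclusive. The sketch below shows that access pattern with the standard library's `RwLock`. It is not GrumpyDB's locking code: `SharedStore` and `concurrent_hits` are hypothetical names, a `u128` stands in for a UUID key, and a `String` stands in for a document.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for an engine handle that can be shared across threads.
#[derive(Clone, Default)]
struct SharedStore {
    inner: Arc<RwLock<BTreeMap<u128, String>>>,
}

impl SharedStore {
    /// Writers take the exclusive lock; only one can hold it at a time.
    fn insert(&self, key: u128, value: &str) {
        let mut map = self.inner.write().unwrap();
        map.insert(key, value.to_string());
    }

    /// Readers take the shared lock; many can hold it concurrently.
    fn get(&self, key: u128) -> Option<String> {
        let map = self.inner.read().unwrap();
        map.get(&key).cloned()
    }
}

/// Spawn `readers` threads that each look up `key`, and count the hits.
fn concurrent_hits(store: &SharedStore, key: u128, readers: usize) -> usize {
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let store = store.clone();
            thread::spawn(move || store.get(key).is_some())
        })
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .filter(|hit| *hit)
        .count()
}
```

Cloning the handle only clones the `Arc`, so every thread sees the same underlying map.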
GrumpyShell — Interactive REPL
GrumpyShell provides a JavaScript-like interactive shell for exploring GrumpyDB:
grumpy> use demo
grumpy > db.
grumpy > db..
Inserted: 3df9dde6-...
grumpy > db..
Index "by_age" created on field "age"
grumpy > db..
grumpy > db..
grumpy > db..
Inserted: a1b2c3d4-...
grumpy > db..
grumpy > db..
Features: relaxed JSON (unquoted keys, single quotes), secondary index queries, client-side filtering, document references ($ref()), reference resolution (resolve, resolveDeep), line editing with history.
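Deep reference resolution has to guard against documents that point back at each other. The following is a minimal sketch of one way to do that, by tracking the references currently being expanded. The `Value` type here is a simplified, hypothetical stand-in (not GrumpyDB's actual `Value`), and `Store`, `ResolveError`, and `resolve_deep` are illustrative names only.

```rust
use std::collections::{BTreeMap, HashSet};

/// Simplified, hypothetical value type (not GrumpyDB's actual `Value`).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Str(String),
    Object(BTreeMap<String, Value>),
    /// Reference to a document in another collection: (collection, id).
    Ref(String, u128),
}

/// All documents, keyed by (collection, id).
type Store = BTreeMap<(String, u128), Value>;

#[derive(Debug, PartialEq)]
enum ResolveError {
    Missing(String, u128),
    Cycle(String, u128),
}

/// Replace every `Ref` with the document it points to, recursively.
/// `path` holds the references currently being expanded, so a reference
/// that leads back to one of them is reported as a cycle.
fn resolve_deep(
    value: &Value,
    store: &Store,
    path: &mut HashSet<(String, u128)>,
) -> Result<Value, ResolveError> {
    match value {
        Value::Str(_) => Ok(value.clone()),
        Value::Object(fields) => {
            let mut out = BTreeMap::new();
            for (name, field) in fields {
                out.insert(name.clone(), resolve_deep(field, store, path)?);
            }
            Ok(Value::Object(out))
        }
        Value::Ref(coll, id) => {
            let key = (coll.clone(), *id);
            let target = store
                .get(&key)
                .ok_or_else(|| ResolveError::Missing(coll.clone(), *id))?;
            if !path.insert(key.clone()) {
                return Err(ResolveError::Cycle(coll.clone(), *id));
            }
            let resolved = resolve_deep(target, store, path);
            // Leaving this branch: the same document may be referenced again elsewhere.
            path.remove(&key);
            resolved
        }
    }
}
```

Because entries are removed from `path` on the way back up, two sibling fields may reference the same document without triggering a false cycle report.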
Demo App & Tutorial
The examples/taskman/ directory contains a fully documented task manager CLI that demonstrates every GrumpyDB feature:
- Tutorial — 7-chapter guide: getting started, data modeling, querying, updates, durability, performance, concurrency
- Cookbook — 7 self-contained recipes for common tasks (struct storage, iteration, filtering, bulk import, threading, compaction)
- Performance Guide — buffer pool architecture, tuning, and benchmarking
Architecture
┌───────────────────────────────────────────────┐
│ Public API (lib.rs)                           │
├───────────────────────────────────────────────┤
│ Database (database/) + Engine (engine.rs)     │
├───────────────────────────────────────────────┤
│ Collection (collection/) + Indexes            │
├───────────────┬───────────────┬───────────────┤
│ Document      │ Concurrency   │ Buffer        │
│ Model         │ (SWMR)        │ Pool          │
├───────────────┼───────────────┼───────────────┤
│ B+Tree        │ WAL           │ Page          │
│ Index         │               │ Manager       │
│ (primary.idx) │ (wal.log)     │ (data.db)     │
└───────────────┴───────────────┴───────────────┘
See docs/ARCHITECTURE.md for technical details.
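To make the buffer pool layer in the diagram more concrete, here is a minimal sketch of an LRU page cache sitting in front of the page manager. It is an illustration under assumptions, not the project's code: `LruPool` and its methods are hypothetical, page ids are assumed to be `u32`, and recency is tracked with a simple `VecDeque` (linear-time updates) rather than whatever structure GrumpyDB actually uses.

```rust
use std::collections::{HashMap, VecDeque};

const PAGE_SIZE: usize = 8192;

/// Hypothetical LRU page cache; GrumpyDB's real buffer pool may differ.
struct LruPool {
    capacity: usize,
    pages: HashMap<u32, Box<[u8; PAGE_SIZE]>>,
    /// Page ids ordered from least to most recently used.
    order: VecDeque<u32>,
}

impl LruPool {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, pages: HashMap::new(), order: VecDeque::new() }
    }

    /// Mark `id` as most recently used.
    fn touch(&mut self, id: u32) {
        self.order.retain(|&p| p != id);
        self.order.push_back(id);
    }

    /// Return a cached page, refreshing its recency on a hit.
    fn get(&mut self, id: u32) -> Option<&[u8; PAGE_SIZE]> {
        if self.pages.contains_key(&id) {
            self.touch(id);
        }
        self.pages.get(&id).map(|page| &**page)
    }

    /// Cache a page, evicting the least recently used one when full.
    /// Returns the id of the evicted page, if any.
    fn put(&mut self, id: u32, page: Box<[u8; PAGE_SIZE]>) -> Option<u32> {
        let mut evicted = None;
        if !self.pages.contains_key(&id) && self.pages.len() == self.capacity {
            evicted = self.order.pop_front();
            if let Some(victim) = evicted {
                self.pages.remove(&victim);
            }
        }
        self.pages.insert(id, page);
        self.touch(id);
        evicted
    }
}
```

A real pool would also track dirty pages and write an evicted page back to disk before dropping it; that step is omitted here to keep the eviction logic visible.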
Project structure
src/
├── lib.rs # Public API, re-exports
├── error.rs # GrumpyError, Result type
├── engine.rs # GrumpyDb — thin wrapper over Collection + WAL
├── naming.rs # Name validation: [a-z0-9_]{1,64}
├── database/ # Database — multi-collection management
│ └── mod.rs # Database struct, CRUD routing, shared WAL
├── collection/ # Collection — unit of document storage
│ └── mod.rs # Collection struct, raw CRUD, compact, secondary indexes
├── index/ # Secondary indexes on document fields
│ ├── mod.rs # SecondaryIndex struct, IndexDefinition, lookup, range_query
│ └── encoding.rs # Sortable binary encoding for B+Tree keys
├── page/ # 8 KiB page management
│ ├── mod.rs # Constants, PageHeader, PageType
│ ├── manager.rs # PageManager (I/O, free-list)
│ ├── slotted.rs # SlottedPage (variable-length tuples)
│ └── overflow.rs # Overflow page chains
├── btree/ # B+Tree index
│ ├── mod.rs # BTree struct, metadata
│ ├── node.rs # InternalNode, LeafNode (fixed UUID keys)
│ ├── ops.rs # search, insert, delete
│ ├── cursor.rs # BTreeCursor, range scans
│ ├── key.rs # Key encoding utilities
│ ├── var_node.rs # VarInternalNode, VarLeafNode (variable keys)
│ ├── var_ops.rs # VarBTree search/insert/delete
│ ├── var_tree.rs # VarBTree struct
│ └── var_cursor.rs # VarCursor, range scans
├── document/ # Document model
│ ├── mod.rs # Document struct
│ ├── value.rs # Value enum (JSON-like + Ref)
│ └── codec.rs # Binary encode/decode
├── wal/ # Write-Ahead Log
├── buffer/ # Buffer pool LRU cache
└── concurrency/ # SWMR locks
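The tree lists `index/encoding.rs` as a sortable binary encoding for B+Tree keys. The idea behind such an encoding is that comparing the encoded bytes gives the same order as comparing the original values, so the tree can compare raw byte strings. The sketch below shows two widely used tricks for integers and floats; the function names are hypothetical and the actual encoding in GrumpyDB may differ.

```rust
/// Mask for the most significant bit of a 64-bit word.
const SIGN_BIT: u64 = 1 << 63;

/// Encode an i64 so that byte-wise comparison matches numeric order:
/// flip the sign bit, then write big-endian.
fn encode_i64(n: i64) -> [u8; 8] {
    ((n as u64) ^ SIGN_BIT).to_be_bytes()
}

/// Inverse of `encode_i64`.
fn decode_i64(bytes: [u8; 8]) -> i64 {
    (u64::from_be_bytes(bytes) ^ SIGN_BIT) as i64
}

/// Encode an f64 so that byte-wise comparison matches numeric order
/// for non-NaN values: negative numbers have all bits flipped,
/// non-negative numbers have only the sign bit flipped.
fn encode_f64(x: f64) -> [u8; 8] {
    let bits = x.to_bits();
    let flipped = if bits & SIGN_BIT != 0 { !bits } else { bits ^ SIGN_BIT };
    flipped.to_be_bytes()
}
```

Big-endian output matters here: the most significant byte comes first, so lexicographic comparison of the byte arrays agrees with numeric comparison.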
License
MIT