# GrumpyDB

A document-oriented object database written in Rust.
GrumpyDB stores schema-less JSON-like documents on disk with B+Tree indexing, page-based storage, WAL durability, and multi-tenant isolation. It can be used as an embedded library (linked directly into your Rust app) or as a standalone server accessed over TCP+TLS with JWT authentication and role-based access control.
## Quick Start

### Embedded — no server needed

```text
grumpy> use myapp
Switched to database "myapp"
grumpy> db.createCollection("users")
Collection "users" created
grumpy> db.users.insert({...})
Inserted: 3df9dde6-...
grumpy> db.users.insert({...})
Inserted: e7f8a9b0-...
grumpy> db.users.find()
grumpy> db.users.createIndex("by_age", "age")
Index "by_age" created on field "age"
grumpy> db.users.query("by_age", ...)
grumpy> db.users.count()
```
### Client/Server — multi-tenant with auth

```shell
# Terminal 1: Start the server (first start requires --bootstrap-password)
grumpydb-server --bootstrap-password <password>

# Terminal 2: Connect with the shell
grumpy-repl localhost:6380
```

```text
Connected to GrumpyDB at localhost:6380
Authenticated as admin@_system
grumpy> use myapp
Switched to database "myapp"
grumpy> db.users.insert({...})
Inserted: a1b2c3d4-...
grumpy> db.users.count()
1
```
## Use as a Rust Library

Add GrumpyDB to your `Cargo.toml`:

```toml
[dependencies]
grumpydb = "5"
```
### Single-collection (simple key-value)

```rust
use grumpydb::{Database, Value};
use uuid::Uuid;
use std::collections::BTreeMap;

let mut db = Database::open("./data").unwrap();
db.create_collection("users").unwrap();

let key = Uuid::new_v4();
let doc = Value::Object(BTreeMap::new()); // an empty JSON-like document
db.insert("users", key, doc).unwrap();

let result = db.get("users", key).unwrap();
assert!(result.is_some());

db.close().unwrap();
```
Note: the legacy `GrumpyDb` single-collection wrapper is deprecated in v5 and will be removed in v6. New code should use `Database` (with the `_default` collection if a single collection is enough).
### Multi-collection with secondary indexes

```rust
use grumpydb::{Database, Value};
use uuid::Uuid;

let mut db = Database::open("./data").unwrap();
db.create_collection("users").unwrap();
db.create_index("users", "by_age", "age").unwrap();

let key = Uuid::new_v4();
db.insert("users", key, doc).unwrap();

// Query by index
let results = db.query("users", "by_age", &Value::from(30)).unwrap();

db.close().unwrap();
```
### Thread-safe concurrent access

```rust
use grumpydb::SharedDatabase;
use std::thread;

let db = SharedDatabase::open("./data").unwrap();

// Clone is cheap (Arc), share across threads
let db2 = db.clone();
thread::spawn(move || {
    // use db2 from another thread
    let _ = db2.document_count("users");
});

let count = db.document_count("users").unwrap();
```
## grumpy-repl
An interactive REPL with JavaScript-like syntax, relaxed JSON (unquoted keys, single quotes, trailing commas), and line editing with history.
```shell
# Embedded (no server)
grumpy-repl

# Connected (TCP)
grumpy-repl localhost:6380
```
### Commands
| Category | Commands |
|---|---|
| Database | `use <name>` |
| Collections | `db.createCollection("x")`, `db.dropCollection("x")`, `db.collections()` |
| CRUD | `db.x.insert({...})`, `db.x.get("id")`, `db.x.find()`, `db.x.find({age: 30})`, `db.x.update("id", {...})`, `db.x.delete("id")`, `db.x.count()` |
| Indexes | `db.x.createIndex("name", "field")`, `db.x.query("name", value)`, `db.x.queryRange("name", start, end)`, `db.x.indexes()` |
| References | `$ref("coll", "uuid")`, `db.x.resolve("id")`, `db.x.resolveDeep("id")` |
| Maintenance | `db.x.compact()`, `db.x.stats()`, `db.flush()` |
## Server

### Architecture

```text
Clients (grumpy-repl, Rust driver, TypeScript driver, nc/telnet)
     │
     │ TCP + TLS 1.3 (rustls)
     │ RESP-like text protocol
     │ JWT authentication
     │
┌───▼──────────────────────────────────────────┐
│ GrumpyDB Server │
│ ┌─────────────────────────────────────────┐ │
│ │ TLS · Protocol Parser · RBAC Enforcer │ │
│ └────────────────┬────────────────────────┘ │
│ ┌────────────────▼────────────────────────┐ │
│ │ Auth Store (argon2 + JWT HS256) │ │
│ └────────────────┬────────────────────────┘ │
│ ┌────────────────▼────────────────────────┐ │
│ │ Engine: Tenants · Databases · │ │
│ │ Collections · B+Tree · WAL · Buffer │ │
│ └─────────────────────────────────────────┘ │
└──────────────────────────────────────────────┘
```
Running the server
# Plaintext (dev) — first start REQUIRES --bootstrap-password
# TLS (auto-generates self-signed cert) — first start REQUIRES --bootstrap-password
# With config file
# Subsequent starts: no --bootstrap-password needed once users exist on disk
You can also provide the bootstrap password via the environment variable
GRUMPYDB_BOOTSTRAP_PASSWORD instead of the CLI flag.
### First-start bootstrap

On a brand-new data directory, the server creates a single `_system/admin` user with the password you supplied via `--bootstrap-password` (or `GRUMPYDB_BOOTSTRAP_PASSWORD`). If you start the server without providing one on a clean data directory, it refuses to start with `AuthError::BootstrapRefused` — there is no longer a silent `admin/admin` default.

The auth secret (`<data_dir>/_auth/secret.key`) is created with mode `0600` on Unix; existing files with looser permissions are re-tightened, with a warning logged on startup.
### Configuration (`grumpydb.toml`)

```toml
[server]
bind = "0.0.0.0:6380"
max_connections = 1024
data_dir = "./data"

[tls]
enabled = true
# cert_file = "server.crt"  # auto-generated if absent
# key_file = "server.key"

[auth]
token_ttl_secs = 3600       # 1 hour
refresh_ttl_secs = 604800   # 7 days
```
### User & tenant management

Connect as server admin via `nc localhost 6380`:

```text
LOGIN _system admin <your-bootstrap-password>
TOKEN <jwt>

CREATE TENANT acme
CREATE USER alice@acme s3cr3t
GRANT tenant_admin ON @acme TO alice@acme
LIST TENANTS
LIST USERS @acme
```
### Notation

| Syntax | Meaning |
|---|---|
| `alice` | User `alice` in the current tenant |
| `alice@acme` | User `alice` in tenant `acme` |
| `mydb` | Database (or collection if `USE` is active) |
| `mydb@acme` | Database in tenant `acme` |
| `users:mydb` | Collection `users` in database `mydb` |
| `users:mydb@acme` | Collection in database in tenant |
| `@acme` | Tenant scope (for `GRANT`/`REVOKE`) |
### Consistency and topology protocol (Phase 40f)

The TCP protocol now exposes coordinator and consistency-locking primitives:

- `TOPOLOGY` returns a JSON cluster snapshot for smart clients.
- `READ_CONCERN R=<n>` / `WRITE_CONCERN W=<n>` can prefix data commands.
- `PUT_WITH_VC <collection> <uuid> <json> <vector_clock>` is accepted for reconciled writes (the vector clock is validated as JSON).

In v5, the server is intentionally locked to single-owner consistency (N=1, R=1, W=1):

- Non-default concerns are rejected with `v5 only supports R=1, W=1.`
- If a request targets a key owned by another node, the server returns `forward to <node>@<addr>; not the owner`.
### RBAC roles

| Role | Permissions |
|---|---|
| `server_admin` | Everything (cross-tenant) |
| `tenant_admin` | Manage databases, users, full CRUD within tenant |
| `db_admin` | Manage collections, indexes, CRUD within a database |
| `read_write` | INSERT, GET, UPDATE, DELETE, SCAN, QUERY |
| `read_only` | GET, SCAN, QUERY |
### HTTP endpoints (observability)

The server runs a small HTTP server on a separate port (default `0.0.0.0:6381`) for orchestrators and Prometheus. There is no authentication on these endpoints by design — they are meant for k8s probes and metrics scraping. Set `bind = ""` in the `[http]` section of the config to disable the HTTP server entirely.

```shell
# Liveness — process is up
curl http://localhost:6381/healthz
# (200 OK)

# Readiness — TCP listener has bound
curl http://localhost:6381/readyz
# 200 (or 503 during early startup)

# Prometheus metrics
curl http://localhost:6381/metrics
```
Initial metric catalog (every series is described up-front): `grumpydb_connections_active`, `grumpydb_commands_total{cmd,result}`, `grumpydb_command_duration_seconds{cmd}`, `grumpydb_login_failures_total{reason}`, `grumpydb_rate_limit_hits_total{kind}`, plus `grumpydb_buffer_pool_pages{state}` and `grumpydb_wal_records_total` (the last two are declared in v5 and will begin reporting once the engine grows the corresponding hooks).
## Client Drivers

### Rust (`grumpydb-client`)

```rust
use grumpydb_client::GrumpyClient;
use uuid::Uuid;

let mut client = GrumpyClient::connect("localhost:6380").await?;
client.set_jwks_url("http://localhost:6381/.well-known/jwks.json");
client.login("acme", "alice", "s3cr3t").await?;

let db = client.database("myapp").await?;
let key = Uuid::new_v4();
db.insert("users", key, doc).await?; // `doc`: any JSON-like document value
let doc = db.get("users", key).await?;
```
### TypeScript (`@grumpydb/client`)

```typescript
import { GrumpyClient } from '@grumpydb/client';

const client = await GrumpyClient.connect({
  host: 'localhost', port: 6380, tls: false,
  tenant: 'acme', username: 'alice', password: 's3cr3t',
  jwksUrl: 'http://localhost:6381/.well-known/jwks.json',
});

const db = client.database('myapp');
await db.insert('users', crypto.randomUUID(), { name: 'Bob' });
const doc = await db.get('users', '<uuid>');
await client.close();
```

More TypeScript driver details and examples: `drivers/typescript/README.md`.
## Storage Engine

Under the hood, GrumpyDB is a page-based storage engine:

- 8 KiB pages with slotted layout and overflow chains for large documents
- B+Tree indexes — fixed-key (UUID primary) and variable-key (secondary)
- Write-Ahead Log for crash recovery (before-image undo)
- Buffer pool with LRU eviction and dirty page tracking
- SWMR concurrency — one writer or many readers per database
- Compaction — defragments data pages and rebuilds indexes
- Document references — `$ref("collection", "uuid")` with cycle-safe resolution
### On-disk layout

```text
<data_dir>/
  _auth/                 # JWT secret + user records
  <tenant>/
    <database>/
      wal.log            # Write-Ahead Log
      <collection>/
        data.db          # Slotted pages (documents)
        primary.idx      # B+Tree: UUID → (page, slot)
        idx_<name>.idx   # Secondary B+Tree indexes
```

See `docs/ARCHITECTURE.md` for full technical details.
## Building & Testing
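Assuming a standard Cargo workspace (the repo ships library, server, and REPL crates), the usual commands apply:

```shell
cargo build --release   # build the library, server, and REPL binaries
cargo test              # run the test suite
cargo bench             # criterion benchmarks (see Performance below)
```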
## Demo App

The `examples/taskman/` directory is a complete task manager CLI demonstrating every engine feature:

- Tutorial — 7-chapter guide
- Cookbook — recipes for common patterns
- Performance Guide — buffer pool tuning
## Running with Docker

A docker compose stack ships server + Prometheus + Grafana for local development. Demo only — not production.

```shell
# Set the bootstrap password for the first-start admin user
# (edit .env and pick a strong password)
cp .env.example .env

# Server only
docker compose up server

# Connect with the REPL (uses --profile repl so it's opt-in)
docker compose --profile repl run repl

# Full stack with Prometheus (:9090) + Grafana (:3000, admin/admin)
docker compose --profile monitoring up
```

Multi-arch builds are supported via `docker buildx`.

The server container also exposes the observability HTTP server on port 6381 — `/healthz`, `/readyz`, `/metrics`. Prometheus is pre-configured to scrape it (see `docker/prometheus.yml`); Grafana ships with the Prometheus datasource provisioned (login `admin/admin` on first run).
For v5 migration and clustering demo assets:

- Migration guide: `docs/MIGRATING_4_to_5.md`
- 3-node demo compose: `docker-compose.cluster.yml`
- Cluster smoke test script: `scripts/smoke_cluster.sh`
- Demo node configs: `docker/cluster/node1.toml`, `docker/cluster/node2.toml`, `docker/cluster/node3.toml`
Quick smoke run (uses `GRUMPYDB_BOOTSTRAP_PASSWORD=admin` by default):

```shell
./scripts/smoke_cluster.sh

# override password and keep the cluster up for manual checks:
GRUMPYDB_BOOTSTRAP_PASSWORD=monsecret ./scripts/smoke_cluster.sh
```
## Backup & Restore

The `grumpydb-server` binary ships `snapshot` and `restore` subcommands that produce/consume a single `tar.gz` archive (with a checksummed `snapshot.json` manifest at the root). Local destinations are always available; cloud destinations are gated by Cargo features.

```shell
# Local (no extra features required)
grumpydb-server snapshot ./data backup.tar.gz
grumpydb-server restore backup.tar.gz ./data

# Restore refuses to overwrite a non-empty data dir without --force:
grumpydb-server restore backup.tar.gz ./data --force

# AWS S3 (requires --features cloud-aws; uses the standard AWS credential chain)
grumpydb-server snapshot ./data s3://bucket/backups/backup.tar.gz

# Azure Blob (requires --features cloud-azure; uses DefaultAzureCredential
# or AZURE_STORAGE_CONNECTION_STRING)
grumpydb-server snapshot ./data az://container/backups/backup.tar.gz
```

v5 semantics:

- `snapshot` holds the database write lock for the duration of the file copy (writers block, readers continue). MVCC in v6 will offer point-in-time consistency without blocking writers.
- `restore` verifies every file's SHA-256 against the manifest and aborts on mismatch.
## Performance

Headline numbers from `cargo bench --bench engine --bench protocol -- --quick` on a MacBook Pro (Apple Silicon, default build profile, debug-assertions off, single-threaded synchronous workload). Reproduce with `cargo bench`.
| Operation | Throughput |
|---|---|
| INSERT small doc (~50 B) | ~235 ops/s |
| INSERT medium doc (~500 B) | ~234 ops/s |
| INSERT large doc (4 KB, overflow) | ~225 ops/s |
| GET by UUID (warm buffer pool) | ~223 K ops/s |
| GET by UUID (cold reopen) | ~217 K ops/s |
| SCAN full collection (10 K docs) | ~2.42 M docs/s |
| Index exact-match query | ~17.7 K ops/s |
| Index range query (~50-key window) | ~836 ranges/s |
| Protocol — parse simple command | ~11.7 M ops/s |
| Protocol — parse 1 KB INSERT | ~6.5 GiB/s |
| Protocol — serialize 100-bulk array | ~9.2 M elem/s |
Each INSERT performs a WAL write + fsync, which dominates write throughput; batching multiple writes into a single transaction (planned in v5) is expected to lift this by ~10×. Reads after the first warm-up are served from the buffer pool.
Full HTML reports land in `target/criterion/report/index.html` after running `cargo bench`.
## License

Licensed under either of:

- MIT license (LICENSE-MIT)
- Apache License, Version 2.0 (LICENSE-APACHE)

at your option.