bison-db 1.0.0

An embedded, document-oriented database for Rust - schemaless documents, secondary indexes, and ACID single-file storage, with zero network and zero external services.

Available now (v1.0.0):

  • Schemaless documents — store nested, JSON-like documents with no fixed schema, built from a small, typed Value model
  • Single-file storage — the whole database is one file: trivial to ship, copy, and back up
  • Crash-safe writes — every record is length-framed and CRC-32C checked; a write torn by a crash is detected and dropped on the next open, never silently misread
  • Configurable durability — fsync on every write, or batch and sync on flush; either way the file is never left corrupt
  • Compaction — compact reclaims the space left by overwrites and deletes via a crash-safe atomic file swap
  • Concurrency — single-writer, multi-reader; Db is Send + Sync, so share it across threads behind an Arc<RwLock<Db>>
  • Frozen on-disk format — the format is stable (version 1); files written by 0.2.0 onward stay readable
  • Embedded, zero-network — runs in-process; no server, no daemon, no external services
  • Point operations — insert, get, update, and delete documents by id, plus flush for durability
  • Secondary indexes — index any number of document fields; queries also work without an index, so an index is a pure speedup
  • Field and range queries — find by an exact field value, range over an ordered field
  • Optional serde — move documents in and out of JSON, MessagePack, or any serde format

Possible future directions (post-1.0, additive only — the 1.0 API and format do not change):

  • Read cache / memory-mapped reads — close the modest point-read gap to memory-mapped engines
  • Persistent / lazily-rebuilt indexes — avoid re-declaring indexes after reopening, via a sidecar file

Installation

[dependencies]
bison-db = "1.0"

# With serde support for the document model:
bison-db = { version = "1.0", features = ["serde"] }

Quick Start

use bison_db::{Db, Document};

fn main() -> bison_db::Result<()> {
    // The whole database is a single file.
    let mut db = Db::open("library.bison")?;

    // Schemaless: set whatever fields you like, of mixed types.
    let mut album = Document::new();
    album.set("artist", "Miles Davis").set("title", "Kind of Blue").set("year", 1959_i64);

    // Insert returns a stable id; read, overwrite, and delete by it.
    let id = db.insert(album)?;
    let stored = db.get(id)?.expect("just inserted");
    assert_eq!(stored.get("title").and_then(|v| v.as_str()), Some("Kind of Blue"));

    db.update(id, { let mut d = Document::new(); d.set("title", "So What"); d })?;
    assert!(db.delete(id)?);

    db.flush()?; // make recent writes durable
    Ok(())
}

More runnable programs live in examples/: quick_start, user_profiles (CRUD with nested documents), secondary_indexes, durability, compaction, session_store (a realistic indexed store), crash_recovery, and json_interop.

cargo run --example session_store
cargo run --example secondary_indexes
cargo run --example durability
cargo run --example compaction
cargo run --example crash_recovery
cargo run --example json_interop --features serde

Querying

Index any number of fields, then query by exact value or by range. Queries work with or without an index — declaring one only makes them faster.

use bison_db::{Db, Document, Value};

fn main() -> bison_db::Result<()> {
    let mut db = Db::open("people.bison")?;

    for (name, age) in [("ada", 36_i64), ("grace", 45), ("alan", 29)] {
        let mut d = Document::new();
        d.set("name", name).set("age", age);
        db.insert(d)?;
    }

    // Build indexes — there is no cap on how many fields you index.
    db.create_index("name")?;
    db.create_index("age")?;

    // Equality: who is named "ada"?
    let ada = db.find("name", &Value::from("ada"))?;        // Vec<DocId>

    // Range: everyone aged 30..=44 (results ordered by age).
    let thirties_forties = db.range("age", Value::from(30_i64)..=Value::from(44_i64))?;

    assert_eq!(ada.len(), 1);
    assert_eq!(thirties_forties.len(), 1); // ada (36)
    Ok(())
}

Durability

Choose how durable writes are when you open the store. The default, SyncPolicy::Manual, is fastest: writes are crash-safe (a torn write is never misread), but a power loss can lose the most recent writes that were never flushed. SyncPolicy::Always fsyncs after every write, so each one is durable the moment it returns.

use bison_db::{Db, DbOptions, Document, SyncPolicy};

fn main() -> bison_db::Result<()> {
    // Durable per write — every insert/update/delete fsyncs before returning.
    let mut db = Db::open_with("ledger.bison", DbOptions::new().sync(SyncPolicy::Always))?;
    db.insert({ let mut d = Document::new(); d.set("entry", "debit 100"); d })?;
    // No explicit flush needed under Always.
    Ok(())
}

Either way the file is never left corrupt: every record is CRC-checked, a crash-torn write at the tail is truncated on the next open, and the on-disk format is frozen and versioned. See docs/FORMAT.md for the byte-level layout.
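To make the length-framing and CRC check concrete, here is a minimal std-only sketch of how a torn tail can be detected and dropped. The frame layout (a 4-byte little-endian length, a 4-byte CRC-32C, then the payload) and the names crc32c, append_record, and recover are assumptions made for illustration. The real layout is whatever docs/FORMAT.md specifies.

```rust
// Hypothetical frame layout, for illustration only (not bison-db's real format):
//   [payload length: u32 LE][CRC-32C of payload: u32 LE][payload bytes]

/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Read a little-endian u32 starting at `pos` (caller guarantees 4 bytes exist).
fn read_u32(buf: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]])
}

/// Append one framed record to the in-memory log.
fn append_record(log: &mut Vec<u8>, payload: &[u8]) {
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(&crc32c(payload).to_le_bytes());
    log.extend_from_slice(payload);
}

/// Scan from the start and stop at the first short or corrupt frame.
/// Returns the intact payloads and the byte length of the valid prefix.
fn recover(log: &[u8]) -> (Vec<Vec<u8>>, usize) {
    let mut records = Vec::new();
    let mut pos = 0;
    while log.len() - pos >= 8 {
        let len = read_u32(log, pos) as usize;
        let expected_crc = read_u32(log, pos + 4);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(end) if end <= log.len() => end,
            _ => break, // frame claims more bytes than the file holds: torn write
        };
        if crc32c(&log[start..end]) != expected_crc {
            break; // payload bytes do not match their checksum
        }
        records.push(log[start..end].to_vec());
        pos = end;
    }
    (records, pos)
}

fn main() {
    let mut log = Vec::new();
    append_record(&mut log, b"first");
    append_record(&mut log, b"second");
    let intact_len = log.len();

    // Simulate a crash partway through the third write.
    append_record(&mut log, b"third");
    let torn_len = log.len() - 2;
    log.truncate(torn_len);

    let (records, valid_len) = recover(&log);
    assert_eq!(records.len(), 2);
    assert_eq!(valid_len, intact_len);
    log.truncate(valid_len); // drop the torn tail, as a reopen would
}
```

The point of the design is that recovery never has to guess: a frame is either complete and checksum-clean, or everything from that offset onward is discarded.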

Reclaiming space

The store is append-only, so overwrites and deletes leave dead records behind. compact rewrites the file with one record per live document and swaps it in atomically. Document ids and secondary indexes are preserved.

# fn main() -> bison_db::Result<()> {
# let mut db = bison_db::Db::open(std::env::temp_dir().join("readme_compact.bison"))?;
let before = db.stats().file_bytes;
db.compact()?;
let after = db.stats().file_bytes; // smaller, with the same live data
# assert!(after <= before);
# Ok(())
# }
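The general technique behind a compaction like this can be sketched with the standard library alone: replay the append-only log keeping the last write per id, write the survivors to a temporary file, sync it, then rename it over the original. Everything here (the Entry type, the tab-separated line format, the .compact.tmp name, and the function names) is a hypothetical stand-in, not bison-db's implementation.

```rust
use std::collections::BTreeMap;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// One log entry: `Some(text)` is an insert or overwrite, `None` is a delete.
type Entry = (u64, Option<String>);

/// Replay the log and keep only the latest surviving text per id.
fn live_records(log: &[Entry]) -> BTreeMap<u64, String> {
    let mut live = BTreeMap::new();
    for (id, body) in log {
        match body {
            Some(text) => {
                live.insert(*id, text.clone());
            }
            None => {
                live.remove(id);
            }
        }
    }
    live
}

/// Write one line per live record to a temp file, sync it, then rename it
/// over `path`. A crash before the rename leaves the old file untouched.
fn compact_to_file(path: &Path, log: &[Entry]) -> io::Result<()> {
    let tmp = path.with_extension("compact.tmp"); // hypothetical temp name
    {
        let mut file = File::create(&tmp)?;
        for (id, text) in live_records(log) {
            writeln!(file, "{}\t{}", id, text)?;
        }
        file.sync_all()?; // new contents reach disk before they become visible
    }
    // The rename replaces the destination; its atomicity is a POSIX guarantee.
    // A production version would also sync the parent directory.
    fs::rename(&tmp, path)
}

fn main() -> io::Result<()> {
    let log: Vec<Entry> = vec![
        (1, Some("draft".to_string())),
        (2, Some("temp".to_string())),
        (1, Some("final".to_string())), // overwrites id 1
        (2, None),                      // deletes id 2
    ];
    let name = format!("compact_sketch_{}.log", std::process::id());
    let path = std::env::temp_dir().join(name);

    // The uncompacted file holds every entry ever appended, dead ones included.
    fs::write(&path, "1\tdraft\n2\ttemp\n1\tfinal\n2\t\n")?;
    let before = fs::metadata(&path)?.len();

    compact_to_file(&path, &log)?;
    let after = fs::metadata(&path)?.len();

    assert!(after < before);
    assert_eq!(fs::read_to_string(&path)?, "1\tfinal\n");
    fs::remove_file(&path)
}
```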

Concurrency

Db follows a single-writer, multi-reader model, like an embedded SQL engine: reads take &self, writes take &mut self. Db is Send + Sync, so the idiomatic way to share one across threads is an Arc<RwLock<Db>> — many readers concurrently, or one exclusive writer.

use std::sync::{Arc, RwLock};
use bison_db::{Db, Document};

# fn main() -> bison_db::Result<()> {
let db = Arc::new(RwLock::new(Db::open("shared.bison")?));

// Writer:
db.write().unwrap().insert({ let mut d = Document::new(); d.set("k", 1_i64); d })?;

// Reader on another thread:
let snapshot = Arc::clone(&db);
std::thread::spawn(move || {
    let guard = snapshot.read().unwrap();
    let _ = guard.len();
});
# Ok(())
# }
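For a version of the same pattern that compiles without the crate and joins its reader threads, the sketch below swaps Db for a hypothetical in-memory Store. Only Arc, RwLock, and thread are real APIs here; Store and read_len_from_threads are made up for the example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical in-memory stand-in for a store: reads take &self,
/// writes take &mut self, mirroring the single-writer, multi-reader split.
#[derive(Default)]
struct Store {
    docs: HashMap<u64, String>,
    next_id: u64,
}

impl Store {
    fn insert(&mut self, body: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.docs.insert(id, body.to_string());
        id
    }

    fn len(&self) -> usize {
        self.docs.len()
    }
}

/// Insert two documents under the write lock, then read the length
/// from `readers` threads, each holding only a shared read lock.
fn read_len_from_threads(readers: usize) -> Vec<usize> {
    let store = Arc::new(RwLock::new(Store::default()));

    {
        // Exclusive access: the guard is dropped at the end of this scope.
        let mut writer = store.write().unwrap();
        writer.insert("a");
        writer.insert("b");
    }

    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let shared = Arc::clone(&store);
            thread::spawn(move || {
                let guard = shared.read().unwrap();
                guard.len()
            })
        })
        .collect();

    // Joining makes sure every reader finished before results are used.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    assert_eq!(read_len_from_threads(4), vec![2, 2, 2, 2]);
}
```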

API Overview

For the complete reference, see docs/API.md.

  • Db / DbOptions — open the store (with a durability policy); insert / get / update / delete / flush / compact; create_index / find / range
  • Document — the ordered, schemaless record you store
  • Value — a field's content: null, bool, int, float, string, bytes, array, or nested document
  • DocId — a document's stable primary key
  • Error — the closed set of failures an operation can return

Performance

bison-db is an in-process store: there is no network hop and no client/server serialization. Writes are appended sequentially to one file (the access pattern storage hardware serves fastest), and a read is a single hash-index lookup followed by one positional read. Indicative single-threaded figures from cargo bench on a developer laptop (Linux, x86_64, Rust 1.95):

Operation                                     Time
insert a small document                       ~0.9 µs
get a small document                          ~0.36 µs
update a small document                       ~0.6 µs
find by indexed field (in a 10k-doc store)    ~65 ns
find by full scan (in a 10k-doc store)        ~1.8 ms

The last two rows are the same query with and without an index — the index turns a full scan into a B-tree point lookup (~27,000× here). Numbers are produced by benches/bison_bench.rs against a real on-disk store; reproduce them with cargo bench.
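The difference between the two query paths can be illustrated with a small std-only model. Here std's BTreeMap stands in for the secondary index, and Person, build_age_index, scan_by_age, find_by_age, and range_by_age are hypothetical names. It is a sketch of the idea, not of the crate's internals.

```rust
use std::collections::BTreeMap;

/// Hypothetical record with one indexable field.
struct Person {
    id: u64,
    age: i64,
}

/// Without an index: examine every record, O(n) per query.
fn scan_by_age(people: &[Person], age: i64) -> Vec<u64> {
    people.iter().filter(|p| p.age == age).map(|p| p.id).collect()
}

/// Build an ordered map from field value to the ids that carry it.
fn build_age_index(people: &[Person]) -> BTreeMap<i64, Vec<u64>> {
    let mut index = BTreeMap::new();
    for p in people {
        index.entry(p.age).or_insert_with(Vec::new).push(p.id);
    }
    index
}

/// With an index: one ordered-map lookup, O(log n) per query.
fn find_by_age(index: &BTreeMap<i64, Vec<u64>>, age: i64) -> Vec<u64> {
    index.get(&age).cloned().unwrap_or_default()
}

/// Range query over the ordered keys; results come back sorted by age.
fn range_by_age(index: &BTreeMap<i64, Vec<u64>>, lo: i64, hi: i64) -> Vec<u64> {
    if lo > hi {
        return Vec::new(); // BTreeMap::range panics on an inverted range
    }
    index
        .range(lo..=hi)
        .flat_map(|(_, ids)| ids.iter().copied())
        .collect()
}

fn main() {
    let people = vec![
        Person { id: 1, age: 36 },
        Person { id: 2, age: 45 },
        Person { id: 3, age: 29 },
    ];
    let index = build_age_index(&people);

    // Both paths return the same answer; only the cost differs.
    assert_eq!(scan_by_age(&people, 36), find_by_age(&index, 36));
    assert_eq!(range_by_age(&index, 30, 44), vec![1]);
}
```

Because both paths return the same ids, an index in this model changes only the cost of a query, never its result.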

In a controlled head-to-head against redb (a pure-Rust ACID embedded engine), bison-db is ~1.85× faster on bulk inserts and ~35% smaller on disk, while redb is ~1.3× faster on point reads — an honest split, with bison-db carrying a richer document model on top. See docs/PERFORMANCE.md for the full method, environment, durability cost, and the reproducible harness in benchmarks/.

Where It Fits

bison-db composes the storage primitives into a document store. It builds on:

  • wal-db — durable write-ahead logging and crash recovery
  • index-db — B+tree secondary indexes over document fields
  • page-db — fixed-size paged storage substrate

Above it sit applications: any Rust app needing a local document store with no server.

It is the first crate in the Bison embedded-database family.

Cross-Platform Support

Linux (x86_64, aarch64), macOS (x86_64, Apple Silicon), and Windows (x86_64) are first-class and verified by the CI matrix.

Contributing

See CONTRIBUTING.md and dev/DIRECTIVES.md. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.