vecdb 0.5.7

High-performance mutable persistent vectors built on rawdb

Features

  • Vec-like API: push, update, truncate, delete by index with sparse holes
  • Multiple storage formats:
    • Raw: BytesVec, ZeroCopyVec (uncompressed)
    • Compressed: PcoVec, LZ4Vec, ZstdVec
  • Computed vectors: EagerVec (stored computations), LazyVecFrom1/2/3 (on-the-fly computation)
  • Rollback support: Time-travel via stamped change deltas without full snapshots
  • Sparse deletions: Delete elements leaving holes, no reindexing required
  • Thread-safe: Concurrent reads with exclusive writes
  • Fast: see the vecdb_bench benchmarks
  • Lazy persistence: Changes buffered in memory, persisted only on explicit flush()
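
The "concurrent reads with exclusive writes" model can be illustrated with the standard library alone. This README does not describe vecdb's own locking, so the sketch below is an assumption-laden illustration built on std::sync::RwLock; the function name read_concurrently_then_write is hypothetical.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical helper (not part of vecdb): sums the shared values from
/// `readers` threads at once, then appends one value under a write lock.
pub fn read_concurrently_then_write(
    values: Vec<u64>,
    readers: usize,
    appended: u64,
) -> (Vec<u64>, Vec<u64>) {
    let shared = Arc::new(RwLock::new(values));

    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Any number of read guards may be held at the same time.
                let guard = shared.read().expect("lock poisoned");
                guard.iter().sum::<u64>()
            })
        })
        .collect();

    let sums: Vec<u64> = handles
        .into_iter()
        .map(|handle| handle.join().expect("reader panicked"))
        .collect();

    // A write guard is exclusive: it waits until no reader holds the lock.
    shared.write().expect("lock poisoned").push(appended);

    let final_values = shared.read().expect("lock poisoned").clone();
    (sums, final_values)
}
```

Every reader sees the same data, and the append becomes visible only after the writer has had the lock to itself.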

Not Suited For

  • Key-value storage - Use fjall or redb
  • Variable-sized types - Types like String, Vec<T>, or dynamic structures
  • ACID transactions - No transactional guarantees (use explicit rollback instead)

Install

cargo add vecdb

Quick Start

use vecdb::{
    AnyStoredVec, BytesVec, Database, GenericStoredVec,
    ImportableVec, Result, Version
};
use std::path::Path;

fn main() -> Result<()> {
    // Open database
    let db = Database::open(Path::new("data"))?;

    // Create vector with index type usize and value type u64
    let mut vec: BytesVec<usize, u64> =
        BytesVec::import(&db, "my_vec", Version::TWO)?;

    // Push values (buffered in memory)
    for i in 0..1_000_000 {
        vec.push(i);
    }

    // Write buffered values to the rawdb region and sync them to disk
    vec.flush()?;  // Calls write() internally then flushes region
    db.flush()?;   // Syncs database metadata

    // Sequential iteration
    let mut sum = 0u64;
    for value in vec.iter()? {
        sum = sum.wrapping_add(value);
    }

    // Random access
    let reader = vec.create_reader();
    for i in [500, 1000, 10] {
        if let Ok(value) = vec.read_at(i, &reader) {
            println!("vec[{}] = {}", i, value);
        }
    }

    Ok(())
}

Type Constraints

vecdb works with fixed-size types:

  • Numeric primitives: u8, i32, f64, etc.
  • Fixed arrays: [T; N]
  • Structs with #[repr(C)]
  • Types implementing zerocopy::FromBytes + zerocopy::AsBytes (for ZeroCopyVec)
  • Types implementing Bytes trait (for BytesVec, LZ4Vec, ZstdVec)
  • Numeric types implementing Pco trait (for PcoVec)

Use #[derive(Bytes)] or #[derive(Pco)] from vecdb_derive (enabled by the derive feature) to support custom wrapper types.
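
One likely reason for the fixed-size requirement is that element i of a densely packed vector sits at a byte offset that can be computed without scanning. The sketch below uses only the standard library; Tick and byte_offset are hypothetical names, and vecdb's actual on-disk layout (headers, compression pages) is not described here and may differ.

```rust
use std::mem::size_of;

/// Hypothetical record type. `#[repr(C)]` pins the field order and layout,
/// so the size is stable across compilations.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Tick {
    pub timestamp: u64,
    pub price: f64,
}

/// Byte offset of element `index` in a densely packed array of `T`.
/// This only works because every `T` occupies the same number of bytes.
pub fn byte_offset<T>(index: usize) -> usize {
    index * size_of::<T>()
}
```

A String or Vec<T> has no such fixed footprint for its contents, which is why variable-sized types are listed as unsupported.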

Vector Variants

Raw (Uncompressed)

BytesVec<I, T> - Custom serialization via Bytes trait

// Assumes `db` is an open Database, as in Quick Start
use vecdb::{Bytes, BytesVec, ImportableVec, Version};

#[derive(Bytes)]
struct UserId(u64);

let mut vec: BytesVec<usize, UserId> =
    BytesVec::import(&db, "users", Version::TWO)?;

ZeroCopyVec<I, T> - Zero-copy mmap access (fastest random reads)

use vecdb::{ImportableVec, Version, ZeroCopyVec};

let mut vec: ZeroCopyVec<usize, u32> =
    ZeroCopyVec::import(&db, "raw", Version::TWO)?;

Compressed

PcoVec<I, T> - Pcodec compression (best for numeric data, excellent compression ratios)

use vecdb::{ImportableVec, PcoVec, Version};

let mut vec: PcoVec<usize, f64> =
    PcoVec::import(&db, "prices", Version::TWO)?;

LZ4Vec<I, T> - LZ4 compression (fast, general-purpose)

use vecdb::{ImportableVec, LZ4Vec, Version};

let mut vec: LZ4Vec<usize, [u8; 16]> =
    LZ4Vec::import(&db, "hashes", Version::TWO)?;

ZstdVec<I, T> - Zstd compression (high compression ratio, general-purpose)

use vecdb::{ImportableVec, Version, ZstdVec};

let mut vec: ZstdVec<usize, u64> =
    ZstdVec::import(&db, "data", Version::TWO)?;

Computed Vectors

EagerVec<V> - Wraps any stored vector to enable eager computation methods

Stores computed results on disk, incrementally updating when source data changes. Use for derived metrics, aggregations, transformations, moving averages, etc.

use vecdb::{BytesVec, EagerVec, ImportableVec, Version};

let mut derived: EagerVec<BytesVec<usize, f64>> =
    EagerVec::import(&db, "derived", Version::TWO)?;

// Compute methods store results on disk
// derived.compute_add(&source1, &source2)?;
// derived.compute_sma(&source, 20)?;
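
To make "incrementally updating when source data changes" concrete, here is a standalone sketch of a derived moving average that only computes the entries it has not stored yet. It uses plain Vecs rather than vecdb types; extend_sma is a hypothetical name, and averaging over a shorter window at the start of the series is an assumption that may not match the compute_sma method mentioned in the comment above.

```rust
/// Hypothetical helper (not vecdb's API): appends the missing entries of a
/// trailing simple moving average to `derived`, resuming where the previous
/// call stopped. Entry `i` is the mean of the last `window` source values
/// ending at `i` (fewer when `i + 1 < window`, an assumption of this sketch).
pub fn extend_sma(derived: &mut Vec<f64>, source: &[f64], window: usize) {
    assert!(window > 0, "window must be non-zero");
    for i in derived.len()..source.len() {
        let start = (i + 1).saturating_sub(window);
        let slice = &source[start..=i];
        derived.push(slice.iter().sum::<f64>() / slice.len() as f64);
    }
}
```

Calling extend_sma again after the source grows touches only the new tail, which is the property that makes storing derived results worthwhile for append-mostly data.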

LazyVecFrom1/2/3<...> - Lazily computed vectors from 1-3 source vectors

Values computed on-the-fly during iteration, nothing stored on disk. Use for temporary views or simple transformations.

use vecdb::{LazyVecFrom1, Version};

let lazy = LazyVecFrom1::init(
    "computed",
    Version::TWO,
    source.boxed(),
    |i, source_iter| source_iter.get(i).map(|v| v * 2)
);

// Computed during iteration, not stored
for value in lazy.iter() {
    // ...
}
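
The lazy approach can be sketched without vecdb as a view that holds a reference to its source plus a closure, and applies the closure on every read. LazyView is a hypothetical type written for this illustration; it is not how LazyVecFrom1 is implemented.

```rust
/// Hypothetical read-only view (not a vecdb type) that derives each value
/// from `source` at access time. Nothing is cached or stored.
pub struct LazyView<'a, S, F> {
    source: &'a [S],
    compute: F,
}

impl<'a, S, T, F> LazyView<'a, S, F>
where
    S: Copy,
    F: Fn(S) -> T,
{
    pub fn new(source: &'a [S], compute: F) -> Self {
        Self { source, compute }
    }

    /// The view is exactly as long as its source.
    pub fn len(&self) -> usize {
        self.source.len()
    }

    /// Computes the value at `index` on demand; `None` if out of range.
    pub fn get(&self, index: usize) -> Option<T> {
        self.source.get(index).map(|&value| (self.compute)(value))
    }
}
```

The trade-off against a stored computation is the usual one: no disk space and no staleness, at the price of recomputing on every access.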

Core Operations

Write and Persistence

// Push values (buffered in memory)
vec.push(42);
vec.push(100);

// write() moves pushed values to storage (visible for reads)
vec.write()?;

// flush() calls write() + region().flush() for durability
vec.flush()?;
db.flush()?;   // Also flush database metadata
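
The three stages in the comments above (buffered, written, durable) can be pictured with a toy in-memory model. BufferedVec is hypothetical and keeps everything in RAM: the stored field stands in for the storage region and durable_len for what a sync would have made durable. It is a sketch of the staging idea, not of vecdb's internals.

```rust
/// Toy model (not a vecdb type) of push -> write -> flush staging.
#[derive(Default)]
pub struct BufferedVec {
    /// Stand-in for values already moved to the storage region.
    stored: Vec<u64>,
    /// Values pushed but not yet written.
    pushed: Vec<u64>,
    /// How many stored values a flush has marked as durable.
    durable_len: usize,
}

impl BufferedVec {
    /// Buffers a value in memory only.
    pub fn push(&mut self, value: u64) {
        self.pushed.push(value);
    }

    /// Moves all buffered values into storage, leaving the buffer empty.
    pub fn write(&mut self) {
        self.stored.append(&mut self.pushed);
    }

    /// Writes first, then marks everything in storage as durable.
    pub fn flush(&mut self) {
        self.write();
        self.durable_len = self.stored.len();
    }

    pub fn pushed_len(&self) -> usize {
        self.pushed.len()
    }

    pub fn stored_len(&self) -> usize {
        self.stored.len()
    }

    pub fn durable_len(&self) -> usize {
        self.durable_len
    }
}
```

In this model a crash between write() and flush() would lose nothing from storage but guarantee nothing either, which is why the durability step is kept explicit.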

Updates and Deletions

// Update element at index (works on stored data)
vec.update(5, 999)?;

// Delete element (creates a hole at that index)
let reader = vec.create_reader();
vec.take(10, &reader)?;
drop(reader);

// Holes are tracked and can be checked
if vec.holes().contains(&10) {
    println!("Index 10 is a hole");
}

// Reading a hole returns None
let reader = vec.create_reader();
assert_eq!(vec.get_any_or_read(10, &reader)?, None);
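
Deleting without reindexing amounts to remembering which indices are holes while leaving every other element where it is. The following standalone sketch shows that bookkeeping with a BTreeSet; HoleyVec and its methods are hypothetical names, and the real data structure vecdb uses for holes is an assumption here.

```rust
use std::collections::BTreeSet;

/// Hypothetical sparse-deletion model (not a vecdb type): slots never move,
/// deleted indices are recorded in `holes`.
pub struct HoleyVec<T> {
    slots: Vec<T>,
    holes: BTreeSet<usize>,
}

impl<T: Clone> HoleyVec<T> {
    pub fn from_vec(slots: Vec<T>) -> Self {
        Self { slots, holes: BTreeSet::new() }
    }

    /// Removes the element at `index`, leaving a hole. Returns `None` if the
    /// index is out of range or already a hole.
    pub fn take(&mut self, index: usize) -> Option<T> {
        if index >= self.slots.len() || !self.holes.insert(index) {
            return None;
        }
        Some(self.slots[index].clone())
    }

    /// Reads an element; holes and out-of-range indices both yield `None`.
    pub fn get(&self, index: usize) -> Option<&T> {
        if self.holes.contains(&index) {
            None
        } else {
            self.slots.get(index)
        }
    }

    pub fn holes(&self) -> &BTreeSet<usize> {
        &self.holes
    }
}
```

Because no element shifts, any index stored elsewhere (for example in another vector) stays valid after a deletion.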

Rollback with Stamps

Rollback uses stamped change deltas - lightweight compared to full snapshots.

use vecdb::Stamp;

// Create initial state
vec.push(100);
vec.push(200);
vec.stamped_write_with_changes(Stamp::new(1))?;

// Make more changes
vec.push(300);
vec.update(0, 999)?;
vec.stamped_write_with_changes(Stamp::new(2))?;

// Rollback to previous stamp (undoes changes from stamp 2)
vec.rollback()?;
assert_eq!(vec.stamp(), Stamp::new(1));

// Rollback before a stamp (undoes everything including stamp 1)
vec.rollback_before(Stamp::new(1))?;
assert_eq!(vec.stamp(), Stamp::new(0));

Configure number of stamps to keep:

let options = (&db, "vec", Version::TWO)
    .into()
    .with_saved_stamped_changes(10);  // Keep last 10 stamps
let vec = BytesVec::import_with(options)?;
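
To show why change deltas are lighter than snapshots, the sketch below records, per stamped write, only the previous length and the old values of overwritten elements, and keeps a bounded number of such deltas. StampedVec and ChangeDelta are hypothetical, stamps are plain u64 values, and everything lives in memory; vecdb's actual delta format is not described in this README.

```rust
use std::collections::VecDeque;

/// What is needed to undo one stamped write (hypothetical format).
struct ChangeDelta {
    prev_stamp: u64,
    prev_len: usize,
    /// (index, old value) for each update to a pre-existing element.
    overwritten: Vec<(usize, u64)>,
}

/// In-memory model of stamped rollback (not a vecdb type).
pub struct StampedVec {
    data: Vec<u64>,
    stamp: u64,
    len_at_stamp: usize,
    pending: Vec<(usize, u64)>,
    saved: VecDeque<ChangeDelta>,
    max_saved: usize,
}

impl StampedVec {
    pub fn new(max_saved: usize) -> Self {
        Self { data: Vec::new(), stamp: 0, len_at_stamp: 0,
               pending: Vec::new(), saved: VecDeque::new(), max_saved }
    }

    pub fn push(&mut self, value: u64) {
        self.data.push(value);
    }

    pub fn update(&mut self, index: usize, value: u64) {
        let old = std::mem::replace(&mut self.data[index], value);
        // Elements pushed since the last stamp vanish on truncate,
        // so only older elements need an undo record.
        if index < self.len_at_stamp {
            self.pending.push((index, old));
        }
    }

    /// Seals the changes made since the last stamp into one delta.
    pub fn stamped_write(&mut self, stamp: u64) {
        self.saved.push_back(ChangeDelta {
            prev_stamp: self.stamp,
            prev_len: self.len_at_stamp,
            overwritten: std::mem::take(&mut self.pending),
        });
        if self.saved.len() > self.max_saved {
            self.saved.pop_front(); // forget the oldest delta
        }
        self.stamp = stamp;
        self.len_at_stamp = self.data.len();
    }

    /// Undoes unstamped changes and the latest stamped write.
    /// Returns false when no delta is saved.
    pub fn rollback(&mut self) -> bool {
        let Some(delta) = self.saved.pop_back() else { return false };
        // Newest first, so the oldest recorded value is applied last.
        for (index, old) in std::mem::take(&mut self.pending).into_iter().rev() {
            self.data[index] = old;
        }
        self.data.truncate(delta.prev_len);
        for (index, old) in delta.overwritten.into_iter().rev() {
            self.data[index] = old;
        }
        self.stamp = delta.prev_stamp;
        self.len_at_stamp = delta.prev_len;
        true
    }

    pub fn data(&self) -> &[u64] {
        &self.data
    }

    pub fn stamp(&self) -> u64 {
        self.stamp
    }
}
```

The max_saved bound plays the role of the retention setting above: once a delta has been dropped, the model can no longer roll back past it.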

When To Use

Perfect for:

  • Storing large Vecs persistently on disk
  • Append-only or append-mostly workloads
  • High-speed sequential reads
  • High-speed random reads (improved with ZeroCopyVec)
  • Space-efficient storage for numeric time series (improved with PcoVec)
  • Sparse deletions without reindexing
  • Lightweight rollback without full snapshots
  • Derived computations stored on disk (with EagerVec)

Not ideal for:

  • Heavy random write workloads
  • Frequent insertions in the middle
  • Variable-length data (strings, nested vectors)
  • ACID transaction requirements
  • Key-value lookups (use a proper key-value store)

Feature Flags

No features are enabled by default. Enable only what you need:

cargo add vecdb  # BytesVec only, no compression or optional features

Available features:

  • pco - Pcodec compression support (PcoVec)
  • zerocopy - Zero-copy mmap access (ZeroCopyVec)
  • lz4 - LZ4 compression support (LZ4Vec)
  • zstd - Zstd compression support (ZstdVec)
  • derive - Derive macros for Bytes and Pco traits
  • serde - Serde serialization support
  • serde_json - JSON output using serde_json
  • sonic-rs - Faster JSON using sonic-rs

With Pcodec compression:

cargo add vecdb --features pco,derive

With all compression formats:

cargo add vecdb --features pco,zerocopy,lz4,zstd,derive

Examples

Comprehensive examples are available in the examples/ directory.

Run examples:

cargo run --example zerocopy --features zerocopy
cargo run --example pcodec --features pco

Performance

See vecdb_bench for detailed benchmarks.

vecdb is significantly faster than general-purpose embedded databases for fixed-size data workloads.