§vecdb
High-performance mutable persistent vectors built on rawdb.
§Features
- Vec-like API: push, update, truncate, delete by index with sparse holes
- Multiple storage formats:
  - Raw: BytesVec, ZeroCopyVec (uncompressed)
  - Compressed: PcoVec, LZ4Vec, ZstdVec
- Computed vectors: EagerVec (stored computations), LazyVecFrom1/2/3 (on-the-fly computation)
- Rollback support: Time-travel via stamped change deltas without full snapshots
- Sparse deletions: Delete elements leaving holes, no reindexing required
- Thread-safe: Concurrent reads with exclusive writes
- Blazing fast: See benchmarks
- Lazy persistence: Changes buffered in memory, persisted only on explicit flush()
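The sparse-deletion bullet above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: SparseVec and its methods are hypothetical names invented for illustration, and nothing here reflects how vecdb actually stores holes on disk. It shows why no reindexing is needed: a deleted index is recorded in a set, and every other element keeps its position.

```rust
use std::collections::BTreeSet;

/// Toy in-memory model of a vector with sparse deletions.
/// Hypothetical type for illustration; not vecdb's implementation.
struct SparseVec<T> {
    values: Vec<T>,
    holes: BTreeSet<usize>,
}

impl<T: Clone> SparseVec<T> {
    fn new() -> Self {
        Self { values: Vec::new(), holes: BTreeSet::new() }
    }

    /// Appends a value and returns the index it was stored at.
    fn push(&mut self, value: T) -> usize {
        self.values.push(value);
        self.values.len() - 1
    }

    /// Marks `index` as a hole and returns the value that was there.
    /// Returns `None` if the index is out of range or already a hole.
    fn take(&mut self, index: usize) -> Option<T> {
        if index >= self.values.len() || !self.holes.insert(index) {
            return None;
        }
        Some(self.values[index].clone())
    }

    /// Reads a value; holes and out-of-range indices yield `None`.
    fn get(&self, index: usize) -> Option<&T> {
        if self.holes.contains(&index) {
            None
        } else {
            self.values.get(index)
        }
    }
}
```

In this model, taking index 1 out of three pushed values leaves index 2 readable at the same position, and the underlying length stays at three.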
§Not Suited For
- Key-value storage - Use fjall or redb
- Variable-sized types - Types like String, Vec<T>, or dynamic structures
- ACID transactions - No transactional guarantees (use explicit rollback instead)
§Install
cargo add vecdb
§Quick Start
use vecdb::{
    AnyStoredVec, AnyVec, BytesVec, Database, WritableVec,
    ImportableVec, ReadableVec, Result, Version
};
use std::path::Path;

fn main() -> Result<()> {
    // Open database
    let db = Database::open(Path::new("data"))?;

    // Create vector with index type usize and value type u64
    let mut vec: BytesVec<usize, u64> =
        BytesVec::import(&db, "my_vec", Version::TWO)?;

    // Push values (buffered in memory)
    for i in 0..1_000_000 {
        vec.push(i);
    }

    // Flush writes to rawdb region and syncs to disk
    vec.flush()?; // Calls write() internally then flushes region
    db.flush()?; // Syncs database metadata

    // Sequential scan via fold
    let sum = vec.fold_range(0, vec.len(), 0u64, |acc, v| acc.wrapping_add(v));

    // Random access via reader
    let reader = vec.reader();
    for i in [500, 1000, 10] {
        println!("vec[{}] = {}", i, reader.get(i));
    }
    Ok(())
}
§Type Constraints
vecdb works with fixed-size types:
- Numeric primitives: u8, i32, f64, etc.
- Fixed arrays: [T; N]
- Structs with #[repr(C)]
- Types implementing zerocopy::FromBytes + zerocopy::AsBytes (for ZeroCopyVec)
- Types implementing Bytes trait (for BytesVec, LZ4Vec, ZstdVec)
- Numeric types implementing Pco trait (for PcoVec)
Use #[derive(Bytes)] or #[derive(Pco)] from vecdb_derive to enable custom wrapper types.
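One way to see why fixed-size types matter: when every element has the same byte width, the element at a given index sits at a predictable byte offset, so no offset table is required. The sketch below assumes a hypothetical FixedBytes trait and a read_at helper, both invented for illustration; vecdb's own Bytes trait may have a different shape.

```rust
/// Hypothetical trait for this sketch only; vecdb's `Bytes` trait may differ.
trait FixedBytes: Sized {
    /// Number of bytes every value of this type occupies.
    const SIZE: usize;
    fn write_le(&self, out: &mut Vec<u8>);
    fn read_le(bytes: &[u8]) -> Self;
}

impl FixedBytes for u64 {
    const SIZE: usize = 8;

    fn write_le(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.to_le_bytes());
    }

    fn read_le(bytes: &[u8]) -> Self {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&bytes[..8]);
        u64::from_le_bytes(buf)
    }
}

/// Element `index` starts at byte `index * T::SIZE`.
/// Returns `None` when the element would fall outside `storage`.
fn read_at<T: FixedBytes>(storage: &[u8], index: usize) -> Option<T> {
    let start = index.checked_mul(T::SIZE)?;
    let end = start.checked_add(T::SIZE)?;
    storage.get(start..end).map(T::read_le)
}
```

A String or Vec<T> has no such constant width, which is why variable-sized values do not fit this addressing scheme.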
§Vector Variants
§Raw (Uncompressed)
BytesVec<I, T> - Custom serialization via Bytes trait
use vecdb::{BytesVec, Bytes};
#[derive(Bytes)]
struct UserId(u64);
let mut vec: BytesVec<usize, UserId> =
    BytesVec::import(&db, "users", Version::TWO)?;
ZeroCopyVec<I, T> - Zero-copy mmap access (fastest random reads)
use vecdb::ZeroCopyVec;
let mut vec: ZeroCopyVec<usize, u32> =
    ZeroCopyVec::import(&db, "raw", Version::TWO)?;
§Compressed
PcoVec<I, T> - Pcodec compression (best for numeric data, excellent compression ratios)
use vecdb::PcoVec;
let mut vec: PcoVec<usize, f64> =
    PcoVec::import(&db, "prices", Version::TWO)?;
LZ4Vec<I, T> - LZ4 compression (fast, general-purpose)
use vecdb::LZ4Vec;
let mut vec: LZ4Vec<usize, [u8; 16]> =
    LZ4Vec::import(&db, "hashes", Version::TWO)?;
ZstdVec<I, T> - Zstd compression (high compression ratio, general-purpose)
use vecdb::ZstdVec;
let mut vec: ZstdVec<usize, u64> =
    ZstdVec::import(&db, "data", Version::TWO)?;
§Computed Vectors
EagerVec<V> - Wraps any stored vector to enable eager computation methods
Stores computed results on disk, incrementally updating when source data changes. Use for derived metrics, aggregations, transformations, moving averages, etc.
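The incremental idea can be sketched with plain slices: keep the derived results, and on each pass compute only the indices the derived vector does not hold yet. The function name extend_sma is hypothetical, and the treatment of the first few elements (a shorter window) is an assumption of this sketch, not a statement about how vecdb's compute methods behave.

```rust
/// Extends `derived` with the simple moving average of `source` over
/// `window` elements, computing only indices `derived` does not hold yet.
/// Returns how many new values were computed.
fn extend_sma(source: &[f64], derived: &mut Vec<f64>, window: usize) -> usize {
    assert!(window > 0, "window must be non-zero");
    let start = derived.len();
    for i in start..source.len() {
        // Assumption: near the start of the series, average over the
        // elements that exist instead of waiting for a full window.
        let from = (i + 1).saturating_sub(window);
        let slice = &source[from..=i];
        derived.push(slice.iter().sum::<f64>() / slice.len() as f64);
    }
    source.len().saturating_sub(start)
}
```

Calling it again after the source grows by one element computes exactly one new value; earlier results are reused as stored.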
use vecdb::EagerVec;
let mut derived: EagerVec<BytesVec<usize, f64>> =
    EagerVec::import(&db, "derived", Version::TWO)?;
// Compute methods store results on disk
// derived.compute_add(&source1, &source2)?;
// derived.compute_sma(&source, 20)?;
LazyVecFrom1/2/3<...> - Lazily computed vectors from 1-3 source vectors
Values computed on-the-fly during iteration, nothing stored on disk. Use for temporary views or simple transformations.
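As a rough mental model, a lazy vector is a source plus a function, with nothing materialized until a value is requested. The LazyMap type below is a hypothetical std-only illustration of that idea; it is not vecdb's LazyVecFrom1 and makes no claim about its internals.

```rust
/// Toy lazy view: borrows a source and holds a function; stores no results.
/// Hypothetical type for illustration only.
struct LazyMap<'a, S, F> {
    source: &'a [S],
    compute: F,
}

impl<'a, S: Copy, T, F: Fn(usize, S) -> T> LazyMap<'a, S, F> {
    fn new(source: &'a [S], compute: F) -> Self {
        Self { source, compute }
    }

    fn len(&self) -> usize {
        self.source.len()
    }

    /// Computes the value for `index` at the moment it is requested.
    fn get(&self, index: usize) -> Option<T> {
        self.source.get(index).map(|&v| (self.compute)(index, v))
    }

    /// Materializes every value; each call recomputes from the source.
    fn collect(&self) -> Vec<T> {
        (0..self.len()).filter_map(|i| self.get(i)).collect()
    }
}
```

The trade-off is the usual one: no storage cost and no staleness, at the price of recomputing on every read.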
use vecdb::LazyVecFrom1;
let lazy = LazyVecFrom1::init(
    "computed",
    Version::TWO,
    Box::new(source.clone()), // ReadableBoxedVec
    |_i, v| v * 2,
);
// Computed on-the-fly via ReadableVec trait, not stored
lazy.for_each(|value| {
    // ...
});
§Core Operations
§Write and Persistence
// Push values (buffered in memory)
vec.push(42);
vec.push(100);
// write() moves pushed values to storage (visible for reads)
vec.write()?;
// flush() calls write() + region().flush() for durability
vec.flush()?;
db.flush()?; // Also flush database metadata
§Updates and Deletions
// Update element at index (works on stored data)
vec.update(5, 999)?;
// Delete element (creates a hole at that index)
let reader = vec.create_reader();
vec.take(10, &reader)?;
drop(reader);
// Holes are tracked and can be checked
if vec.holes().contains(&10) {
println!("Index 10 is a hole");
}
// Reading a hole returns None
let reader = vec.create_reader();
assert_eq!(vec.get_any_or_read(10, &reader)?, None);
§Rollback with Stamps
Rollback uses stamped change deltas - lightweight compared to full snapshots.
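To make "change delta" concrete, here is a std-only sketch of the general technique: each stamped write records the previous length and the old values of any overwritten elements, which is enough to undo it. StampedVec, Delta, and their fields are hypothetical names; vecdb's actual delta format is not described here and may differ.

```rust
/// What one stamped write needs in order to be undone.
struct Delta {
    prev_stamp: u64,
    prev_len: usize,
    overwritten: Vec<(usize, u64)>, // (index, old value)
}

/// Toy vector with delta-based rollback; illustration only.
struct StampedVec {
    values: Vec<u64>,
    stamp: u64,
    committed_len: usize,           // length at the last stamped write
    overwritten: Vec<(usize, u64)>, // old values changed since then
    history: Vec<Delta>,
}

/// Restores old values (newest change first), then drops appended elements.
fn undo(values: &mut Vec<u64>, prev_len: usize, overwritten: Vec<(usize, u64)>) {
    for (index, old) in overwritten.into_iter().rev() {
        values[index] = old;
    }
    values.truncate(prev_len);
}

impl StampedVec {
    fn new() -> Self {
        Self { values: Vec::new(), stamp: 0, committed_len: 0, overwritten: Vec::new(), history: Vec::new() }
    }

    fn push(&mut self, value: u64) {
        self.values.push(value);
    }

    fn update(&mut self, index: usize, value: u64) {
        // Elements appended since the last stamp vanish on truncate,
        // so only older elements need their previous value recorded.
        if index < self.committed_len {
            self.overwritten.push((index, self.values[index]));
        }
        self.values[index] = value;
    }

    fn stamped_write(&mut self, stamp: u64) {
        let overwritten = std::mem::take(&mut self.overwritten);
        self.history.push(Delta { prev_stamp: self.stamp, prev_len: self.committed_len, overwritten });
        self.stamp = stamp;
        self.committed_len = self.values.len();
    }

    /// Discards unstamped changes, then undoes the most recent stamped write.
    /// Returns false when there is no stamped write left to undo.
    fn rollback(&mut self) -> bool {
        let pending = std::mem::take(&mut self.overwritten);
        undo(&mut self.values, self.committed_len, pending);
        let Some(delta) = self.history.pop() else { return false };
        self.stamp = delta.prev_stamp;
        self.committed_len = delta.prev_len;
        undo(&mut self.values, delta.prev_len, delta.overwritten);
        true
    }
}
```

The cost of a delta in this model is proportional to the number of overwritten elements, not to the size of the vector, which is the sense in which deltas are lighter than full snapshots.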
use vecdb::Stamp;
// Create initial state
vec.push(100);
vec.push(200);
vec.stamped_write_with_changes(Stamp::new(1))?;
// Make more changes
vec.push(300);
vec.update(0, 999)?;
vec.stamped_write_with_changes(Stamp::new(2))?;
// Rollback to previous stamp (undoes changes from stamp 2)
vec.rollback()?;
assert_eq!(vec.stamp(), Stamp::new(1));
// Rollback before a stamp (undoes everything including stamp 1)
vec.rollback_before(Stamp::new(1))?;
assert_eq!(vec.stamp(), Stamp::new(0));
Configure number of stamps to keep:
let options = (&db, "vec", Version::TWO)
    .into()
    .with_saved_stamped_changes(10); // Keep last 10 stamps
let vec = BytesVec::import_with(options)?;
§When To Use
Perfect for:
- Storing large Vecs persistently on disk
- Append-only or append-mostly workloads
- High-speed sequential reads
- High-speed random reads (improved with ZeroCopyVec)
- Space-efficient storage for numeric time series (improved with PcoVec)
- Sparse deletions without reindexing
- Lightweight rollback without full snapshots
- Derived computations stored on disk (with EagerVec)
Not ideal for:
- Heavy random write workloads
- Frequent insertions in the middle
- Variable-length data (strings, nested vectors)
- ACID transaction requirements
- Key-value lookups (use a proper key-value store)
§Feature Flags
No features are enabled by default. Enable only what you need:
cargo add vecdb # BytesVec only, no compression or optional features
Available features:
- pco - Pcodec compression support (PcoVec)
- zerocopy - Zero-copy mmap access (ZeroCopyVec)
- lz4 - LZ4 compression support (LZ4Vec)
- zstd - Zstd compression support (ZstdVec)
- derive - Derive macros for Bytes and Pco traits
- serde - Serde serialization support
- serde_json - JSON output using serde_json
- sonic-rs - Faster JSON using sonic-rs
With Pcodec compression:
cargo add vecdb --features pco,derive
With all compression formats:
cargo add vecdb --features pco,zerocopy,lz4,zstd,derive
§Examples
Comprehensive examples in examples/:
- zerocopy.rs - ZeroCopyVec with holes, updates, and rollback
- pcodec.rs - PcoVec with compression
Run examples:
cargo run --example zerocopy --features zerocopy
cargo run --example pcodec --features pco
§Benchmarks
10B sequential u64 values (80 GB), Apple Silicon, --release. Compression ratios reflect sequential data; real-world ratios will vary.
| Type | Disk | Write | Read |
|---|---|---|---|
| BytesVec | 80.0 GB | 1.8 GB/s | 6.7 GB/s |
| ZeroCopyVec | 80.0 GB | 1.7 GB/s | 6.7 GB/s |
| PcoVec | 181 MB | 0.4 GB/s | 7.7 GB/s |
| LZ4Vec | 40.1 GB | 0.4 GB/s | 3.0 GB/s |
| ZstdVec | 10.4 GB | 0.5 GB/s | 1.0 GB/s |
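The Disk column can be turned into compression ratios with simple arithmetic. Assuming decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes), 10B u64 values occupy 80 GB raw, which gives roughly 442x for PcoVec, 2.0x for LZ4Vec, and 7.7x for ZstdVec on this sequential dataset. The helper names below are hypothetical.

```rust
/// Uncompressed size, in decimal gigabytes, of `count` fixed-size values.
fn raw_size_gb(count: u64, bytes_per_value: u64) -> f64 {
    (count * bytes_per_value) as f64 / 1e9
}

/// Ratio of uncompressed size to on-disk size (both in the same unit).
fn compression_ratio(raw_gb: f64, disk_gb: f64) -> f64 {
    raw_gb / disk_gb
}
```

These ratios describe only the benchmark's sequential input; less regular data would be expected to compress less.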
cargo run --release --example bench -p vecdb --features pco,lz4,zstd,zerocopy
BENCH_COUNT=100_000_000 cargo run ... # smaller dataset
§PcoVec SIMD (x86_64)
For best PcoVec decompression on x86_64, enable BMI and AVX2:
RUSTFLAGS="-C target-feature=+bmi1,+bmi2,+avx2" cargo build --release
Structs§
- BytesStrategy - Serialization strategy using the Bytes trait with portable byte order.
- BytesVec - Raw storage vector using explicit byte serialization in little-endian format.
- CachedVec - Cached snapshot of a readable vec, refreshed when len or version changes.
- Cursor - Buffered reader that reuses an internal buffer across chunked read_into_at calls.
- Database - Memory-mapped database with region-based storage and hole punching.
- DeltaAvg - Rolling average from cumulative: (cum[h] - cum[start - 1]) / (h - start + 1)
- DeltaChange - Delta change: source[h] - source[start] via f64, allowing cross-type (unsigned → signed).
- DeltaRate - Delta rate (growth): (source[h] - source[start]) / source[start] via f64.
- DeltaSub - Rolling sum from cumulative: cum[h] - cum[start - 1]
- Divide - (a, b) -> a / b
- EagerVec - Wrapper for computing and storing derived values from source vectors.
- Exit - Graceful shutdown coordinator for ensuring data consistency during program exit.
- ExitGuard - Owned read guard for Exit. Can be moved across threads.
- Halve - v -> v / 2
- Header
- Ident - v -> v
- ImportOptions - Options for importing or creating stored vectors.
- LZ4Strategy - LZ4 compression strategy for fast compression/decompression.
- LZ4Vec - Compressed storage using LZ4 for speed-optimized general-purpose compression.
- LazyAggVec - Lazy aggregation vector that maps coarser output indices to ranges in a finer source.
- LazyDeltaVec - Lazily computed vector that combines a source value with a lookback value.
- LazyVecFrom1 - Lazily computed vector deriving values on-the-fly from one source vector.
- LazyVecFrom2 - Lazily computed vector deriving values from two source vectors.
- LazyVecFrom3 - Lazily computed vector deriving values from three source vectors.
- Minus - (a, b) -> a - b
- Negate - v -> -v
- PcoVec - Compressed storage using Pcodec for optimal numeric data compression.
- PcodecStrategy - Pcodec compression strategy for numerical data.
- Plus - (a, b) -> a + b
- ReadOnlyCompressedVec - Lean read-only view of a compressed vector (~48 bytes).
- ReadOnlyRawVec - Lean read-only view of a raw vector (~40 bytes).
- ReadWriteRawVec - Core implementation for raw storage vectors shared by BytesVec and ZeroCopyVec.
- Reader - Zero-copy reader with a snapshot of region start/len.
- Ro - Read-only mode. Stored<V> is V::ReadOnly, a lean clone for disk reads.
- Rw - Read-write mode. Stored<V> is the identity, that is, the full read-write vec.
- SharedLen - Atomic length counter shared across clones.
- Stamp - Marker for tracking when data was last modified.
- Times - (a, b) -> a * b
- VecIteratorWriter - Iterator-backed writer that formats values as CSV.
- VecReader - Read-only random-access handle into a raw vector’s stored data.
- Version - Version tracking for data schema and computed values.
- WithPrev - Tracks current and previous values for rollback support.
- ZeroCopyStrategy - Serialization strategy using zerocopy for native byte order access.
- ZeroCopyVec - Raw storage vector using zerocopy for direct memory mapping in native byte order.
- ZstdStrategy - Zstd compression strategy for high compression ratios.
- ZstdVec - Compressed storage using Zstd for maximum general-purpose compression.
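The DeltaSub and DeltaAvg formulas listed above rely on a cumulative (prefix-sum) series, so a window sum costs one subtraction regardless of window length. The sketch below works through that arithmetic on plain slices. The function names are hypothetical, and treating cum[start - 1] as 0 when start is 0 is an assumption of this sketch rather than documented vecdb behavior.

```rust
/// Inclusive prefix sums: cum[i] = source[0] + ... + source[i].
fn cumulative(source: &[f64]) -> Vec<f64> {
    source
        .iter()
        .scan(0.0, |acc, &v| {
            *acc += v;
            Some(*acc)
        })
        .collect()
}

/// Sum of source[start..=h], computed as cum[h] - cum[start - 1].
/// Assumption: the subtracted term is 0 when start == 0.
fn rolling_sum(cum: &[f64], start: usize, h: usize) -> f64 {
    let before = if start == 0 { 0.0 } else { cum[start - 1] };
    cum[h] - before
}

/// Average of source[start..=h]: the rolling sum over (h - start + 1) elements.
fn rolling_avg(cum: &[f64], start: usize, h: usize) -> f64 {
    rolling_sum(cum, start, h) / (h - start + 1) as f64
}
```

For the source 1, 2, 3, 4 the cumulative series is 1, 3, 6, 10, so the window from index 1 to 3 sums to 10 - 1 = 9 and averages to 3.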
Enums§
- Error - Error types for vecdb operations.
- Format - Storage format selection for stored vectors.
- RawDBError - Error types for rawdb operations.
Constants§
- HEADER_OFFSET
- PAGE_SIZE
- READ_CHUNK_SIZE - Default chunk size for chunked iteration (matches PcoVec page size).
Traits§
- AggFold - Aggregation strategy for LazyAggVec.
- AnyExportableVec - Type-erased trait for vectors that are both writable and serializable. This trait is automatically implemented for any type that implements both AnyVecWithWriter and AnySerializableVec.
- AnyReadableVec - Type-erased trait for collectable vectors.
- AnySerializableVec - Type-erased trait for serializable vectors.
- AnyStoredVec - Trait for stored vectors that persist data to disk (as opposed to lazy computed vectors).
- AnyVec - Common trait for all vectors providing metadata and utility methods.
- AnyVecWithSchema - Trait for vectors whose value type implements JsonSchema. Provides access to the JSON Schema of the value type.
- AnyVecWithWriter
- AsInnerSlice - Convert a slice of PcoVecValue to a slice of the underlying Number type.
- AsInnerSliceMut - Convert a mutable slice of PcoVecValue to a mutable slice of the underlying Number type.
- BinaryTransform - Trait for binary transforms applied lazily during iteration. Zero-sized types implementing this get monomorphized (zero runtime cost).
- Bytes - Trait for types that can be serialized to/from bytes with explicit byte order.
- BytesVecValue - Value trait for BytesVec. Extends RawVecValue with Bytes trait for custom serialization.
- CachedVecBudget - Budget gate for CachedVec materialization.
- CheckedSub
- CompressionStrategy - Trait for compression strategies used by ReadWriteCompressedVec.
- DeltaOp - Trait defining how to combine a current value with an earlier value.
- Formattable
- FromInnerSlice - Convert a Vec of Number type to a Vec of PcoVecValue.
- ImportableVec - Trait for types that can be imported from a database.
- LZ4VecValue - Value trait for LZ4Vec. Extends VecValue with Bytes trait for byte serialization.
- Pco
- PcoVecValue
- PrintableIndex - Provides string representations of index types for display and region naming.
- RawStrategy - Strategy for raw (uncompressed) storage vectors.
- ReadOnlyClone - Trait for creating read-only clones of composite types.
- ReadableCloneableVec - Trait for readable vectors that can be cloned as trait objects.
- ReadableOptionVec - Extension methods for ReadableVec<I, Option<T>>.
- ReadableVec - High-performance reading of vector values.
- SaturatingAdd
- StorageMode - Marker trait that selects between read-write and read-only storage.
- StoredVec - Super trait combining all common stored vec traits.
- TransparentPco
- TypedVec - A vector with statically-known index and value types.
- UnaryTransform - Trait for unary transforms applied lazily during iteration. Zero-sized types implementing this get monomorphized (zero runtime cost).
- ValueStrategy - Value serialization strategy shared by all vec types (raw and compressed).
- ValueWriter - Stateful writer for streaming values one at a time to a string buffer.
- VecIndex - Trait for types that can be used as vector indices.
- VecValue - Marker trait for types that can be stored as values in a vector.
- WritableVec - Typed interface for stored vectors (push, truncate, rollback).
- ZeroCopyVecValue - Value trait for ZeroCopyVec. Extends RawVecValue with zerocopy bounds for direct memory mapping.
- ZstdVecValue - Value trait for ZstdVec. Extends VecValue with Bytes trait for byte serialization.
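The note on UnaryTransform and BinaryTransform about zero-sized types being monomorphized can be illustrated in isolation. In the sketch below the trait signature is a hypothetical simplification (the real vecdb trait may take different parameters), and the local Halve and Negate types merely echo the struct names listed earlier. The transform is chosen as a type parameter, so the compiler generates a separate copy of the function per transform and calls it statically.

```rust
/// Hypothetical shape of a unary transform; vecdb's real signature may differ.
trait UnaryTransform<S, T> {
    fn apply(value: S) -> T;
}

/// Zero-sized marker types: no fields, only behavior.
struct Halve;
struct Negate;

impl UnaryTransform<i64, i64> for Halve {
    fn apply(value: i64) -> i64 {
        value / 2
    }
}

impl UnaryTransform<i64, i64> for Negate {
    fn apply(value: i64) -> i64 {
        -value
    }
}

/// Generic over the transform type: each instantiation is compiled
/// separately and calls `F::apply` directly, with no function pointer
/// or boxed closure stored anywhere.
fn map_with<F: UnaryTransform<i64, i64>>(source: &[i64]) -> Vec<i64> {
    source.iter().map(|&v| F::apply(v)).collect()
}
```

Because the marker types occupy zero bytes, selecting a transform this way adds nothing to the size of a value that is parameterized by it.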
Functions§
- i64_to_usize - Converts an i64 index to usize, supporting negative indexing. Negative indices count from the end.
- likely
- short_type_name - Extracts the short type name from a full type path and caches it.
- unlikely
- vec_region_name
- vec_region_name_with
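Negative indexing of the kind described for i64_to_usize is commonly implemented by adding the length to a negative index. The sketch below shows one such conversion under explicit assumptions: resolve_index is a hypothetical name, its signature is invented for this example, and clamping to 0 when a negative index reaches past the start is a choice made here, not documented behavior of the crate's function.

```rust
/// Resolves a possibly negative index against `len`.
/// Assumptions of this sketch: -1 refers to the last element, and a
/// negative index that reaches past the start clamps to 0.
fn resolve_index(index: i64, len: usize) -> usize {
    if index >= 0 {
        index as usize
    } else {
        len.saturating_sub(index.unsigned_abs() as usize)
    }
}
```

With a length of 10, an index of -1 resolves to 9 and an index of -10 resolves to 0.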
Type Aliases§
- BytesVecReader
- ComputeFrom1
- ComputeFrom2
- ComputeFrom3
- ReadableBoxedVec - Type alias for boxed read-only vectors.
- Result