§vecdb
A KISS (Keep It Simple, Stupid) index-value storage engine optimized for columnar data with transparent compression support.
§Overview
VecDB is an embedded database engine designed for high-performance columnar storage. It provides vector-like data structures that can be persisted to disk with optional compression, making it ideal for analytical workloads and time-series data.
§Key Features
- Columnar storage: Optimized for analytical queries and data compression
- Embedded: No separate server process - runs directly in your application
- Index-free: Uses array indices as keys, eliminating key storage overhead
- Value-focused: Only actual values are stored, maximizing space efficiency
- Dual storage modes: Choose between raw (fast access) or compressed (space efficient) storage
- Transactional: ACID-compliant operations with proper isolation
- Multi-reader/writer: Concurrent access support with fine-grained locking
- Performance-optimized: Non-portable design choices for maximum speed on supported platforms
- Unix-focused: Primarily designed for Unix-like systems
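The "index-free" and "value-focused" points can be illustrated with a minimal sketch, assuming fixed-width values. This is not VecDB's on-disk format: `FixedWidthColumn` and `VALUE_SIZE` are hypothetical names, and an in-memory `Cursor` stands in for a file. The position of a value is derived from its index, so no key is ever written.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Width in bytes of one stored value (a little-endian `u32` in this sketch).
const VALUE_SIZE: u64 = 4;

/// Hypothetical fixed-width column; the `Cursor` stands in for a file on disk.
pub struct FixedWidthColumn {
    storage: Cursor<Vec<u8>>,
    len: u64,
}

impl FixedWidthColumn {
    pub fn new() -> Self {
        Self { storage: Cursor::new(Vec::new()), len: 0 }
    }

    /// Appends a value and returns its index. Only the value bytes are written.
    pub fn push(&mut self, value: u32) -> io::Result<u64> {
        let index = self.len;
        self.storage.seek(SeekFrom::Start(index * VALUE_SIZE))?;
        self.storage.write_all(&value.to_le_bytes())?;
        self.len += 1;
        Ok(index)
    }

    /// Reads the value at `index` by seeking to `index * VALUE_SIZE`.
    pub fn get(&mut self, index: u64) -> io::Result<Option<u32>> {
        if index >= self.len {
            return Ok(None);
        }
        self.storage.seek(SeekFrom::Start(index * VALUE_SIZE))?;
        let mut buf = [0u8; 4];
        self.storage.read_exact(&mut buf)?;
        Ok(Some(u32::from_le_bytes(buf)))
    }

    /// Total bytes used: values only, with no per-entry key overhead.
    pub fn byte_len(&self) -> usize {
        self.storage.get_ref().len()
    }
}
```

Two pushed `u32` values occupy exactly eight bytes in this model, which is the space argument behind using array indices as keys.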
§Storage Variants
VecDB supports multiple vector implementations for different use cases:
§Raw Vectors (RawVec)
- Direct, uncompressed storage for maximum read/write speed
- Ideal for frequently accessed data and real-time applications
§Compressed Vectors (CompressedVec)
- Advanced compression using pco (Pcodec) for numerical data
- Significant space savings with acceptable performance trade-offs
- Perfect for analytical workloads and archival data
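To give a feel for why numerical columns tend to compress well, here is a small sketch using plain delta encoding. This is a swapped-in illustration, not pco's algorithm or format, and `delta_encode` and `delta_decode` are hypothetical helpers. Slowly changing values such as timestamps turn into small differences, which need far fewer bits than the originals.

```rust
/// Delta-encodes a column: each output is the difference from the previous
/// value, with the first value measured against zero.
pub fn delta_encode(values: &[u32]) -> Vec<i64> {
    let mut previous = 0i64;
    values
        .iter()
        .map(|&v| {
            let current = i64::from(v);
            let delta = current - previous;
            previous = current;
            delta
        })
        .collect()
}

/// Reverses `delta_encode` with a running sum.
pub fn delta_decode(deltas: &[i64]) -> Vec<u32> {
    let mut running = 0i64;
    deltas
        .iter()
        .map(|&d| {
            running += d;
            // Fits in u32 whenever the input came from `delta_encode`.
            running as u32
        })
        .collect()
}
```

A real codec would follow this kind of transform with a compact bit-level encoding of the small deltas; that step is omitted here.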
§Computed Vectors
- On-the-fly computation from other vectors
- Lazy evaluation for derived data sets
- Computations over one to three input vectors
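The idea of a lazily computed vector can be sketched as follows for the two-input case. `LazyFrom2` is a hypothetical type written only for illustration and does not reflect the crate's computed-vector API: it borrows two source columns and applies a closure when a value is requested, so nothing derived is stored.

```rust
/// Hypothetical lazily computed column derived from two source columns.
pub struct LazyFrom2<'a, A, B, F> {
    left: &'a [A],
    right: &'a [B],
    compute: F,
}

impl<'a, A, B, F> LazyFrom2<'a, A, B, F> {
    /// Wraps two source columns and the function that combines them.
    pub fn new<T>(left: &'a [A], right: &'a [B], compute: F) -> Self
    where
        F: Fn(&A, &B) -> T,
    {
        Self { left, right, compute }
    }

    /// The derived column is as long as the shorter of its inputs.
    pub fn len(&self) -> usize {
        self.left.len().min(self.right.len())
    }

    /// Computes the value at `index` on demand; no derived value is cached.
    pub fn get<T>(&self, index: usize) -> Option<T>
    where
        F: Fn(&A, &B) -> T,
    {
        let a = self.left.get(index)?;
        let b = self.right.get(index)?;
        Some((self.compute)(a, b))
    }
}
```

For example, wrapping a price column and a quantity column with a multiplying closure yields a notional-value column whose entries are computed only when `get` is called.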
§Eager/Lazy Variants
- Different loading and caching strategies
- Optimized for various memory and performance constraints
§Example Usage
§Raw Storage
use std::path::Path;
// Traits imported as in the full examples below; the vector methods are assumed to come from them
use vecdb::{AnyStoredVec, Database, GenericStoredVec, RawVec, Version};
let database = Database::open(Path::new("data"))?;
let mut vec: RawVec<usize, u32> = RawVec::forced_import(&database, "my_vec", Version::TWO)?;
// Push values (they count towards pushed_len() until flushed)
vec.push(42);
vec.push(84);
// Read values
let reader = vec.create_reader();
let value = vec.get_or_read(0, &reader)?; // Returns Result<Option<Cow<u32>>>
// Persist to disk
vec.flush()?;
§Compressed Storage
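use std::path::Path;
// Traits imported as in the full examples below; the vector methods are assumed to come from them
use vecdb::{AnyStoredVec, CompressedVec, Database, GenericStoredVec, Version};
let database = Database::open(Path::new("data"))?;
let mut vec: CompressedVec<usize, u32> = CompressedVec::forced_import(&database, "compressed_vec", Version::TWO)?;
// Same API as raw vectors, but with compression
vec.push(1000);
vec.flush()?;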
§Architecture
VecDB is built on top of SeqDB for low-level storage management and provides:
- Type-safe interfaces: Generic vector types with compile-time type checking
- Versioning system: Schema evolution and backward compatibility
- Stamping mechanism: Track data freshness and updates
- Hole management: Efficient handling of deleted elements
- Iterator support: Standard Rust iterator patterns for data access
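Hole management can be pictured with a minimal in-memory sketch. `HoleyVec` is a hypothetical type, not the crate's implementation: removing a value leaves its slot in place and records the index in a `BTreeSet`, so the indices of all other values stay stable, and a later update fills the hole again.

```rust
use std::collections::BTreeSet;

/// Hypothetical vector that tracks removed slots instead of shifting elements.
pub struct HoleyVec<T> {
    values: Vec<Option<T>>,
    holes: BTreeSet<usize>,
}

impl<T> HoleyVec<T> {
    pub fn new() -> Self {
        Self { values: Vec::new(), holes: BTreeSet::new() }
    }

    /// Appends a value and returns its index.
    pub fn push(&mut self, value: T) -> usize {
        self.values.push(Some(value));
        self.values.len() - 1
    }

    /// Removes the value at `index` and records a hole; other indices are unchanged.
    pub fn take(&mut self, index: usize) -> Option<T> {
        let taken = self.values.get_mut(index)?.take();
        if taken.is_some() {
            self.holes.insert(index);
        }
        taken
    }

    /// Overwrites the slot at `index`, filling a hole if there was one.
    /// Returns `false` when `index` is out of bounds.
    pub fn update(&mut self, index: usize, value: T) -> bool {
        match self.values.get_mut(index) {
            Some(slot) => {
                *slot = Some(value);
                self.holes.remove(&index);
                true
            }
            None => false,
        }
    }

    /// Returns `None` for holes and for out-of-bounds indices.
    pub fn get(&self, index: usize) -> Option<&T> {
        self.values.get(index)?.as_ref()
    }

    /// Indices that are currently empty, in ascending order.
    pub fn holes(&self) -> &BTreeSet<usize> {
        &self.holes
    }
}
```

Keeping the holes in an ordered set makes it cheap to check whether an index is empty and to list the free slots in order.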
§Use Cases
- Time-series databases
- Analytical data processing
- Scientific computing datasets
- Financial market data
- IoT sensor data storage
- Any scenario requiring fast columnar access patterns
VecDB excels when you need the performance of in-memory data structures with the durability of persistent storage.
§Examples
§Raw
use std::{borrow::Cow, collections::BTreeSet, fs, path::Path};
use vecdb::{
    AnyStoredVec, AnyVec, CollectableVec, Database, GenericStoredVec, RawVec, Stamp, VecIterator,
    Version,
};
#[allow(clippy::upper_case_acronyms)]
type VEC = RawVec<usize, u32>;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Start from a clean directory
    let _ = fs::remove_dir_all("raw");
    let version = Version::TWO;
    let database = Database::open(Path::new("raw"))?;
    // Step 1: push 0..=20, read back through the iterator, then flush
    {
        let mut vec: VEC = RawVec::forced_import(&database, "vec", version)?;
        (0..21_u32).for_each(|v| {
            vec.push(v);
        });
        let mut iter = vec.into_iter();
        assert!(iter.get(0) == Some(Cow::Borrowed(&0)));
        assert!(iter.get(1) == Some(Cow::Borrowed(&1)));
        assert!(iter.get(2) == Some(Cow::Borrowed(&2)));
        assert!(iter.get(20) == Some(Cow::Borrowed(&20)));
        assert!(iter.get(21).is_none());
        drop(iter);
        vec.flush()?;
        assert!(vec.header().stamp() == Stamp::new(0));
    }
    // Step 2: reopen, update the stamp, and push two more values
    {
        let mut vec: VEC = RawVec::forced_import(&database, "vec", version)?;
        vec.mut_header().update_stamp(Stamp::new(100));
        assert!(vec.header().stamp() == Stamp::new(100));
        let mut iter = vec.into_iter();
        assert!(iter.get(0) == Some(Cow::Borrowed(&0)));
        assert!(iter.get(1) == Some(Cow::Borrowed(&1)));
        assert!(iter.get(2) == Some(Cow::Borrowed(&2)));
        assert!(iter.get(3) == Some(Cow::Borrowed(&3)));
        assert!(iter.get(4) == Some(Cow::Borrowed(&4)));
        assert!(iter.get(5) == Some(Cow::Borrowed(&5)));
        assert!(iter.get(20) == Some(Cow::Borrowed(&20)));
        assert!(iter.get(20) == Some(Cow::Borrowed(&20)));
        assert!(iter.get(0) == Some(Cow::Borrowed(&0)));
        drop(iter);
        vec.push(21);
        vec.push(22);
        // Pushed values are counted separately from stored ones until flushed
        assert!(vec.stored_len() == 21);
        assert!(vec.pushed_len() == 2);
        assert!(vec.len() == 23);
        let mut iter = vec.into_iter();
        assert!(iter.get(20) == Some(Cow::Borrowed(&20)));
        assert!(iter.get(21) == Some(Cow::Borrowed(&21)));
        assert!(iter.get(22) == Some(Cow::Borrowed(&22)));
        assert!(iter.get(23).is_none());
        drop(iter);
        vec.flush()?;
    }
    // Step 3: the stamp and the pushed values persisted; truncate to 14 elements
    {
        let mut vec: VEC = RawVec::forced_import(&database, "vec", version)?;
        assert!(vec.header().stamp() == Stamp::new(100));
        assert!(vec.stored_len() == 23);
        assert!(vec.pushed_len() == 0);
        assert!(vec.len() == 23);
        let mut iter = vec.into_iter();
        assert!(iter.get(0) == Some(Cow::Borrowed(&0)));
        assert!(iter.get(20) == Some(Cow::Borrowed(&20)));
        assert!(iter.get(21) == Some(Cow::Borrowed(&21)));
        assert!(iter.get(22) == Some(Cow::Borrowed(&22)));
        drop(iter);
        vec.truncate_if_needed(14)?;
        assert_eq!(vec.stored_len(), 14);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 14);
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), None);
        drop(iter);
        // A negative start counts from the end: the last five elements
        assert_eq!(
            vec.collect_signed_range(Some(-5), None)?,
            vec![9, 10, 11, 12, 13]
        );
        vec.push(vec.len() as u32);
        assert_eq!(
            VecIterator::last(vec.into_iter()),
            Some((14, Cow::Borrowed(&14)))
        );
        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );
        vec.flush()?;
    }
    // Step 4: reset, refill with 0..=20, then take index 10 to leave a hole
    {
        let mut vec: VEC = RawVec::forced_import(&database, "vec", version)?;
        assert_eq!(
            VecIterator::last(vec.into_iter()),
            Some((14, Cow::Borrowed(&14)))
        );
        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );
        vec.reset()?;
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 0);
        (0..21_u32).for_each(|v| {
            vec.push(v);
        });
        assert_eq!(vec.pushed_len(), 21);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 21);
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert!(iter.get(21).is_none());
        drop(iter);
        let reader = vec.create_static_reader();
        assert_eq!(vec.take(10, &reader)?, Some(10));
        assert_eq!(vec.holes(), &BTreeSet::from([10]));
        assert!(vec.get_or_read(10, &reader)?.is_none());
        drop(reader);
        vec.flush()?;
        assert!(vec.holes() == &BTreeSet::from([10]));
    }
    // Step 5: the hole survived the flush; update() fills it
    {
        let mut vec: VEC = RawVec::forced_import(&database, "vec", version)?;
        assert!(vec.holes() == &BTreeSet::from([10]));
        let reader = vec.create_static_reader();
        assert!(vec.get_or_read(10, &reader)?.is_none());
        drop(reader);
        vec.update(10, 10)?;
        vec.update(0, 10)?;
        let reader = vec.create_static_reader();
        assert_eq!(vec.holes(), &BTreeSet::new());
        assert_eq!(vec.get_or_read(0, &reader)?, Some(Cow::Borrowed(&10)));
        assert_eq!(vec.get_or_read(10, &reader)?, Some(Cow::Borrowed(&10)));
        drop(reader);
        vec.flush()?;
    }
    // Step 6: final contents, with index 0 overwritten by 10
    {
        let vec: VEC = RawVec::forced_import(&database, "vec", version)?;
        assert_eq!(
            vec.collect()?,
            vec![
                10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
            ]
        );
    }
    Ok(())
}
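The Raw example above distinguishes `stored_len()`, `pushed_len()` and `len()`. A toy in-memory model, assuming the split works the way the assertions suggest, may make the bookkeeping easier to follow. `BufferedVec` is hypothetical and is not how the crate is implemented: one `Vec` stands in for values already on disk, another for values pushed since the last flush.

```rust
/// Hypothetical model of a vector with a persisted part and an in-memory tail.
#[derive(Default)]
pub struct BufferedVec {
    /// Stands in for values already written to disk.
    stored: Vec<u32>,
    /// Values pushed since the last flush.
    pushed: Vec<u32>,
}

impl BufferedVec {
    pub fn push(&mut self, value: u32) {
        self.pushed.push(value);
    }

    pub fn stored_len(&self) -> usize {
        self.stored.len()
    }

    pub fn pushed_len(&self) -> usize {
        self.pushed.len()
    }

    /// Logical length: persisted values plus the unflushed tail.
    pub fn len(&self) -> usize {
        self.stored.len() + self.pushed.len()
    }

    /// Looks in the stored part first, then in the pushed tail.
    pub fn get(&self, index: usize) -> Option<u32> {
        match index.checked_sub(self.stored.len()) {
            None => self.stored.get(index).copied(),
            Some(offset) => self.pushed.get(offset).copied(),
        }
    }

    /// Moves the pushed tail into the stored part.
    pub fn flush(&mut self) {
        self.stored.append(&mut self.pushed);
    }

    /// Shrinks to `new_len` if the vector is longer; otherwise does nothing.
    pub fn truncate_if_needed(&mut self, new_len: usize) {
        if new_len >= self.len() {
            return;
        }
        if new_len <= self.stored.len() {
            self.stored.truncate(new_len);
            self.pushed.clear();
        } else {
            self.pushed.truncate(new_len - self.stored.len());
        }
    }
}
```

In this model, pushing two values onto 21 flushed ones gives a stored length of 21, a pushed length of 2 and a total length of 23, matching the assertions in the example.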
§Compressed
use std::{borrow::Cow, collections::BTreeSet, fs, path::Path};
use vecdb::{
    AnyStoredVec, AnyVec, CollectableVec, CompressedVec, Database, GenericStoredVec, Stamp,
    VecIterator, Version,
};
#[allow(clippy::upper_case_acronyms)]
type VEC = CompressedVec<usize, u32>;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Start from a clean directory
    let _ = fs::remove_dir_all("compressed");
    let version = Version::TWO;
    let database = Database::open(Path::new("compressed"))?;
    // Step 1: push 0..=20, read back through the iterator, then flush
    {
        let mut vec: VEC = CompressedVec::forced_import(&database, "vec", version)?;
        (0..21_u32).for_each(|v| {
            vec.push(v);
        });
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(1), Some(Cow::Borrowed(&1)));
        assert_eq!(iter.get(2), Some(Cow::Borrowed(&2)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), None);
        drop(iter);
        vec.flush()?;
        assert_eq!(vec.header().stamp(), Stamp::new(0));
    }
    // Step 2: reopen, update the stamp, and push two more values
    {
        let mut vec: VEC = CompressedVec::forced_import(&database, "vec", version)?;
        vec.mut_header().update_stamp(Stamp::new(100));
        assert!(vec.header().stamp() == Stamp::new(100));
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(1), Some(Cow::Borrowed(&1)));
        assert_eq!(iter.get(2), Some(Cow::Borrowed(&2)));
        assert_eq!(iter.get(3), Some(Cow::Borrowed(&3)));
        assert_eq!(iter.get(4), Some(Cow::Borrowed(&4)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        drop(iter);
        vec.push(21);
        vec.push(22);
        // Pushed values are counted separately from stored ones until flushed
        assert_eq!(vec.stored_len(), 21);
        assert_eq!(vec.pushed_len(), 2);
        assert_eq!(vec.len(), 23);
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), Some(Cow::Borrowed(&21)));
        assert_eq!(iter.get(22), Some(Cow::Borrowed(&22)));
        assert_eq!(iter.get(23), None);
        drop(iter);
        vec.flush()?;
    }
    // Step 3: the stamp and the pushed values persisted; truncate to 14, then push 14
    {
        let mut vec: VEC = CompressedVec::forced_import(&database, "vec", version)?;
        assert_eq!(vec.header().stamp(), Stamp::new(100));
        assert_eq!(vec.stored_len(), 23);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 23);
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), Some(Cow::Borrowed(&21)));
        assert_eq!(iter.get(22), Some(Cow::Borrowed(&22)));
        drop(iter);
        vec.truncate_if_needed(14)?;
        assert_eq!(vec.stored_len(), 14);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 14);
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), None);
        drop(iter);
        // A negative start counts from the end: the last five elements
        assert_eq!(
            vec.collect_signed_range(Some(-5), None)?,
            vec![9, 10, 11, 12, 13]
        );
        vec.push(vec.len() as u32);
        assert_eq!(
            VecIterator::last(vec.into_iter()),
            Some((14, Cow::Borrowed(&14)))
        );
        vec.flush()?;
        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );
    }
    // Step 4: reopen with 15 stored values, then reset and refill with 0..=20
    {
        let mut vec: VEC = CompressedVec::forced_import(&database, "vec", version)?;
        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), None);
        drop(iter);
        assert_eq!(
            vec.collect_signed_range(Some(-5), None)?,
            vec![10, 11, 12, 13, 14]
        );
        vec.reset()?;
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 0);
        (0..21_u32).for_each(|v| {
            vec.push(v);
        });
        assert_eq!(vec.pushed_len(), 21);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 21);
        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), None);
        drop(iter);
        vec.flush()?;
    }
    // Step 5: reopen; all 21 values are stored and there are no holes
    {
        let mut vec: VEC = CompressedVec::forced_import(&database, "vec", version)?;
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.stored_len(), 21);
        assert_eq!(vec.len(), 21);
        let reader = vec.create_static_reader();
        assert_eq!(vec.holes(), &BTreeSet::new());
        assert_eq!(vec.get_or_read(0, &reader)?, Some(Cow::Borrowed(&0)));
        assert_eq!(vec.get_or_read(10, &reader)?, Some(Cow::Borrowed(&10)));
        drop(reader);
        vec.flush()?;
    }
    // Step 6: final contents
    {
        let vec: VEC = CompressedVec::forced_import(&database, "vec", version)?;
        assert!(
            vec.collect()?
                == vec![
                    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                ]
        );
    }
    Ok(())
}
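Both examples call `collect_signed_range(Some(-5), None)` to fetch the last five elements. The sketch below shows one way such signed bounds could be resolved against a length. It assumes that negative bounds count from the end and that out-of-range bounds are clamped; `resolve_signed` and `signed_range` are hypothetical helpers operating on a plain slice, and the crate's actual parameter types and edge-case behavior may differ.

```rust
/// Resolves a possibly negative bound against `len`.
/// Negative values count back from the end; results are clamped to `0..=len`.
fn resolve_signed(bound: i64, len: usize) -> usize {
    if bound < 0 {
        len.saturating_sub(bound.unsigned_abs() as usize)
    } else {
        (bound as usize).min(len)
    }
}

/// Collects `values[from..to]`, where either bound may be negative or omitted.
pub fn signed_range(values: &[u32], from: Option<i64>, to: Option<i64>) -> Vec<u32> {
    let len = values.len();
    let start = from.map_or(0, |b| resolve_signed(b, len));
    let end = to.map_or(len, |b| resolve_signed(b, len));
    if start >= end {
        return Vec::new();
    }
    values[start..end].to_vec()
}
```

With 14 elements, a start of -5 resolves to index 9, which yields `[9, 10, 11, 12, 13]` as in the examples; with 15 elements it resolves to index 10.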
§Structs
§Enums
§Traits
- AnyCloneableIterableVec
- AnyCollectableVec
- AnyIterableVec
- AnyStoredIterableVec
- AnyStoredVec
- AnyVec
- AsInnerSlice
- BaseVecIterator
- CheckedSub
- CollectableVec
- FromCoarserIndex
- FromInnerSlice
- GenericStoredVec
- Printable
- StoredCompressed
- StoredIndex
- StoredRaw
- TransparentStoredCompressed
- VecIterator