Crate vecdb

§vecdb

A KISS (Keep It Simple, Stupid) index-value storage engine optimized for columnar data with transparent compression support.

§Overview

VecDB is an embedded database engine designed for high-performance columnar storage. It provides vector-like data structures that can be persisted to disk with optional compression, making it ideal for analytical workloads and time-series data.

§Key Features

  • Columnar storage: Optimized for analytical queries and data compression
  • Embedded: No separate server process - runs directly in your application
  • Index-free: Uses array indices as keys, eliminating key storage overhead
  • Value-focused: Only actual values are stored, maximizing space efficiency
  • Dual storage modes: Choose between raw (fast access) or compressed (space efficient) storage
  • Transactional: ACID-compliant operations with proper isolation
  • Multi-reader/writer: Concurrent access support with fine-grained locking
  • Performance-optimized: Non-portable design choices for maximum speed on supported platforms
  • Unix-focused: Primarily designed for Unix-like systems
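
To make the "index-free" and "value-focused" points concrete, here is a minimal sketch of a fixed-width column in which the array index alone determines where a value lives. It uses only the standard library; `FixedColumn` and its methods are hypothetical names for illustration and are not part of the vecdb API, whose on-disk layout may differ.

```rust
use std::convert::TryInto;

/// Hypothetical fixed-width column (not a vecdb type): values are stored
/// back to back, so the index is the only "key" and is never written out.
struct FixedColumn {
    bytes: Vec<u8>,
}

impl FixedColumn {
    /// Width in bytes of one stored value.
    const WIDTH: usize = std::mem::size_of::<u32>();

    fn new() -> Self {
        Self { bytes: Vec::new() }
    }

    /// Number of values currently stored.
    fn len(&self) -> usize {
        self.bytes.len() / Self::WIDTH
    }

    /// Appends a value and returns the index it was stored at.
    fn push(&mut self, value: u32) -> usize {
        let index = self.len();
        self.bytes.extend_from_slice(&value.to_le_bytes());
        index
    }

    /// Reads the value at `index` (byte offset = index * WIDTH),
    /// or returns `None` if the index is out of bounds.
    fn get(&self, index: usize) -> Option<u32> {
        let start = index.checked_mul(Self::WIDTH)?;
        let end = start.checked_add(Self::WIDTH)?;
        let chunk = self.bytes.get(start..end)?;
        Some(u32::from_le_bytes(chunk.try_into().ok()?))
    }
}

fn main() {
    let mut column = FixedColumn::new();
    column.push(42);
    column.push(84);

    assert_eq!(column.get(1), Some(84));
    assert_eq!(column.get(2), None);
    // Only the values take up space: 2 values * 4 bytes each.
    assert_eq!(column.bytes.len(), 8);
}
```

Because no key is stored next to each value, the storage cost in this sketch is exactly `len * WIDTH` bytes.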

§Storage Variants

VecDB supports multiple vector implementations for different use cases:

§Raw Vectors (RawVec)

  • Direct, uncompressed storage for maximum read/write speed
  • Ideal for frequently accessed data and real-time applications

§Compressed Vectors (CompressedVec)

  • Advanced compression using pco (Pcodec) for numerical data
  • Trades some read/write speed for a significantly smaller on-disk footprint
  • Well suited to analytical workloads and archival data
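
As a rough intuition for why numerical columns compress well, the sketch below applies plain delta encoding to a slowly increasing sequence. This is a deliberately simple stand-in technique and not the algorithm pco uses; the function names are hypothetical and nothing here calls into pco or vecdb.

```rust
/// Delta-encodes a column: each output is the wrapping difference between
/// a value and its predecessor (the first value is diffed against 0).
fn delta_encode(values: &[u32]) -> Vec<u32> {
    let mut prev = 0u32;
    values
        .iter()
        .map(|&v| {
            let delta = v.wrapping_sub(prev);
            prev = v;
            delta
        })
        .collect()
}

/// Reverses `delta_encode` by accumulating the deltas.
fn delta_decode(deltas: &[u32]) -> Vec<u32> {
    let mut acc = 0u32;
    deltas
        .iter()
        .map(|&d| {
            acc = acc.wrapping_add(d);
            acc
        })
        .collect()
}

/// Number of bits needed to represent the largest value in the slice.
fn bits_needed(values: &[u32]) -> u32 {
    values
        .iter()
        .map(|v| 32 - v.leading_zeros())
        .max()
        .unwrap_or(0)
}

fn main() {
    // Evenly spaced readings, as in a typical time series.
    let values = vec![1_000_000u32, 1_000_010, 1_000_020, 1_000_030];
    let deltas = delta_encode(&values);

    // The encoding is lossless.
    assert_eq!(delta_decode(&deltas), values);

    // After the first entry, each delta needs far fewer bits than the raw value.
    assert!(bits_needed(&deltas[1..]) < bits_needed(&values));
}
```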

§Computed Vectors

  • On-the-fly computation from other vectors
  • Lazy evaluation for derived data sets
  • Support for 1-3 input vector computations
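
The idea of deriving a vector lazily from other vectors can be sketched with the standard library alone. `LazyFrom2` below is a hypothetical type written for this illustration, with two inputs as an arbitrary choice; it is not the crate's `LazyVecFrom2`, whose constructor and signatures are not shown in this page.

```rust
/// Hypothetical lazy view over two source columns (not a vecdb type).
/// Nothing is computed or stored until a value is requested.
struct LazyFrom2<'a, A, B, F> {
    left: &'a [A],
    right: &'a [B],
    compute: F,
}

impl<'a, A, B, T, F> LazyFrom2<'a, A, B, F>
where
    F: Fn(&A, &B) -> T,
{
    fn new(left: &'a [A], right: &'a [B], compute: F) -> Self {
        Self { left, right, compute }
    }

    /// The derived column is as long as the shorter of its two inputs.
    fn len(&self) -> usize {
        self.left.len().min(self.right.len())
    }

    /// Computes the value at `index` on demand.
    fn get(&self, index: usize) -> Option<T> {
        let a = self.left.get(index)?;
        let b = self.right.get(index)?;
        Some((self.compute)(a, b))
    }

    /// Materializes the whole derived column.
    fn collect(&self) -> Vec<T> {
        (0..self.len()).filter_map(|i| self.get(i)).collect()
    }
}

fn main() {
    let price = [10u32, 20, 30];
    let quantity = [1u32, 2, 3];

    // The derived column holds no data of its own.
    let total = LazyFrom2::new(&price[..], &quantity[..], |p: &u32, q: &u32| p * q);

    assert_eq!(total.get(1), Some(40));
    assert_eq!(total.collect(), vec![10, 40, 90]);
}
```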

§Eager/Lazy Variants

  • Different loading and caching strategies
  • Optimized for various memory and performance constraints

§Example Usage

§Raw Storage

use std::path::Path;
use vecdb::{RawVec, Database, Version};

let database = Database::open(Path::new("data"))?;
let mut vec: RawVec<usize, u32> = RawVec::forced_import(&database, "my_vec", Version::TWO)?;

// Push values
vec.push(42);
vec.push(84);

// Read values
let reader = vec.create_reader();
let value = vec.get_or_read(0, &reader)?; // Returns Result<Option<Cow<u32>>>

// Persist to disk
vec.flush()?;

§Compressed Storage

use std::path::Path;
use vecdb::{CompressedVec, Database, Version};

let database = Database::open(Path::new("data"))?;
let mut vec: CompressedVec<usize, u32> = CompressedVec::forced_import(&database, "compressed_vec", Version::TWO)?;

// Same API as raw vectors, but with compression
vec.push(1000);
vec.flush()?;
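
Both snippets push values and then call flush(). The longer examples further down distinguish pushed_len() from stored_len(), with len() being their sum. The following sketch models that split with an in-memory stand-in for the stored region. `BufferedVec` is a hypothetical type; how vecdb actually buffers and writes data is an assumption here.

```rust
/// Hypothetical write-buffered column (not a vecdb type). `stored` stands in
/// for the persisted region; a real engine would write to a file instead.
struct BufferedVec {
    stored: Vec<u32>,
    pushed: Vec<u32>,
}

impl BufferedVec {
    fn new() -> Self {
        Self { stored: Vec::new(), pushed: Vec::new() }
    }

    /// Buffers a value in memory; nothing is "persisted" yet.
    fn push(&mut self, value: u32) {
        self.pushed.push(value);
    }

    /// Moves all buffered values into the stored region.
    fn flush(&mut self) {
        self.stored.append(&mut self.pushed);
    }

    fn stored_len(&self) -> usize {
        self.stored.len()
    }

    fn pushed_len(&self) -> usize {
        self.pushed.len()
    }

    /// Total logical length: stored values followed by buffered ones.
    fn len(&self) -> usize {
        self.stored_len() + self.pushed_len()
    }

    /// Reads from the stored region first, then from the buffer.
    fn get(&self, index: usize) -> Option<u32> {
        match index.checked_sub(self.stored.len()) {
            None => self.stored.get(index).copied(),
            Some(offset) => self.pushed.get(offset).copied(),
        }
    }
}

fn main() {
    let mut vec = BufferedVec::new();
    vec.push(42);
    vec.push(84);
    assert_eq!((vec.stored_len(), vec.pushed_len(), vec.len()), (0, 2, 2));

    vec.flush();
    assert_eq!((vec.stored_len(), vec.pushed_len(), vec.len()), (2, 0, 2));
    assert_eq!(vec.get(1), Some(84));
}
```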

§Architecture

VecDB is built on top of SeqDB for low-level storage management and provides:

  • Type-safe interfaces: Generic vector types with compile-time type checking
  • Versioning system: Schema evolution and backward compatibility
  • Stamping mechanism: Track data freshness and updates
  • Hole management: Efficient handling of deleted elements
  • Iterator support: Standard Rust iterator patterns for data access
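
Hole management can be pictured as a set of deleted indices kept next to the values, so that deleting an element never shifts its neighbours. The sketch below is a minimal model under that assumption. `HoledVec` is hypothetical, and although its method names echo take, update and collect_holed from the examples on this page, the signatures and behaviour are simplified and are not vecdb's.

```rust
use std::collections::BTreeSet;

/// Hypothetical vector with holes (not a vecdb type).
struct HoledVec {
    values: Vec<u32>,
    holes: BTreeSet<usize>,
}

impl HoledVec {
    fn from_values(values: Vec<u32>) -> Self {
        Self { values, holes: BTreeSet::new() }
    }

    /// Removes the value at `index`, leaving a hole so that other indices
    /// keep their meaning. Returns `None` if out of bounds or already a hole.
    fn take(&mut self, index: usize) -> Option<u32> {
        if index >= self.values.len() || !self.holes.insert(index) {
            return None;
        }
        Some(self.values[index])
    }

    /// Overwrites the value at `index` and fills the hole there, if any.
    /// Returns `false` when `index` is out of bounds.
    fn update(&mut self, index: usize, value: u32) -> bool {
        match self.values.get_mut(index) {
            Some(slot) => {
                *slot = value;
                self.holes.remove(&index);
                true
            }
            None => false,
        }
    }

    /// Reads a value, treating holes as absent.
    fn get(&self, index: usize) -> Option<u32> {
        if self.holes.contains(&index) {
            None
        } else {
            self.values.get(index).copied()
        }
    }

    /// Returns every slot in order, with `None` marking a hole.
    fn collect_holed(&self) -> Vec<Option<u32>> {
        (0..self.values.len()).map(|i| self.get(i)).collect()
    }
}

fn main() {
    let mut vec = HoledVec::from_values(vec![0, 1, 2, 3]);

    assert_eq!(vec.take(1), Some(1));
    assert_eq!(vec.collect_holed(), vec![Some(0), None, Some(2), Some(3)]);

    // Updating a hole fills it again.
    assert!(vec.update(1, 7));
    assert_eq!(vec.get(1), Some(7));
    assert!(vec.holes.is_empty());
}
```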

§Use Cases

  • Time-series databases
  • Analytical data processing
  • Scientific computing datasets
  • Financial market data
  • IoT sensor data storage
  • Any scenario requiring fast columnar access patterns

VecDB excels when you need the performance of in-memory data structures with the durability of persistent storage.

§Examples

§Raw

use std::{borrow::Cow, collections::BTreeSet, fs, path::Path};

use vecdb::{
    AnyStoredVec, AnyVec, CollectableVec, Database, GenericStoredVec, RawVec, Stamp, VecIterator,
    Version,
};

#[allow(clippy::upper_case_acronyms)]
type VEC = RawVec<usize, u32>;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let _ = fs::remove_dir_all("raw");

    let version = Version::TWO;

    let database = Database::open(Path::new("raw"))?;

    let mut options = (&database, "vec", version).into();

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        (0..21_u32).for_each(|v| {
            vec.push(v);
        });

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(1), Some(Cow::Borrowed(&1)));
        assert_eq!(iter.get(2), Some(Cow::Borrowed(&2)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert!(iter.get(21).is_none());
        drop(iter);

        vec.flush()?;

        assert_eq!(vec.header().stamp(), Stamp::new(0));
    }

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        vec.mut_header().update_stamp(Stamp::new(100));

        assert_eq!(vec.header().stamp(), Stamp::new(100));

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(1), Some(Cow::Borrowed(&1)));
        assert_eq!(iter.get(2), Some(Cow::Borrowed(&2)));
        assert_eq!(iter.get(3), Some(Cow::Borrowed(&3)));
        assert_eq!(iter.get(4), Some(Cow::Borrowed(&4)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        drop(iter);

        vec.push(21);
        vec.push(22);

        assert_eq!(vec.stored_len(), 21);
        assert_eq!(vec.pushed_len(), 2);
        assert_eq!(vec.len(), 23);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), Some(Cow::Borrowed(&21)));
        assert_eq!(iter.get(22), Some(Cow::Borrowed(&22)));
        assert!(iter.get(23).is_none());
        drop(iter);

        vec.flush()?;
    }

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        assert_eq!(vec.header().stamp(), Stamp::new(100));

        assert_eq!(vec.stored_len(), 23);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 23);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), Some(Cow::Borrowed(&21)));
        assert_eq!(iter.get(22), Some(Cow::Borrowed(&22)));
        drop(iter);

        vec.truncate_if_needed(14)?;

        assert_eq!(vec.stored_len(), 14);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 14);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), None);
        drop(iter);

        assert_eq!(
            vec.collect_signed_range(Some(-5), None)?,
            vec![9, 10, 11, 12, 13]
        );

        vec.push(vec.len() as u32);
        assert_eq!(
            VecIterator::last(vec.into_iter()),
            Some((14, Cow::Borrowed(&14)))
        );

        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );

        vec.flush()?;
    }

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        assert_eq!(
            VecIterator::last(vec.into_iter()),
            Some((14, Cow::Borrowed(&14)))
        );

        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );

        vec.reset()?;

        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 0);

        (0..21_u32).for_each(|v| {
            vec.push(v);
        });

        assert_eq!(vec.pushed_len(), 21);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 21);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert!(iter.get(21).is_none());
        drop(iter);

        let reader = vec.create_static_reader();
        assert_eq!(vec.take(10, &reader)?, Some(10));
        assert_eq!(vec.holes(), &BTreeSet::from([10]));
        assert!(vec.get_or_read(10, &reader)?.is_none());
        drop(reader);

        vec.flush()?;

        assert_eq!(vec.holes(), &BTreeSet::from([10]));
    }

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        assert_eq!(vec.holes(), &BTreeSet::from([10]));

        let reader = vec.create_static_reader();
        assert!(vec.get_or_read(10, &reader)?.is_none());
        drop(reader);

        vec.update(10, 10)?;
        vec.update(0, 10)?;

        let reader = vec.create_static_reader();
        assert_eq!(vec.holes(), &BTreeSet::new());
        assert_eq!(vec.get_or_read(0, &reader)?, Some(Cow::Borrowed(&10)));
        assert_eq!(vec.get_or_read(10, &reader)?, Some(Cow::Borrowed(&10)));
        drop(reader);

        vec.flush()?;
    }

    options = options.with_saved_stamped_changes(10);

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        assert_eq!(
            vec.collect()?,
            vec![
                10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
            ]
        );

        vec.truncate_if_needed(10)?;

        let reader = vec.create_static_reader();
        vec.take(5, &reader)?;
        vec.update(3, 5)?;
        vec.push(21);
        drop(reader);

        assert_eq!(
            vec.collect_holed()?,
            vec![
                Some(10),
                Some(1),
                Some(2),
                Some(5),
                Some(4),
                None,
                Some(6),
                Some(7),
                Some(8),
                Some(9),
                Some(21)
            ]
        );

        vec.stamped_flush(Stamp::new(1))?;
    }

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        assert_eq!(vec.collect()?, vec![10, 1, 2, 5, 4]);

        let reader = vec.create_static_reader();
        vec.take(0, &reader)?;
        vec.update(1, 5)?;
        vec.push(5);
        vec.push(6);
        vec.push(7);
        drop(reader);

        assert_eq!(
            vec.collect_holed()?,
            vec![
                None,
                Some(5),
                Some(2),
                Some(5),
                Some(4),
                None,
                Some(6),
                Some(7),
                Some(8),
                Some(9),
                Some(21),
                Some(5),
                Some(6),
                Some(7)
            ]
        );

        vec.stamped_flush(Stamp::new(2))?;
    }

    {
        let mut vec: VEC = RawVec::forced_import_with(options)?;

        assert_eq!(
            vec.collect_holed()?,
            vec![
                None,
                Some(5),
                Some(2),
                Some(5),
                Some(4),
                None,
                Some(6),
                Some(7),
                Some(8),
                Some(9),
                Some(21),
                Some(5),
                Some(6),
                Some(7)
            ]
        );

        vec.rollback_stamp(Stamp::new(2))?;

        assert_eq!(
            vec.collect_holed()?,
            vec![
                Some(10),
                Some(1),
                Some(2),
                Some(5),
                Some(4),
                None,
                Some(6),
                Some(7),
                Some(8),
                Some(9),
                Some(21)
            ]
        );

        vec.rollback_stamp(Stamp::new(1))?;

        assert_eq!(
            vec.collect()?,
            vec![
                10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
            ]
        );

        // vec.stamped_flush(Stamp::new(1))?;
    }

    Ok(())
}
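
The stamped flush and rollback steps at the end of the listing above can be modelled with a simple undo log. This sketch keeps a full snapshot per stamp, which is an assumption made for brevity; a real engine would more plausibly persist only the changes, as the with_saved_stamped_changes option suggests. `StampedVec` and its fields are hypothetical, and plain `u64` stamps stand in for the crate's `Stamp` type.

```rust
/// Hypothetical vector with stamped flushes and rollback (not a vecdb type).
struct StampedVec {
    /// State as of the last flush.
    committed: Vec<u32>,
    /// Current state, including unflushed changes.
    working: Vec<u32>,
    /// Undo log: (stamp of a flush, committed state just before that flush).
    undo: Vec<(u64, Vec<u32>)>,
}

impl StampedVec {
    fn new(initial: Vec<u32>) -> Self {
        Self { committed: initial.clone(), working: initial, undo: Vec::new() }
    }

    fn push(&mut self, value: u32) {
        self.working.push(value);
    }

    /// Commits the working state under `stamp`, remembering the state
    /// it replaces so the flush can be undone later.
    fn stamped_flush(&mut self, stamp: u64) {
        let previous = std::mem::replace(&mut self.committed, self.working.clone());
        self.undo.push((stamp, previous));
    }

    /// Undoes the most recent flush if it carries `stamp`.
    /// Returns `false` (and changes nothing) otherwise.
    fn rollback_stamp(&mut self, stamp: u64) -> bool {
        let is_latest = self.undo.last().map_or(false, |entry| entry.0 == stamp);
        if !is_latest {
            return false;
        }
        let (_, previous) = self.undo.pop().expect("undo log is non-empty");
        self.working = previous.clone();
        self.committed = previous;
        true
    }
}

fn main() {
    let mut vec = StampedVec::new(vec![1, 2, 3]);

    vec.push(4);
    vec.stamped_flush(1);
    vec.push(5);
    vec.stamped_flush(2);

    // Rolling back stamp 2 restores the state flushed at stamp 1.
    assert!(vec.rollback_stamp(2));
    assert_eq!(vec.working, vec![1, 2, 3, 4]);

    // Rolling back stamp 1 restores the original state.
    assert!(vec.rollback_stamp(1));
    assert_eq!(vec.working, vec![1, 2, 3]);
}
```

As in the listing, rollbacks in this sketch must be applied newest first: stamp 2 before stamp 1.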

§Compressed

use std::{borrow::Cow, collections::BTreeSet, fs, path::Path};

use vecdb::{
    AnyStoredVec, AnyVec, CollectableVec, CompressedVec, Database, GenericStoredVec, Stamp,
    VecIterator, Version,
};

#[allow(clippy::upper_case_acronyms)]
type VEC = CompressedVec<usize, u32>;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let _ = fs::remove_dir_all("compressed");

    let version = Version::TWO;

    let database = Database::open(Path::new("compressed"))?;

    let options = (&database, "vec", version).into();

    {
        let mut vec: VEC = CompressedVec::forced_import_with(options)?;

        (0..21_u32).for_each(|v| {
            vec.push(v);
        });

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(1), Some(Cow::Borrowed(&1)));
        assert_eq!(iter.get(2), Some(Cow::Borrowed(&2)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), None);
        drop(iter);

        vec.flush()?;

        assert_eq!(vec.header().stamp(), Stamp::new(0));
    }

    {
        let mut vec: VEC = CompressedVec::forced_import_with(options)?;

        vec.mut_header().update_stamp(Stamp::new(100));

        assert_eq!(vec.header().stamp(), Stamp::new(100));

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(1), Some(Cow::Borrowed(&1)));
        assert_eq!(iter.get(2), Some(Cow::Borrowed(&2)));
        assert_eq!(iter.get(3), Some(Cow::Borrowed(&3)));
        assert_eq!(iter.get(4), Some(Cow::Borrowed(&4)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        drop(iter);

        vec.push(21);
        vec.push(22);

        assert_eq!(vec.stored_len(), 21);
        assert_eq!(vec.pushed_len(), 2);
        assert_eq!(vec.len(), 23);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), Some(Cow::Borrowed(&21)));
        assert_eq!(iter.get(22), Some(Cow::Borrowed(&22)));
        assert_eq!(iter.get(23), None);
        drop(iter);

        vec.flush()?;
    }

    {
        let mut vec: VEC = CompressedVec::forced_import_with(options)?;

        assert_eq!(vec.header().stamp(), Stamp::new(100));

        assert_eq!(vec.stored_len(), 23);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 23);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), Some(Cow::Borrowed(&21)));
        assert_eq!(iter.get(22), Some(Cow::Borrowed(&22)));
        drop(iter);

        vec.truncate_if_needed(14)?;

        assert_eq!(vec.stored_len(), 14);
        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.len(), 14);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), None);
        drop(iter);

        assert_eq!(
            vec.collect_signed_range(Some(-5), None)?,
            vec![9, 10, 11, 12, 13]
        );

        vec.push(vec.len() as u32);
        assert_eq!(
            VecIterator::last(vec.into_iter()),
            Some((14, Cow::Borrowed(&14)))
        );

        vec.flush()?;

        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );
    }

    {
        let mut vec: VEC = CompressedVec::forced_import_with(options)?;

        assert_eq!(
            vec.into_iter()
                .map(|(_, v)| v.into_owned())
                .collect::<Vec<_>>(),
            vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
        );

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(5), Some(Cow::Borrowed(&5)));
        assert_eq!(iter.get(20), None);
        drop(iter);

        assert_eq!(
            vec.collect_signed_range(Some(-5), None)?,
            vec![10, 11, 12, 13, 14]
        );

        vec.reset()?;

        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 0);

        (0..21_u32).for_each(|v| {
            vec.push(v);
        });

        assert_eq!(vec.pushed_len(), 21);
        assert_eq!(vec.stored_len(), 0);
        assert_eq!(vec.len(), 21);

        let mut iter = vec.into_iter();
        assert_eq!(iter.get(0), Some(Cow::Borrowed(&0)));
        assert_eq!(iter.get(20), Some(Cow::Borrowed(&20)));
        assert_eq!(iter.get(21), None);
        drop(iter);

        vec.flush()?;
    }

    {
        let mut vec: VEC = CompressedVec::forced_import_with(options)?;

        assert_eq!(vec.pushed_len(), 0);
        assert_eq!(vec.stored_len(), 21);
        assert_eq!(vec.len(), 21);

        let reader = vec.create_static_reader();
        assert_eq!(vec.holes(), &BTreeSet::new());
        assert_eq!(vec.get_or_read(0, &reader)?, Some(Cow::Borrowed(&0)));
        assert_eq!(vec.get_or_read(10, &reader)?, Some(Cow::Borrowed(&10)));
        drop(reader);

        vec.flush()?;
    }

    {
        let vec: VEC = CompressedVec::forced_import_with(options)?;

        assert_eq!(
            vec.collect()?,
            vec![
                0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
            ]
        );
    }

    Ok(())
}

Structs§

CompressedVec
Database
EagerVec
Exit
LazyVecFrom1
LazyVecFrom2
LazyVecFrom3
RawVec
Reader
Stamp
Version

Enums§

Computation
ComputedVec
Error
Format
SeqDBError
StoredVec

Constants§

PAGE_SIZE

Traits§

AnyCloneableIterableVec
AnyCollectableVec
AnyIterableVec
AnyStoredIterableVec
AnyStoredVec
AnyVec
AsInnerSlice
BaseVecIterator
CheckedSub
CollectableVec
FromCoarserIndex
FromInnerSlice
GenericStoredVec
Printable
StoredCompressed
StoredIndex
StoredRaw
TransparentStoredCompressed
VecIterator

Functions§

i64_to_usize

Type Aliases§

AnyBoxedIterableVec
BoxedVecIterator
ComputedVecFrom1
ComputedVecFrom2
ComputedVecFrom3
Result

Derive Macros§

StoredCompressed