vantage-dataset 0.4.0

Dataset traits for the Vantage data framework
Documentation

Vantage Dataset

A composable data abstraction framework for Rust that enables building type-safe CRUD operations across different data sources. Provides foundational traits for implementing dataset operations that work seamlessly with CSV files, message queues, databases, key-value stores, and in-memory collections.

Example Use

use vantage_dataset::{ImDataSource, ImTable, InsertableDataSet, ReadableDataSet};
use serde::{Deserialize, Serialize};

// Define your entity
#[derive(Serialize, Deserialize, Clone)]
struct User {
    name: String,
    email: String,
    age: u32,
}

// Work with any data source through common interface
let im_source = ImDataSource::new();
let users: ImTable<User> = ImTable::new(&im_source, "users");

// Insert and retrieve with type safety
let id = users.insert_return_id(User {
    name: "Alice".to_string(),
    email: "alice@example.com".to_string(),
    age: 30,
}).await?;

let all_users = users.list().await?;
let alice = users.get(&id).await?;

The trait system allows the same code patterns to work across vastly different storage backends - from local CSV files to distributed databases.

Features

  • Two-layer trait system: [ValueSet] for raw values, [DataSet] for typed entities
  • Capability-based traits: [ReadableDataSet], [InsertableDataSet], [WritableDataSet] represent what operations a data source supports
  • Type-safe operations: Work with native Rust structs while maintaining storage abstraction
  • Record change tracking: RecordDataSet provides edit sessions with automatic persistence
  • Value-level access: [ValueSet] traits for working with raw JSON-like values
  • In-memory implementation: [ImTable] for local caching and testing scenarios
  • Cross-dataset imports: Move data between different storage types seamlessly
  • Enterprise-ready: Built for large codebases requiring storage migration and refactoring

Vantage Dataset solves the problem of data source abstraction by providing a unified interface that adapts to each storage backend's actual capabilities. Unlike traditional ORMs that force a lowest-common-denominator approach, Dataset traits let you express exactly what operations your data source supports.

Trait Architecture

The framework is built on a two-layer trait hierarchy:

ValueSet Foundation

The [ValueSet] layer works with raw storage values and defines the foundational types:

// Foundation - defines ID and Value types
trait ValueSet {
    type Id: Send + Sync + Clone;        // String, UUID, Thing, etc.
    type Value: Send + Sync + Clone;     // JSON Value, CBOR, etc.
}

// Value-level operations - work with raw storage data
trait ReadableValueSet: ValueSet {
    async fn list_values(&self) -> Result<IndexMap<Self::Id, Self::Value>>;
    async fn get_value(&self, id: &Self::Id) -> Result<Self::Value>;
    async fn get_some_value(&self) -> Result<Option<(Self::Id, Self::Value)>>;
}

trait WritableValueSet: ValueSet {
    async fn insert_value(&self, id: &Self::Id, record: Self::Value) -> Result<()>;
    async fn replace_value(&self, id: &Self::Id, record: Self::Value) -> Result<()>;
    async fn patch_value(&self, id: &Self::Id, partial: Self::Value) -> Result<()>;
    async fn delete(&self, id: &Self::Id) -> Result<()>;
    async fn delete_all(&self) -> Result<()>;
}

trait RecordValueSet: ReadableValueSet + WritableValueSet {
    async fn get_value_record(&self, id: &Self::Id) -> Result<RecordValue<'_, Self>>;
    async fn list_value_records(&self) -> Result<Vec<RecordValue<'_, Self>>>;
}

DataSet Entity Layer

The [DataSet] layer adds entity-awareness on top of ValueSet:

// Entity-aware operations built on ValueSet
trait DataSet<E: Entity>: ValueSet {}

// Capability traits - implement what your data source supports
trait ReadableDataSet<E>: DataSet<E> {
    async fn list(&self) -> Result<IndexMap<Self::Id, E>>;
    async fn get(&self, id: &Self::Id) -> Result<E>;

    async fn get_some(&self) -> Result<Option<(Self::Id, E)>>;
}

trait InsertableDataSet<E>: DataSet<E> {
    async fn insert_return_id(&self, record: E) -> Result<Self::Id>;
}

trait WritableDataSet<E>: DataSet<E> {
    async fn insert(&self, id: &Self::Id, record: E) -> Result<()>;
    async fn replace(&self, id: &Self::Id, record: E) -> Result<()>;
    async fn patch(&self, id: &Self::Id, partial: E) -> Result<()>;
}

trait RecordDataSet<E>: ReadableDataSet<E> + WritableDataSet<E> {
    async fn get_record(&self, id: &Self::Id) -> Result<Option<Record<'_, Self, E>>>;
    async fn list_records(&self) -> Result<Vec<Record<'_, Self, E>>>;
}

Data Source Capabilities

Different storage backends implement different combinations of traits based on their natural capabilities:

  • CSV Files: [ReadableValueSet] + [ReadableDataSet] - immutable data source
  • Message Queues: [InsertableDataSet] only - append-only event streams
  • Database Tables: Full [ValueSet] + [DataSet] hierarchy - complete CRUD operations
  • In-Memory Collections: All traits - complete local storage with change tracking
  • Key-Value Stores: [ValueSet] traits with atomic operations

This design prevents runtime errors by making data source limitations visible at compile time.

Value-Level Operations

Work directly with storage values without entity deserialization for performance-critical scenarios:

// Work directly with raw storage values
let raw_values = users.list_values().await?;
let user_value = users.get_value(&user_id).await?;

// Patch specific fields without full entity round-trip
let partial_update = json!({ "last_login": "2024-01-15T10:30:00Z" });
users.patch_value(&user_id, partial_update).await?;

// Delete operations at value level
users.delete(&user_id).await?;
users.delete_all().await?;

Value-level operations are essential for:

  • Cross-database migrations where type conversions may be complex
  • Performance-critical scenarios avoiding entity serialization overhead
  • Generic operations that work with any storage format

Record Pattern

The RecordDataSet and RecordValueSet traits provide change tracking for interactive editing:

// Get entity wrapped in Record for change tracking
let mut user_record = users.get_record(&user_id).await?.unwrap();

// Modify through mutable deref
user_record.email = "newemail@example.com".to_string();
user_record.age = 31;

// Persist changes automatically
user_record.save().await?;

// Or work with raw values
let mut value_record = users.get_value_record(&user_id).await?;
value_record["status"] = json!("active");
value_record.save().await?;

Records track modifications and only persist changed fields, enabling:

  • Efficient updates with minimal database operations
  • Conflict detection in concurrent environments
  • Undo/redo functionality for interactive applications
  • Optimistic locking patterns

In-Memory Implementation

The [ImTable] provides complete local storage implementation:

use vantage_dataset::{ImDataSource, ImTable};

let data_source = ImDataSource::new();
let users: ImTable<User> = ImTable::new(&data_source, "users");

// Full CRUD operations in memory
let id = users.insert_return_id(user).await?;
let all_users = users.list().await?;
let mut user_record = users.get_record(&id).await?.unwrap();
user_record.name = "Updated Name".to_string();
user_record.save().await?;

In-memory implementation is perfect for:

  • Testing scenarios requiring real async trait behavior
  • Local caching layers in distributed architectures
  • Rapid prototyping without external dependencies
  • Development environments

Cross-Dataset Operations

Import data between different storage types while preserving type safety:

// Import from CSV to database
let csv_users: CsvFile<User> = CsvFile::new(csv_source, "users.csv");
let db_users: DatabaseTable<User> = db.table("users");

// Type-safe import with automatic conversion
db_users.import(csv_users).await?;

The import system handles:

  • ID generation and mapping between different systems
  • Type conversion when storage formats differ
  • Batch operations for performance
  • Error recovery and partial import scenarios

Integration with Vantage Framework

Dataset traits form the foundation for higher-level Vantage components:

  • vantage-table: Builds structured tables with schema validation on top of Dataset traits
  • vantage-live: Provides caching and synchronization layers using Dataset as persistence
  • vantage-expressions: Query building that targets datasets as data sources
  • UI adapters: Generic table components that work with any Dataset implementation

This layered approach enables building complex data applications while maintaining clean separation of concerns and maximum reusability across different storage backends.

Error Handling

All traits use vantage_core::Result<T> and VantageError for consistent error handling across the Vantage ecosystem:

use vantage_core::{Result, VantageError};

// All trait methods return Result<T>
async fn get_user(&self, id: &str) -> Result<User> {
    self.get(id).await
        .context("Failed to retrieve user")
}

This ensures errors can be properly propagated and handled in applications using multiple Vantage components.