jdb_trait 0.1.2

Async database abstraction layer for storage engines



jdb_trait: Database Abstraction Layer for Async Storage Engines


Overview

jdb_trait defines async trait interfaces for building database storage engines. It provides abstractions for tables, sub-tables (partitions), schemas, queries, and row data with support for key-value separation.

Features

  • Async-first design with Future-based APIs
  • Sub-table partitioning for horizontal scaling
  • Schema versioning with TTL and depth control
  • Flexible query expressions with AND/OR/NOT logic
  • Key-value separation via AsyncRow trait
  • Zero-copy string/binary types using HipStr/HipByt
  • Type-safe value representation with Val enum

Installation

[dependencies]
jdb_trait = "0.1"

Core Concepts

Engine → Table → SubTable

Engine
  └── Table (with Schema)
        └── SubTable (partition by SubTableKey)
              └── Row (Vec<Val>)
  • Engine: Entry point for opening/creating tables
  • Table: Manages schema and routes operations to sub-tables
  • SubTable: Partition holding actual row data
  • Row: Synchronous row data (Vec<Val>)
  • AsyncRow: Async row accessor for key-value separation

Query Flow

graph TD
  A[Query] --> B{sub_table_filter}
  B -->|match| C[SubTable]
  C --> D{val_filter}
  D -->|match| E[AsyncRow]
  E --> F[Row data]
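
The two filter stages in the diagram can be sketched with plain closures standing in for Expr. This is a synchronous toy under stated assumptions, not the crate's implementation: `Part`, `select`, and the closure parameters are hypothetical names, and rows are reduced to `(id, value)` pairs.

```rust
/// Toy partition: a routing key plus (id, value) rows.
/// Hypothetical stand-in for a SubTable.
pub struct Part {
  pub key: u32,
  pub rows: Vec<(u64, i64)>,
}

/// Two-stage select: prune partitions by key, then filter rows by value.
/// Returns the ids of the matching rows in partition order.
pub fn select(
  parts: &[Part],
  sub_table_filter: impl Fn(u32) -> bool,
  val_filter: impl Fn(i64) -> bool,
) -> Vec<u64> {
  parts
    .iter()
    .filter(|p| sub_table_filter(p.key)) // stage 1: skip whole partitions
    .flat_map(|p| p.rows.iter())
    .filter(|(_, v)| val_filter(*v)) // stage 2: per-row predicate
    .map(|(id, _)| *id)
    .collect()
}
```

Running the partition predicate first means rows in non-matching partitions are never touched, which is presumably the reason Query keeps the two filters separate.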

API Reference

Types

| Type | Description |
| --- | --- |
| Id | Record identifier (u64) |
| Col | Column name (HipByt<'static>) |
| ColIdx | Column index (u16) |
| Row | Synchronous row data (Vec<Val>) |
| SubTableKey | Partition routing key (Row) |

Val

Atomic database value supporting multiple types:

pub enum Val {
  Bool(bool),
  I8(i8), I16(i16), I32(i32), I64(i64), I128(i128),
  U8(u8), U16(u16), U32(u32), U64(u64), U128(u128),
  F32(OrderedFloat<f32>), F64(OrderedFloat<f64>),
  Str(HipStr<'static>),
  Bin(HipByt<'static>),
}
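
A minimal sketch of building a Row, using a cut-down local copy of Val so the example compiles with the standard library only. `String` and `Vec<u8>` stand in for HipStr and HipByt, the float variants are left out, and `user_row` is a hypothetical helper. The From impls below are assumptions about the shape of the ones the crate provides in val.rs.

```rust
// Simplified stand-in for jdb_trait::Val (assumption: std types only).
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub enum Val {
  Bool(bool),
  I64(i64),
  U64(u64),
  Str(String),
  Bin(Vec<u8>),
}

impl From<bool> for Val {
  fn from(v: bool) -> Self {
    Val::Bool(v)
  }
}

impl From<i64> for Val {
  fn from(v: i64) -> Self {
    Val::I64(v)
  }
}

impl From<u64> for Val {
  fn from(v: u64) -> Self {
    Val::U64(v)
  }
}

impl From<&str> for Val {
  fn from(v: &str) -> Self {
    Val::Str(v.to_owned())
  }
}

/// Row is a plain vector of values, as in the type table.
pub type Row = Vec<Val>;

/// Builds one row with three columns: user id, name, active flag.
pub fn user_row(id: u64, name: &str, active: bool) -> Row {
  vec![id.into(), name.into(), active.into()]
}
```

Plain `f32` and `f64` implement neither `Eq` nor `Hash`, so wrapping them in `OrderedFloat` is presumably what lets the real Val be stored in a `HashSet` and compared in range operators.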

Schema

Table schema with versioning:

pub struct Schema {
  pub name: HipByt<'static>,
  pub ver: SchemaVer,
  pub col_li: Vec<Field>,
  pub sub_table_key_li: Vec<Field>,
  pub index_li: Vec<Index>,
  pub max_depth: Option<usize>,
  pub ttl: Option<Duration>,
}

Query & Expr

Query builder with filter expressions:

pub struct Query {
  pub sub_table_filter: Option<Expr>,
  pub val_filter: Option<Expr>,
  pub limit: Option<usize>,
  pub offset: Option<usize>,
  pub order: Order,
}
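
How `order`, `offset`, and `limit` might combine can be sketched as follows. The source does not state the variants of Order or whether offset is applied before limit, so `Order::Asc`/`Order::Desc`, the `Page` struct, and the `page` function are all assumptions made for illustration.

```rust
/// Hypothetical ordering flag; the real Order variants are not shown here.
#[derive(Clone, Copy)]
pub enum Order {
  Asc,
  Desc,
}

/// The paging-related subset of Query, as a local stand-in.
pub struct Page {
  pub limit: Option<usize>,
  pub offset: Option<usize>,
  pub order: Order,
}

/// Sorts ids, skips `offset` of them, then keeps at most `limit`.
pub fn page(mut ids: Vec<u64>, q: &Page) -> Vec<u64> {
  match q.order {
    Order::Asc => ids.sort_unstable(),
    Order::Desc => ids.sort_unstable_by(|a, b| b.cmp(a)),
  }
  ids
    .into_iter()
    .skip(q.offset.unwrap_or(0)) // None means start at the first match
    .take(q.limit.unwrap_or(usize::MAX)) // None means no upper bound
    .collect()
}
```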

Expression operators:

| Op | Description |
| --- | --- |
| Eq(Val) | Equality |
| In(HashSet<Val>) | Set membership |
| Range(Val, Val) | Half-open interval [start, end) |
| RangeInclusive(Val, Val) | Closed interval [start, end] |
| RangeFrom(Val) | [start, +∞) |
| RangeTo(Val) | (-∞, end) |
| RangeToInclusive(Val) | (-∞, end] |
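
The interval semantics listed above can be made concrete with a small evaluator. This is a sketch that specialises the operators to `i64` instead of Val; the `matches` method is a hypothetical name and is not claimed to exist in the crate.

```rust
use std::collections::HashSet;

/// Local stand-in for Op, specialised to i64 for brevity.
pub enum Op {
  Eq(i64),
  In(HashSet<i64>),
  Range(i64, i64),
  RangeInclusive(i64, i64),
  RangeFrom(i64),
  RangeTo(i64),
  RangeToInclusive(i64),
}

impl Op {
  /// Returns true when `v` satisfies the operator.
  pub fn matches(&self, v: i64) -> bool {
    match self {
      Op::Eq(x) => v == *x,
      Op::In(set) => set.contains(&v),
      Op::Range(s, e) => (*s..*e).contains(&v), // [start, end)
      Op::RangeInclusive(s, e) => (*s..=*e).contains(&v), // [start, end]
      Op::RangeFrom(s) => v >= *s, // [start, +inf)
      Op::RangeTo(e) => v < *e, // (-inf, end)
      Op::RangeToInclusive(e) => v <= *e, // (-inf, end]
    }
  }
}
```

The variant names mirror the standard library's range types (`Range`, `RangeInclusive`, `RangeFrom`, `RangeTo`, `RangeToInclusive`), so the boundary behaviour follows the same conventions.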

Traits

Engine

pub trait Engine: Sized + Send + Sync {
  type Error: Debug + Send + Sync;
  type Gen: IdGen;
  type Table: Table;

  fn id_gen(&self) -> &Self::Gen;
  fn open<F, Fut>(&self, name: &[u8], create: F)
    -> impl Future<Output = Result<Self::Table, Self::Error>> + Send;
}

Table

pub trait Table: Sized + Send + Sync {
  type Error: Debug + Send + Sync;
  type SubTable: SubTable;
  type AsyncRow: AsyncRow;
  type Stream: Stream<Item = Result<AsyncItem<Self::AsyncRow>, Self::Error>> + Send;

  fn schema(&self) -> impl Future<Output = Schema> + Send;
  fn put(&self, key: &SubTableKey, data: &[Row])
    -> impl Future<Output = Result<Vec<Id>, Self::Error>> + Send;
  fn get(&self, key: &SubTableKey, id: Id)
    -> impl Future<Output = Result<Option<AsyncItem<Self::AsyncRow>>, Self::Error>> + Send;
  fn select(&self, q: &Query) -> impl Future<Output = Self::Stream> + Send;
  fn scan(&self, begin_id: u64, order: Order) -> impl Future<Output = Self::Stream> + Send;
  fn rm(&self, q: &Query) -> impl Future<Output = Result<u64, Self::Error>> + Send;
  // ...
}

SubTable

pub trait SubTable: Send + Sync {
  type Error: Debug + Send + Sync;
  type AsyncRow: AsyncRow;
  type Stream: Stream<Item = Result<(Id, Self::AsyncRow), Self::Error>> + Send;

  fn put(&self, data: &[Row])
    -> impl Future<Output = Result<Vec<Id>, Self::Error>> + Send;
  fn get(&self, id: Id)
    -> impl Future<Output = Result<Option<(Id, Self::AsyncRow)>, Self::Error>> + Send;
  fn select(&self, q: &Query) -> impl Future<Output = Self::Stream> + Send;
  fn key(&self) -> &SubTableKey;
  // ...
}

AsyncRow

pub trait AsyncRow: Send + Sync + Debug {
  type Error: Debug + Send + Sync;
  fn row(&self) -> impl Future<Output = Result<Row, Self::Error>> + Send;
}
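
To illustrate key-value separation, here is a sketch of an AsyncRow implementation that holds only a position into a shared value log and loads the row on demand. The trait is redeclared locally with the signature shown above; `LogRow`, its fields, the `Vec<i64>` row type, and the tiny `block_on` executor are all hypothetical and exist only to keep the example std-only. It assumes Rust 1.75 or later.

```rust
use std::fmt::Debug;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

pub type Row = Vec<i64>; // stand-in for Vec<Val>

// Local copy of the trait signature from this README.
pub trait AsyncRow: Send + Sync + Debug {
  type Error: Debug + Send + Sync;
  fn row(&self) -> impl Future<Output = Result<Row, Self::Error>> + Send;
}

/// Handle that stores where the values live, not the values themselves.
#[derive(Debug)]
pub struct LogRow {
  pub log: Arc<Vec<Row>>, // toy in-memory value log shared by all handles
  pub pos: usize,         // index of this row's values in the log
}

impl AsyncRow for LogRow {
  type Error = String;

  fn row(&self) -> impl Future<Output = Result<Row, Self::Error>> + Send {
    let (log, pos) = (Arc::clone(&self.log), self.pos);
    // The lookup runs only when the future is polled.
    async move {
      log
        .get(pos)
        .cloned()
        .ok_or_else(|| format!("no value at position {pos}"))
    }
  }
}

struct NoopWake;

impl Wake for NoopWake {
  fn wake(self: Arc<Self>) {}
}

/// Minimal executor for futures that never wait on external events.
pub fn block_on<F: Future>(fut: F) -> F::Output {
  let waker = Waker::from(Arc::new(NoopWake));
  let mut cx = Context::from_waker(&waker);
  let mut fut = pin!(fut);
  loop {
    if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
      return out;
    }
  }
}
```

A real engine would read from disk inside the future and run on a proper async runtime; the busy-polling `block_on` here is only suitable for futures that complete immediately.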

Architecture

graph TD
  subgraph Traits
    Engine --> Table
    Table --> SubTable
    Table --> Schema
    SubTable --> AsyncRow
    AsyncRow --> Row
  end

  subgraph Data
    Row --> Val
    Query --> Expr
    Expr --> Op
  end

  subgraph Types
    Id
    Col
    ColIdx
    SubTableKey
  end

Call Flow

  1. Engine::open() creates or opens Table
  2. Table routes by SubTableKey to SubTable
  3. SubTable executes CRUD operations
  4. Query results return AsyncRow for lazy loading
  5. AsyncRow::row() fetches actual Row data
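
The routing in steps 1 to 3 can be sketched with an in-memory model. This is a synchronous simplification under stated assumptions: the structs below are hypothetical and do not implement the crate's async traits, ids come from a simple counter rather than an IdGen, and rows are `Vec<i64>` instead of `Vec<Val>`.

```rust
use std::collections::HashMap;

pub type Id = u64;
pub type Row = Vec<i64>; // stand-in for Vec<Val>
pub type SubTableKey = Vec<i64>; // the type table defines SubTableKey as a Row

/// Partition holding the actual rows.
#[derive(Default)]
pub struct SubTable {
  rows: HashMap<Id, Row>,
}

/// Table that routes each operation to a partition by key.
#[derive(Default)]
pub struct Table {
  next_id: Id,
  parts: HashMap<SubTableKey, SubTable>,
}

impl Table {
  /// Stores a batch in the partition for `key`, creating it on first use.
  pub fn put(&mut self, key: &SubTableKey, data: &[Row]) -> Vec<Id> {
    let part = self.parts.entry(key.clone()).or_default();
    let mut ids = Vec::with_capacity(data.len());
    for row in data {
      self.next_id += 1;
      part.rows.insert(self.next_id, row.clone());
      ids.push(self.next_id);
    }
    ids
  }

  /// Looks up one row; returns None for an unknown partition or id.
  pub fn get(&self, key: &SubTableKey, id: Id) -> Option<&Row> {
    self.parts.get(key)?.rows.get(&id)
  }
}

/// Entry point that opens a table by name, creating it if missing.
#[derive(Default)]
pub struct Engine {
  tables: HashMap<Vec<u8>, Table>,
}

impl Engine {
  pub fn open(&mut self, name: &[u8]) -> &mut Table {
    self.tables.entry(name.to_vec()).or_default()
  }
}
```

For example, `engine.open(b"user").put(&vec![2024], &[vec![1, 2]])` stores one row in the partition keyed by `[2024]` and returns its id.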

Tech Stack

| Dependency | Purpose |
| --- | --- |
| futures-core | Stream trait for async iteration |
| hipstr | Zero-copy string/binary types |
| ordered-float | Orderable float wrapper |
| gxhash | Fast hash for HashSet<Val> |

Directory Structure

jdb_trait/
├── src/
│   ├── lib.rs        # Public exports, Engine, IdGen, AsyncItem
│   ├── val.rs        # Val enum with From impls
│   ├── row.rs        # Row type alias, AsyncRow trait
│   ├── expr.rs       # Expr, Op, Order
│   ├── query.rs      # Query struct
│   ├── schema.rs     # Schema, Field, Index
│   ├── sub_table.rs  # SubTable trait
│   └── table.rs      # Table trait
├── readme/
│   ├── en.md
│   └── zh.md
└── Cargo.toml

History

The concept of database abstraction layers traces back to the 1970s when E.F. Codd proposed the relational model. The separation of logical and physical data representation became foundational to modern databases.

Key-value separation, central to AsyncRow, emerged from LSM-tree optimizations. WiscKey (2016) demonstrated that separating keys from values in SSTable-based storage significantly reduces write amplification and improves space efficiency for large values.

The async trait pattern in Rust has evolved significantly. Before Rust 1.75 (December 2023), async methods in traits required workarounds such as the async-trait crate. Native support for return-position impl Trait in trait methods enables cleaner APIs like those in jdb_trait.

Sub-table partitioning reflects distributed database designs such as Google's Bigtable (2006) and Apache HBase, where row key ranges route data to specific tablets/regions for horizontal scaling.


About

This project is an open-source component of js0.site ⋅ Refactoring the Internet Plan.

We are redefining the development paradigm of the Internet in a componentized way.

