rhei 1.5.0

Lightweight serverless HTAP engine — Rusqlite (OLTP) + DuckDB/DataFusion (OLAP) with CDC replication

Rhei — lightweight serverless HTAP engine for Rust.

Rhei pairs Rusqlite (SQLite) for low-latency OLTP writes with a pluggable OLAP backend (Apache DataFusion or DuckDB) for analytical queries, bridged by trigger-based Change Data Capture (CDC) replication and automatic SQL query routing. No separate server process is required — the engine runs fully in-process.

Architecture

Standard HTAP mode:

Client
  |
  v
HtapEngine  (facade — this crate)
  |-- SqlParserRouter  ──► AST-based routing to OLTP or OLAP
  |-- RusqliteEngine   ──► WAL-mode SQLite, write conn + read pool
  |     └── RusqliteCdcProducer  ──► trigger-based _rhei_cdc_log
  |-- OlapBackend      ──► DuckDB or DataFusion (feature-gated)
  └── CdcSyncEngine    ──► polls CDC, applies DML to OLAP
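
To make the routing step in the diagram above concrete, here is a minimal sketch of how a statement might be classified as OLTP or OLAP. It is not the crate's SqlParserRouter: that component works on a parsed AST, whereas this toy version only scans for keywords. The `Target` enum, the `route` function, and the keyword list are all hypothetical names chosen for illustration.

```rust
/// Where a statement should run (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    Oltp,
    Olap,
}

/// Toy keyword-based classifier. Assumption: writes and point reads go to
/// the row store, reads that aggregate or group go to the analytical engine.
pub fn route(sql: &str) -> Target {
    let upper = sql.trim_start().to_ascii_uppercase();

    // Anything that is not a SELECT (INSERT, UPDATE, DELETE, DDL) is a write path.
    if !upper.starts_with("SELECT") {
        return Target::Oltp;
    }

    // Reads containing aggregates, grouping, or window clauses count as analytical.
    const ANALYTICAL: [&str; 7] = [
        "COUNT(", "SUM(", "AVG(", "MIN(", "MAX(", "GROUP BY", " OVER ",
    ];
    if ANALYTICAL.iter().any(|kw| upper.contains(kw)) {
        Target::Olap
    } else {
        Target::Oltp
    }
}
```

A keyword scan like this misroutes a query whose string literal happens to contain `COUNT(`, which is one reason to route on a parsed AST instead.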

Sidecar mode:

External DB (SQLite / PostgreSQL)
  │  polls by updated_at > watermark
  ▼
TimestampCdcConsumer<S: SourceConnector>
  │  CdcEvent stream
  ▼
CdcSyncEngine  (temporal or destructive)
  │  DML -> OLAP
  ▼
OlapBackend (DuckDB / DataFusion)

Operating modes

Standard HTAP mode

The default mode. A local SQLite database handles writes; CDC triggers capture every INSERT / UPDATE / DELETE into _rhei_cdc_log. A background sync loop (or a manual call to [HtapEngine::sync_now]) applies those events to the OLAP engine, so analytical queries see data as of the most recent sync.
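
The capture-then-apply cycle described above can be sketched with plain in-memory structures. This is a simplified model under stated assumptions, not the crate's implementation: `CdcLog` stands in for the trigger-populated _rhei_cdc_log table, `OlapMirror` stands in for the OLAP backend, and the sequence-number watermark is an assumed mechanism for tracking which events have already been applied.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Insert,
    Update,
    Delete,
}

/// One captured change (hypothetical layout; the real log schema may differ).
#[derive(Debug, Clone)]
pub struct CdcEvent {
    pub seq: u64,
    pub op: Op,
    pub id: i64,
    pub label: Option<String>,
}

/// Stand-in for the change log that the triggers append to.
#[derive(Default)]
pub struct CdcLog {
    pub events: Vec<CdcEvent>,
    next_seq: u64,
}

impl CdcLog {
    /// Append a change with a monotonically increasing sequence number.
    pub fn record(&mut self, op: Op, id: i64, label: Option<&str>) {
        self.next_seq += 1;
        self.events.push(CdcEvent {
            seq: self.next_seq,
            op,
            id,
            label: label.map(str::to_string),
        });
    }
}

/// Stand-in for the OLAP copy, keyed by primary key.
#[derive(Default)]
pub struct OlapMirror {
    pub rows: BTreeMap<i64, Option<String>>,
    pub applied_seq: u64,
}

impl OlapMirror {
    /// Apply every event newer than the watermark; returns how many were applied.
    pub fn sync_now(&mut self, log: &CdcLog) -> usize {
        let start = self.applied_seq; // copy so the filter does not borrow `self`
        let mut applied = 0;
        for ev in log.events.iter().filter(|e| e.seq > start) {
            match ev.op {
                Op::Insert | Op::Update => {
                    self.rows.insert(ev.id, ev.label.clone());
                }
                Op::Delete => {
                    self.rows.remove(&ev.id);
                }
            }
            self.applied_seq = ev.seq;
            applied += 1;
        }
        applied
    }
}
```

Calling `sync_now` twice in a row applies nothing the second time, because the watermark has already advanced past every recorded event.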

Sidecar mode

When [HtapConfig::sidecar] is Some(SidecarConfig { .. }), the engine follows an external database (SQLite or PostgreSQL) via timestamp-based CDC polling. No local SQLite write path is needed unless [SidecarConfig::enable_local_oltp] is true. This mode is useful for adding OLAP capabilities to an existing application database without schema changes.
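
Timestamp-based polling can be illustrated with a small in-memory model. The sketch below assumes the source table exposes an integer `updated_at` column and that the poller keeps a single watermark. `SourceRow` and `TimestampPoller` are hypothetical names, not the crate's TimestampCdcConsumer or SourceConnector.

```rust
/// A row in the external source table (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct SourceRow {
    pub id: i64,
    pub label: String,
    pub updated_at: i64, // e.g. a Unix timestamp maintained by the application
}

/// Remembers the highest `updated_at` value seen so far.
pub struct TimestampPoller {
    watermark: i64,
}

impl TimestampPoller {
    /// Start below every possible timestamp so the first poll returns all rows.
    pub fn new() -> Self {
        Self { watermark: i64::MIN }
    }

    pub fn watermark(&self) -> i64 {
        self.watermark
    }

    /// In-memory analogue of
    /// `SELECT * FROM t WHERE updated_at > :watermark ORDER BY updated_at`.
    pub fn poll(&mut self, source: &[SourceRow]) -> Vec<SourceRow> {
        let mut changed: Vec<SourceRow> = source
            .iter()
            .filter(|r| r.updated_at > self.watermark)
            .cloned()
            .collect();
        changed.sort_by_key(|r| r.updated_at);

        // Advance the watermark to the newest change that was returned.
        if let Some(last) = changed.last() {
            self.watermark = last.updated_at;
        }
        changed
    }
}
```

Two limits of this approach show up even in the toy version: a row that is physically deleted from the source never appears in a poll, and the strict `>` comparison skips a row written later with a timestamp equal to the current watermark.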

Sync modes

Mode                                Behaviour
[rhei_core::SyncMode::Destructive]  Mirror semantics: UPDATE overwrites the row, DELETE removes it. Default.
[rhei_core::SyncMode::Temporal]     SCD Type 2: every change appends a new version with _rhei_valid_from / _rhei_valid_to / _rhei_operation columns. Enables point-in-time queries.
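
The temporal mode can be illustrated with a small append-only history. This is a sketch of SCD Type 2 bookkeeping in general, not the crate's code. The `Version` fields mirror the _rhei_valid_from / _rhei_valid_to / _rhei_operation columns, but the integer timestamps, the half-open validity interval, and the treatment of a DELETE as an open-ended tombstone row are assumptions made here for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Insert,
    Update,
    Delete,
}

/// One version of a row in the temporal (append-only) table.
#[derive(Debug, Clone, PartialEq)]
pub struct Version {
    pub id: i64,
    pub label: Option<String>,
    pub valid_from: i64,       // mirrors _rhei_valid_from
    pub valid_to: Option<i64>, // mirrors _rhei_valid_to; None means "still current"
    pub operation: Op,         // mirrors _rhei_operation
}

/// Close the current version of `id` (if any) and append a new one.
pub fn apply_temporal(history: &mut Vec<Version>, op: Op, id: i64, label: Option<&str>, ts: i64) {
    for v in history
        .iter_mut()
        .filter(|v| v.id == id && v.valid_to.is_none())
    {
        v.valid_to = Some(ts);
    }
    history.push(Version {
        id,
        label: label.map(str::to_string),
        valid_from: ts,
        valid_to: None,
        operation: op,
    });
}

/// Point-in-time lookup: the version of `id` visible at `ts`, if the row existed then.
/// Validity is treated as the half-open interval [valid_from, valid_to).
pub fn as_of(history: &[Version], id: i64, ts: i64) -> Option<&Version> {
    history.iter().find(|v| {
        v.id == id
            && v.valid_from <= ts
            && v.valid_to.map_or(true, |end| ts < end)
            && v.operation != Op::Delete // a tombstone means "no row at this time"
    })
}
```

After an insert at time 10, an update at 20, and a delete at 30, the history holds three versions. `as_of` at 15 finds the inserted value, at 20 the updated one, and at 35 nothing. Under destructive mode the same three events would leave no row at all.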

Feature flags

Feature             Default  Description
datafusion-backend  yes      Apache DataFusion OLAP engine (pure Rust, natively async)
duckdb-backend      no       DuckDB OLAP engine (bundled C++ library, called via spawn_blocking)
full                no       Both OLAP backends simultaneously
sidecar             no       Timestamp-based CDC from SQLite or PostgreSQL external sources
rocksdb-cdc         no       RocksDB-backed durable CDC log (5–7× faster writes than SQLite triggers)
flight-sql          no       Arrow Flight SQL gRPC server (see rhei-flight crate)
metrics             no       Counters/gauges emitted via the metrics crate facade
metrics-exporter    no       Prometheus HTTP endpoint (configured via metrics_port in TOML)
cloud-storage       no       S3 / GCS object-store backends for DataFusion Parquet storage

Quick start

use rhei::{HtapConfig, HtapEngine, TableSchema};
use arrow::datatypes::{DataType, Field, Schema};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Configure the engine (in-memory OLAP, default DataFusion backend)
    let config = HtapConfig {
        oltp_path: "my_app.db".to_string(),
        olap_in_memory: true,
        ..HtapConfig::default()
    };

    // 2. Start the engine
    let mut engine = HtapEngine::new(config).await?;

    // 3. Register a table for HTAP replication
    let schema = TableSchema {
        name: "events".to_string(),
        arrow_schema: Arc::new(Schema::new(vec![
            Field::new("id",    DataType::Int64,  false),
            Field::new("label", DataType::Utf8,   true),
        ])),
        primary_key: vec!["id".to_string()],
    };
    engine.register_table(schema).await?;

    // 4. Write via OLTP (auto-routed by SqlParserRouter)
    engine.execute("INSERT INTO events VALUES (1, 'hello')", &[]).await?;

    // 5. Sync CDC events to OLAP
    engine.sync_now().await?;

    // 6. Query via OLAP (aggregate → routed to DataFusion)
    let batches = engine.query("SELECT COUNT(*) FROM events").await?;
    println!("{} batches returned", batches.len());

    // 7. Shut down cleanly
    engine.shutdown().await;
    Ok(())
}

Key types

  • [HtapEngine] — the main engine facade; start here.
  • [HtapConfig] — builder-style configuration.
  • [SidecarConfig] — sidecar-specific settings (watermark persistence, delete detection, OLTP toggle).
  • [CdcSource] — unified CDC consumer wrapper used internally by the sync engine.