Crate rhei

Rhei — lightweight serverless HTAP engine for Rust.

Rhei pairs Rusqlite (SQLite) for low-latency OLTP writes with a pluggable OLAP backend (Apache DataFusion or DuckDB) for analytical queries, bridged by trigger-based Change Data Capture (CDC) replication and automatic SQL query routing. No separate server process is required — the engine runs fully in-process.
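To make the routing idea concrete, here is a minimal std-only sketch. It is not Rhei's router: the crate states that routing is AST-based (SqlParserRouter), whereas this stand-in uses plain keyword matching, and the `Target` enum, the `route` function, and the marker list are all hypothetical names chosen for illustration. The only behaviour taken from the documentation is that writes go to OLTP and aggregate queries go to OLAP.

```rust
/// Which engine should run a statement (hypothetical stand-in for a routing target).
#[derive(Debug, PartialEq)]
enum Target {
    Oltp,
    Olap,
}

/// Keyword-based routing heuristic. ASSUMPTION: this is a simplification;
/// a real AST-based router would parse the statement instead of scanning text.
fn route(sql: &str) -> Target {
    let upper = sql.trim_start().to_uppercase();

    // Anything that is not a read (INSERT, UPDATE, DELETE, DDL) stays on OLTP.
    if !upper.starts_with("SELECT") && !upper.starts_with("WITH") {
        return Target::Oltp;
    }

    // Reads that look analytical (aggregates, grouping, windows) go to OLAP.
    const ANALYTIC_MARKERS: [&str; 7] =
        ["COUNT(", "SUM(", "AVG(", "MIN(", "MAX(", "GROUP BY", " OVER "];
    if ANALYTIC_MARKERS.iter().any(|m| upper.contains(*m)) {
        Target::Olap
    } else {
        // Simple lookups are served by the transactional store.
        Target::Oltp
    }
}

fn main() {
    let queries = [
        "INSERT INTO events VALUES (1, 'hello')",
        "SELECT label FROM events WHERE id = 1",
        "SELECT COUNT(*) FROM events",
    ];
    for sql in queries.iter() {
        println!("{:?} <- {}", route(sql), sql);
    }
}
```

Text matching of this kind misclassifies statements whose string literals or identifiers happen to contain a marker, which is one reason to prefer routing on a parsed AST.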

§Architecture

Standard HTAP mode:

Client
  |
  v
HtapEngine  (facade — this crate)
  |-- SqlParserRouter  ──► AST-based routing to OLTP or OLAP
  |-- RusqliteEngine   ──► WAL-mode SQLite, write conn + read pool
  |     └── RusqliteCdcProducer  ──► trigger-based _rhei_cdc_log
  |-- OlapBackend      ──► DuckDB or DataFusion (feature-gated)
  └── CdcSyncEngine    ──► polls CDC, applies DML to OLAP

Sidecar mode:

External DB (SQLite / PostgreSQL)
  │  polls by updated_at > watermark
  ▼
TimestampCdcConsumer<S: SourceConnector>
  │  CdcEvent stream
  ▼
CdcSyncEngine  (temporal or destructive)
  │  DML -> OLAP
  ▼
OlapBackend (DuckDB / DataFusion)

§Operating modes

§Standard HTAP mode

The default mode. A local SQLite database handles writes, and CDC triggers capture every INSERT / UPDATE / DELETE into _rhei_cdc_log. A background sync loop (or a manual call to HtapEngine::sync_now) applies those events to the OLAP engine so that analytical queries see fresh data.
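The capture-then-apply flow described above can be modelled in a few lines of std-only Rust. This is a toy model under stated assumptions, not Rhei's implementation: the `Oltp`, `Olap`, `Change`, and `sync_once` names and the sequence-number bookkeeping are hypothetical, and in-memory maps stand in for SQLite, its triggers, and the OLAP engine.

```rust
use std::collections::BTreeMap;

/// Kind of change recorded in the log (mirrors INSERT / UPDATE / DELETE).
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Insert,
    Update,
    Delete,
}

/// One captured change. Field names are illustrative, not Rhei's CdcEvent.
#[derive(Debug, Clone, PartialEq)]
struct Change {
    seq: u64,
    op: Op,
    id: i64,
    label: Option<String>,
}

/// Toy OLTP store that appends to a change log on every write,
/// playing the role of the SQLite triggers.
#[derive(Default)]
struct Oltp {
    rows: BTreeMap<i64, String>,
    log: Vec<Change>,
    next_seq: u64,
}

impl Oltp {
    fn record(&mut self, op: Op, id: i64, label: Option<String>) {
        self.next_seq += 1;
        self.log.push(Change { seq: self.next_seq, op, id, label });
    }

    fn upsert(&mut self, id: i64, label: &str) {
        let op = if self.rows.contains_key(&id) { Op::Update } else { Op::Insert };
        self.rows.insert(id, label.to_string());
        self.record(op, id, Some(label.to_string()));
    }

    fn delete(&mut self, id: i64) {
        if self.rows.remove(&id).is_some() {
            self.record(Op::Delete, id, None);
        }
    }
}

/// Toy OLAP replica that remembers the last applied sequence number.
#[derive(Default)]
struct Olap {
    rows: BTreeMap<i64, String>,
    applied_seq: u64,
}

/// Apply every log entry newer than the replica's position.
/// Returns the number of changes applied in this cycle.
fn sync_once(oltp: &Oltp, olap: &mut Olap) -> usize {
    let start = olap.applied_seq;
    let mut applied = 0;
    for change in oltp.log.iter().filter(|c| c.seq > start) {
        match change.op {
            Op::Insert | Op::Update => {
                if let Some(label) = &change.label {
                    olap.rows.insert(change.id, label.clone());
                }
            }
            Op::Delete => {
                olap.rows.remove(&change.id);
            }
        }
        olap.applied_seq = change.seq;
        applied += 1;
    }
    applied
}

fn main() {
    let mut oltp = Oltp::default();
    let mut olap = Olap::default();
    oltp.upsert(1, "hello");
    oltp.upsert(1, "world");
    oltp.delete(1);
    let applied = sync_once(&oltp, &mut olap);
    println!("applied {} changes, replica has {} rows", applied, olap.rows.len());
}
```

Because the replica tracks the last sequence number it applied, calling `sync_once` again without new writes applies nothing, which is the property a polling sync loop relies on.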

§Sidecar mode

When HtapConfig::sidecar is Some(SidecarConfig { .. }), the engine follows an external database (SQLite or PostgreSQL) via timestamp-based CDC polling. No local SQLite write path is needed unless SidecarConfig::enable_local_oltp is true. This mode is useful for adding OLAP capabilities to an existing application database without schema changes.
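Timestamp-based polling boils down to "fetch rows whose updated_at is greater than the last watermark, then advance the watermark". The sketch below shows that loop body over an in-memory slice. It is an illustration only: `SourceRow`, `poll_changes`, and the integer timestamps are hypothetical, and a real source connector would issue a SQL query against the external database instead.

```rust
/// A row in the external source table. `updated_at` is a Unix timestamp.
/// ASSUMPTION: the shape of this struct is invented for the example.
#[derive(Debug, Clone, PartialEq)]
struct SourceRow {
    id: i64,
    label: String,
    updated_at: i64,
}

/// Return the rows changed after `watermark` (oldest first) together with
/// the new watermark. If nothing changed, the watermark is returned unchanged.
fn poll_changes(source: &[SourceRow], watermark: i64) -> (Vec<SourceRow>, i64) {
    let mut changed: Vec<SourceRow> = source
        .iter()
        .filter(|row| row.updated_at > watermark)
        .cloned()
        .collect();
    // Apply changes in timestamp order so later updates win.
    changed.sort_by_key(|row| row.updated_at);
    let new_watermark = changed.last().map_or(watermark, |row| row.updated_at);
    (changed, new_watermark)
}

fn main() {
    let source = vec![
        SourceRow { id: 1, label: "old".to_string(), updated_at: 100 },
        SourceRow { id: 2, label: "newer".to_string(), updated_at: 300 },
        SourceRow { id: 3, label: "new".to_string(), updated_at: 200 },
    ];
    let (changed, watermark) = poll_changes(&source, 100);
    println!("{} changed rows, watermark now {}", changed.len(), watermark);
}
```

A query of this form cannot observe rows that were hard-deleted from the source, since a deleted row has no updated_at left to compare. Delete detection therefore has to be handled separately from the watermark poll.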

§Sync modes

| Mode | Behaviour |
|------|-----------|
| rhei_core::SyncMode::Destructive | Mirror semantics: UPDATE overwrites the row, DELETE removes it. Default. |
| rhei_core::SyncMode::Temporal | SCD Type 2: every change appends a new version with _rhei_valid_from / _rhei_valid_to / _rhei_operation columns. Enables point-in-time queries. |
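The difference between the two sync modes is easiest to see side by side. The following std-only sketch applies the same three changes to a mirror table and to a version history. It is a simplified model, not Rhei's code: the `Version` struct and the `apply_destructive`, `apply_temporal`, and `as_of` functions are hypothetical, the struct fields merely echo the _rhei_valid_from / _rhei_valid_to / _rhei_operation columns, and recording a DELETE as an open-ended tombstone version is an assumption made for the example.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Insert,
    Update,
    Delete,
}

/// One row version in the temporal table.
#[derive(Debug, Clone, PartialEq)]
struct Version {
    id: i64,
    label: Option<String>,
    valid_from: u64,
    valid_to: Option<u64>, // None means "still current"
    operation: Op,
}

/// Destructive mode: the table only ever holds the latest state.
fn apply_destructive(table: &mut BTreeMap<i64, String>, op: Op, id: i64, label: Option<&str>) {
    match (op, label) {
        (Op::Delete, _) => {
            table.remove(&id);
        }
        (_, Some(l)) => {
            table.insert(id, l.to_string());
        }
        (_, None) => {}
    }
}

/// Temporal mode: close the open version for this id, then append a new one.
/// ASSUMPTION: a DELETE is stored as a tombstone version with no label.
fn apply_temporal(history: &mut Vec<Version>, op: Op, id: i64, label: Option<&str>, ts: u64) {
    for v in history.iter_mut().filter(|v| v.id == id && v.valid_to.is_none()) {
        v.valid_to = Some(ts);
    }
    history.push(Version {
        id,
        label: label.map(str::to_string),
        valid_from: ts,
        valid_to: None,
        operation: op,
    });
}

/// Point-in-time lookup: the version of `id` that was valid at `ts`.
fn as_of(history: &[Version], id: i64, ts: u64) -> Option<&Version> {
    history.iter().find(|v| {
        v.id == id && v.valid_from <= ts && v.valid_to.map_or(true, |end| ts < end)
    })
}

fn main() {
    let events = [
        (Op::Insert, 1, Some("a"), 10),
        (Op::Update, 1, Some("b"), 20),
        (Op::Delete, 1, None, 30),
    ];
    let mut mirror = BTreeMap::new();
    let mut history = Vec::new();
    for &(op, id, label, ts) in events.iter() {
        apply_destructive(&mut mirror, op, id, label);
        apply_temporal(&mut history, op, id, label, ts);
    }
    println!("mirror rows: {}, history versions: {}", mirror.len(), history.len());
    println!("state at ts=15: {:?}", as_of(&history, 1, 15));
}
```

After the three events the mirror no longer contains the row, while the history still answers what the row looked like at any earlier timestamp. That trade of storage for queryable history is what the temporal mode offers.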

§Feature flags

| Feature | Default | Description |
|---------|---------|-------------|
| datafusion-backend | yes | Apache DataFusion OLAP engine (pure Rust, natively async) |
| duckdb-backend | no | DuckDB OLAP engine (C++ bundled via spawn_blocking) |
| full | no | Both OLAP backends simultaneously |
| sidecar | no | Timestamp-based CDC from SQLite or PostgreSQL external sources |
| rocksdb-cdc | no | RocksDB-backed durable CDC log (5–7× faster writes than SQLite triggers) |
| flight-sql | no | Arrow Flight SQL gRPC server (see rhei-flight crate) |
| metrics | no | Counters/gauges emitted via the metrics crate facade |
| metrics-exporter | no | Prometheus HTTP endpoint (configured via metrics_port in TOML) |
| cloud-storage | no | S3 / GCS object-store backends for DataFusion Parquet storage |

§Quick start

use rhei::{HtapConfig, HtapEngine, TableSchema};
use arrow::datatypes::{DataType, Field, Schema};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Configure the engine (in-memory OLAP, default DataFusion backend)
    let config = HtapConfig {
        oltp_path: "my_app.db".to_string(),
        olap_in_memory: true,
        ..HtapConfig::default()
    };

    // 2. Start the engine
    let mut engine = HtapEngine::new(config).await?;

    // 3. Register a table for HTAP replication
    let schema = TableSchema {
        name: "events".to_string(),
        arrow_schema: Arc::new(Schema::new(vec![
            Field::new("id",    DataType::Int64,  false),
            Field::new("label", DataType::Utf8,   true),
        ])),
        primary_key: vec!["id".to_string()],
    };
    engine.register_table(schema).await?;

    // 4. Write via OLTP (auto-routed by SqlParserRouter)
    engine.execute("INSERT INTO events VALUES (1, 'hello')", &[]).await?;

    // 5. Sync CDC events to OLAP
    engine.sync_now().await?;

    // 6. Query via OLAP (aggregate → routed to DataFusion)
    let batches = engine.query("SELECT COUNT(*) FROM events").await?;
    println!("{} batches returned", batches.len());

    // 7. Shut down cleanly
    engine.shutdown().await;
    Ok(())
}

§Key types

  • HtapEngine — the main engine facade; start here.
  • HtapConfig — builder-style configuration.
  • SidecarConfig — sidecar-specific settings (watermark persistence, delete detection, OLTP toggle).
  • CdcSource — unified CDC consumer wrapper used internally by the sync engine.

Re-exports§

pub use error::HtapError;

Modules§

error
Error types for the top-level HTAP engine facade.

Structs§

CdcEvent
A single CDC event captured from the OLTP engine.
CdcSyncEngine
CDC-based sync engine that replicates changes from OLTP to OLAP.
DataFusionEngine
DataFusion-backed OLAP engine.
HeuristicRouter
Backwards-compatible query router that delegates to SqlParserRouter.
HtapConfig
Configuration for the HtapEngine.
HtapEngine
Main HTAP engine facade.
RusqliteCdcProducer
Reads CDC events from the _rhei_cdc_log table in the OLTP database.
RusqliteEngine
Rusqlite-backed OLTP engine.
SchemaRegistry
Thread-safe registry of table schemas.
SharedDataFusionEngine
A cheaply-cloneable, Arc-wrapped DataFusionEngine that implements rhei_core::OlapEngine.
SqlParserRouter
SQL parser-based query router that classifies SQL using a real AST.
SyncResult
Summary of a single crate::SyncEngine::sync_once cycle.
SyncStatus
A point-in-time snapshot of the crate::SyncEngine’s replication state.
TableSchema
Schema definition for a table tracked by the HTAP engine.

Enums§

CdcOperation
The type of a single CDC (Change Data Capture) operation.
CdcSource
When the sidecar feature is disabled, CdcSource wraps the Rusqlite CDC producer.
OlapBackend
Unified OLAP backend that wraps all supported engines.
OlapBackendType
Selects the OLAP query engine used by the HtapEngine.
OlapEngineType
Discriminant for the active OLAP engine backend.
OlapError
Error type returned by crate::OlapBackend method implementations.
OltpBackend
Unified OLTP backend enum.
OltpError
Error type returned by crate::OltpBackend method implementations.
QueryHint
Optional caller hint to override automatic SQL routing.
QueryTarget
The engine that should execute a given SQL statement.
RusqliteOltpError
Unified error type for the Rusqlite OLTP engine.
StorageMode
Storage mode for the DataFusion OLAP engine.
SyncMode
Controls how the crate::SyncEngine applies CDC events to the OLAP engine.

Traits§

CdcConsumer
Consumes CDC (Change Data Capture) events from the OLTP engine.
OlapEngine
Abstraction over an OLAP (analytical) query engine.
OltpEngine
Abstraction over the OLTP (transactional) engine.
QueryRouter
Classifies SQL statements and routes them to the correct engine.
SyncEngine
Applies CDC events from the OLTP engine to the OLAP engine.

Functions§

spawn_sync_loop
Spawn a background task that continuously polls CDC and syncs to OLAP.
temporalize_schema
Extend an Arrow schema with temporal columns for SCD Type 2.