Rhei — lightweight serverless HTAP engine for Rust.
Rhei pairs Rusqlite (SQLite) for low-latency OLTP writes with a pluggable OLAP backend (Apache DataFusion or DuckDB) for analytical queries, bridged by trigger-based Change Data Capture (CDC) replication and automatic SQL query routing. No separate server process is required — the engine runs fully in-process.
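To make the routing idea concrete, here is a deliberately simplified sketch. It uses a keyword heuristic rather than the AST-based classification the crate describes, and the `Target` enum and `route` function are hypothetical names for illustration, not part of the Rhei API. It also assumes that non-aggregate reads stay on the OLTP side, which the documentation above does not state.

```rust
/// Where a statement should run. Hypothetical stand-in, not the crate's `QueryTarget`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Target {
    Oltp,
    Olap,
}

/// Keyword heuristic (not an AST parse): anything that is not a read goes to
/// OLTP; reads that contain aggregate, grouping, or window keywords go to OLAP.
fn route(sql: &str) -> Target {
    let upper = sql.trim_start().to_ascii_uppercase();
    if !upper.starts_with("SELECT") && !upper.starts_with("WITH") {
        return Target::Oltp; // INSERT / UPDATE / DELETE / DDL
    }
    const ANALYTICAL: [&str; 7] = [
        "COUNT(", "SUM(", "AVG(", "MIN(", "MAX(", "GROUP BY", " OVER ",
    ];
    if ANALYTICAL.iter().any(|kw| upper.contains(kw)) {
        Target::Olap
    } else {
        Target::Oltp // assumption: simple reads are served by the OLTP side
    }
}

fn main() {
    for sql in [
        "INSERT INTO events VALUES (1, 'hello')",
        "SELECT * FROM events WHERE id = 1",
        "SELECT COUNT(*) FROM events",
    ] {
        println!("{:?} <- {}", route(sql), sql);
    }
}
```

A substring check like this misclassifies a query whose string literal happens to contain `SUM(`, which is one reason a real parser is the more robust choice.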
§Architecture
Standard HTAP mode:
Client
|
v
HtapEngine (facade — this crate)
|-- SqlParserRouter ──► AST-based routing to OLTP or OLAP
|-- RusqliteEngine ──► WAL-mode SQLite, write conn + read pool
| └── RusqliteCdcProducer ──► trigger-based _rhei_cdc_log
|-- OlapBackend ──► DuckDB or DataFusion (feature-gated)
└── CdcSyncEngine ──► polls CDC, applies DML to OLAP
Sidecar mode:
External DB (SQLite / PostgreSQL)
│ polls by updated_at > watermark
▼
TimestampCdcConsumer<S: SourceConnector>
│ CdcEvent stream
▼
CdcSyncEngine (temporal or destructive)
│ DML -> OLAP
▼
OlapBackend (DuckDB / DataFusion)
§Operating modes
§Standard HTAP mode
The default mode. A local SQLite database handles writes; CDC triggers
capture every INSERT / UPDATE / DELETE into _rhei_cdc_log. A background
sync loop (or manual HtapEngine::sync_now) applies those events to the
OLAP engine so analytical queries see fresh data.
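The capture-then-apply flow can be modelled in a few lines of plain Rust. This is a toy in-memory sketch, not the crate's implementation: `ToyHtap`, `Change`, and their methods are hypothetical, a `Vec` stands in for `_rhei_cdc_log`, and explicit method calls stand in for the SQLite triggers.

```rust
use std::collections::BTreeMap;

/// One captured change. Hypothetical type, not the crate's `CdcEvent`.
#[derive(Debug, Clone, PartialEq)]
enum Change {
    Upsert { id: i64, label: String },
    Delete { id: i64 },
}

#[derive(Default)]
struct ToyHtap {
    oltp: BTreeMap<i64, String>, // stands in for the SQLite table
    cdc_log: Vec<Change>,        // stands in for _rhei_cdc_log
    olap: BTreeMap<i64, String>, // stands in for the analytical copy
}

impl ToyHtap {
    /// Write to the OLTP side and record the change, as a trigger would.
    fn upsert(&mut self, id: i64, label: &str) {
        self.oltp.insert(id, label.to_string());
        self.cdc_log.push(Change::Upsert { id, label: label.to_string() });
    }

    /// Delete from the OLTP side; only log the change if a row was removed.
    fn delete(&mut self, id: i64) {
        if self.oltp.remove(&id).is_some() {
            self.cdc_log.push(Change::Delete { id });
        }
    }

    /// Drain pending events in order, apply them to the OLAP copy,
    /// and return how many events were applied.
    fn sync_now(&mut self) -> usize {
        let pending: Vec<Change> = self.cdc_log.drain(..).collect();
        let applied = pending.len();
        for change in pending {
            match change {
                Change::Upsert { id, label } => {
                    self.olap.insert(id, label);
                }
                Change::Delete { id } => {
                    self.olap.remove(&id);
                }
            }
        }
        applied
    }
}

fn main() {
    let mut db = ToyHtap::default();
    db.upsert(1, "hello");
    db.upsert(2, "world");
    db.delete(1);
    println!("OLAP rows before sync: {}", db.olap.len());
    let applied = db.sync_now();
    println!("applied {} events, OLAP rows after sync: {}", applied, db.olap.len());
}
```

The point of the sketch is the staleness window: the OLAP copy reflects nothing until a sync runs, which is why the quick start calls `sync_now` before querying.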
§Sidecar mode
When HtapConfig::sidecar is Some(SidecarConfig { .. }), the engine
follows an external database (SQLite or PostgreSQL) via timestamp-based
CDC polling. No local SQLite write path is needed unless
SidecarConfig::enable_local_oltp is true. This mode is useful for adding OLAP
capabilities to an existing application database without schema changes.
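As a rough illustration of timestamp-based polling, the sketch below selects rows whose `updated_at` is greater than a stored watermark and then advances the watermark. `SourceRow` and `poll_since` are hypothetical names, and a slice stands in for the external database; the real connector API is not shown in this documentation.

```rust
/// A row in the external source table. Hypothetical shape for illustration.
#[derive(Debug, Clone, PartialEq)]
struct SourceRow {
    id: i64,
    label: String,
    updated_at: u64, // e.g. Unix seconds
}

/// Return the rows with `updated_at > watermark`, oldest first, together
/// with the watermark to persist for the next poll.
fn poll_since(source: &[SourceRow], watermark: u64) -> (Vec<SourceRow>, u64) {
    let mut changed: Vec<SourceRow> = source
        .iter()
        .filter(|row| row.updated_at > watermark)
        .cloned()
        .collect();
    changed.sort_by_key(|row| row.updated_at);
    // If nothing changed, keep the old watermark.
    let next = changed.last().map_or(watermark, |row| row.updated_at);
    (changed, next)
}

fn main() {
    let source = vec![
        SourceRow { id: 1, label: "a".into(), updated_at: 100 },
        SourceRow { id: 2, label: "b".into(), updated_at: 250 },
        SourceRow { id: 3, label: "c".into(), updated_at: 180 },
    ];
    let mut watermark = 150;
    let (batch, next) = poll_since(&source, watermark);
    watermark = next;
    println!("first poll: {} rows, watermark now {}", batch.len(), watermark);
    let (batch, _) = poll_since(&source, watermark);
    println!("second poll: {} rows", batch.len());
}
```

Two limits of this approach are worth keeping in mind: a plain poll cannot see rows that were hard-deleted at the source, and a strict `>` comparison can miss a row committed later with the same timestamp as the current watermark.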
§Sync modes
| Mode | Behaviour |
|---|---|
| rhei_core::SyncMode::Destructive | Mirror semantics: UPDATE overwrites the row, DELETE removes it. Default. |
| rhei_core::SyncMode::Temporal | SCD Type 2: every change appends a new version with _rhei_valid_from / _rhei_valid_to / _rhei_operation columns. Enables point-in-time queries. |
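The difference between the two modes can be sketched with plain collections. Everything here is illustrative: `Event`, `Version`, `apply_destructive`, `apply_temporal`, and `as_of` are hypothetical, and the handling of deletes in temporal mode (appending a tombstone version that point-in-time reads skip) is an assumption rather than documented Rhei behaviour.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op { Insert, Update, Delete }

/// A change event with a timestamp. Hypothetical shape.
#[derive(Debug, Clone)]
struct Event { op: Op, id: i64, label: Option<String>, ts: u64 }

/// One row version, mirroring the _rhei_valid_from / _rhei_valid_to /
/// _rhei_operation columns named in the table above.
#[derive(Debug, Clone, PartialEq)]
struct Version {
    id: i64,
    label: Option<String>,
    valid_from: u64,
    valid_to: Option<u64>, // None = still current
    operation: Op,
}

/// Destructive: the table mirrors the source, so history is lost.
fn apply_destructive(table: &mut BTreeMap<i64, String>, ev: &Event) {
    match ev.op {
        Op::Insert | Op::Update => {
            if let Some(label) = &ev.label {
                table.insert(ev.id, label.clone());
            }
        }
        Op::Delete => {
            table.remove(&ev.id);
        }
    }
}

/// Temporal: close the currently open version, then append a new one.
fn apply_temporal(history: &mut Vec<Version>, ev: &Event) {
    if let Some(open) = history
        .iter_mut()
        .find(|v| v.id == ev.id && v.valid_to.is_none())
    {
        open.valid_to = Some(ev.ts);
    }
    history.push(Version {
        id: ev.id,
        label: ev.label.clone(),
        valid_from: ev.ts,
        valid_to: None,
        operation: ev.op,
    });
}

/// Point-in-time read: the version valid at `t`, ignoring delete tombstones.
fn as_of(history: &[Version], id: i64, t: u64) -> Option<&Version> {
    history.iter().find(|v| {
        v.id == id
            && v.valid_from <= t
            && v.valid_to.map_or(true, |end| t < end)
            && v.operation != Op::Delete
    })
}

fn main() {
    let events = vec![
        Event { op: Op::Insert, id: 1, label: Some("a".into()), ts: 10 },
        Event { op: Op::Update, id: 1, label: Some("b".into()), ts: 20 },
        Event { op: Op::Delete, id: 1, label: None, ts: 30 },
    ];
    let mut mirror = BTreeMap::new();
    let mut history = Vec::new();
    for ev in &events {
        apply_destructive(&mut mirror, ev);
        apply_temporal(&mut history, ev);
    }
    println!("mirror rows: {}, versions kept: {}", mirror.len(), history.len());
    println!("label at t=15: {:?}", as_of(&history, 1, 15).and_then(|v| v.label.clone()));
}
```

The trade-off is storage for history: the destructive mirror ends up empty after the delete, while the temporal history still answers what the row looked like at an earlier time.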
§Feature flags
| Feature | Default | Description |
|---|---|---|
| datafusion-backend | yes | Apache DataFusion OLAP engine (pure Rust, natively async) |
| duckdb-backend | no | DuckDB OLAP engine (C++ bundled via spawn_blocking) |
| full | no | Both OLAP backends simultaneously |
| sidecar | no | Timestamp-based CDC from SQLite or PostgreSQL external sources |
| rocksdb-cdc | no | RocksDB-backed durable CDC log (5–7× faster writes than SQLite triggers) |
| flight-sql | no | Arrow Flight SQL gRPC server (see rhei-flight crate) |
| metrics | no | Counters/gauges emitted via the metrics crate facade |
| metrics-exporter | no | Prometheus HTTP endpoint (configured via metrics_port in TOML) |
| cloud-storage | no | S3 / GCS object-store backends for DataFusion Parquet storage |
§Quick start
use rhei::{HtapConfig, HtapEngine, TableSchema};
use arrow::datatypes::{DataType, Field, Schema};
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Configure the engine (in-memory OLAP, default DataFusion backend)
    let config = HtapConfig {
        oltp_path: "my_app.db".to_string(),
        olap_in_memory: true,
        ..HtapConfig::default()
    };
    // 2. Start the engine
    let mut engine = HtapEngine::new(config).await?;
    // 3. Register a table for HTAP replication
    let schema = TableSchema {
        name: "events".to_string(),
        arrow_schema: Arc::new(Schema::new(vec![
            Field::new("id", DataType::Int64, false),
            Field::new("label", DataType::Utf8, true),
        ])),
        primary_key: vec!["id".to_string()],
    };
    engine.register_table(schema).await?;
    // 4. Write via OLTP (auto-routed by SqlParserRouter)
    engine.execute("INSERT INTO events VALUES (1, 'hello')", &[]).await?;
    // 5. Sync CDC events to OLAP
    engine.sync_now().await?;
    // 6. Query via OLAP (aggregate → routed to DataFusion)
    let batches = engine.query("SELECT COUNT(*) FROM events").await?;
    println!("{} batches returned", batches.len());
    // 7. Shut down cleanly
    engine.shutdown().await;
    Ok(())
}
§Key types
- HtapEngine: the main engine facade; start here.
- HtapConfig: builder-style configuration.
- SidecarConfig: sidecar-specific settings (watermark persistence, delete detection, OLTP toggle).
- CdcSource: unified CDC consumer wrapper used internally by the sync engine.
Re-exports§
pub use error::HtapError;
Modules§
- error
- Error types for the top-level HTAP engine facade.
Structs§
- CdcEvent
- A single CDC event captured from the OLTP engine.
- CdcSyncEngine
- CDC-based sync engine that replicates changes from OLTP to OLAP.
- DataFusionEngine
- DataFusion-backed OLAP engine.
- HeuristicRouter
- Backwards-compatible query router that delegates to SqlParserRouter.
- HtapConfig
- Configuration for the HtapEngine.
- HtapEngine
- Main HTAP engine facade.
- RusqliteCdcProducer
- Reads CDC events from the _rhei_cdc_log table in the OLTP database.
- RusqliteEngine
- Rusqlite-backed OLTP engine.
- SchemaRegistry
- Thread-safe registry of table schemas.
- SharedDataFusionEngine
- A cheaply-cloneable, Arc-wrapped DataFusionEngine that implements rhei_core::OlapEngine.
- SqlParserRouter
- SQL parser-based query router that classifies SQL using a real AST.
- SyncResult
- Summary of a single crate::SyncEngine::sync_once cycle.
- SyncStatus
- A point-in-time snapshot of the crate::SyncEngine's replication state.
- TableSchema
- Schema definition for a table tracked by the HTAP engine.
Enums§
- CdcOperation
- The type of a single CDC (Change Data Capture) operation.
- CdcSource
- When the sidecar feature is disabled, CdcSource wraps the Rusqlite CDC producer.
- OlapBackend
- Unified OLAP backend that wraps all supported engines.
- OlapBackendType
- Selects the OLAP query engine used by the HtapEngine.
- OlapEngineType
- Discriminant for the active OLAP engine backend.
- OlapError
- Error type returned by crate::OlapBackend method implementations.
- OltpBackend
- Unified OLTP backend enum.
- OltpError
- Error type returned by crate::OltpBackend method implementations.
- QueryHint
- Optional caller hint to override automatic SQL routing.
- QueryTarget
- The engine that should execute a given SQL statement.
- RusqliteOltpError
- Unified error type for the Rusqlite OLTP engine.
- StorageMode
- Storage mode for the DataFusion OLAP engine.
- SyncMode
- Controls how the crate::SyncEngine applies CDC events to the OLAP engine.
Traits§
- CdcConsumer
- Consumes CDC (Change Data Capture) events from the OLTP engine.
- OlapEngine
- Abstraction over an OLAP (analytical) query engine.
- OltpEngine
- Abstraction over the OLTP (transactional) engine.
- QueryRouter
- Classifies SQL statements and routes them to the correct engine.
- SyncEngine
- Applies CDC events from the OLTP engine to the OLAP engine.
Functions§
- spawn_sync_loop
- Spawn a background task that continuously polls CDC and syncs to OLAP.
- temporalize_schema
- Extend an Arrow schema with temporal columns for SCD Type 2.