DataFusion OLAP backend for the Rhei HTAP engine.
§Architecture position
rhei-datafusion is the default OLAP backend in Rhei (enabled by the
datafusion-backend workspace feature, which is on by default). It
implements rhei_core::OlapEngine for Apache DataFusion 53 and is used
by the rhei facade’s OlapBackend enum.
§Storage modes
The engine is parameterised by a StorageMode chosen at construction time:
| Variant | Durability | Notes |
|---|---|---|
| StorageMode::InMemory | None; lost on shutdown | Fastest; ideal for tests and ephemeral workloads |
| StorageMode::Vortex | Durable; .vortex files (local or S3) | Auto-detects local vs S3 from URL scheme |
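The "auto-detects local vs S3 from URL scheme" rule might look roughly like the following. This is a minimal sketch, not the crate's actual code: `Location` and `classify_location` are hypothetical names, and the assumed rule is simply that an `s3://` prefix selects S3 while anything else is treated as a local path.

```rust
/// Hypothetical classification of a Vortex storage location.
#[derive(Debug, PartialEq, Eq)]
enum Location {
    /// Bucket and key prefix parsed from an `s3://bucket/prefix` URL.
    S3 { bucket: String, prefix: String },
    /// Anything else is treated as a local filesystem path.
    Local(String),
}

/// Classify a location string by its URL scheme.
/// Assumed rule (not confirmed by the crate docs): `s3://` means S3.
fn classify_location(url: &str) -> Location {
    match url.strip_prefix("s3://") {
        Some(rest) => {
            // Split "bucket/prefix" at the first slash; the prefix may be empty.
            let (bucket, prefix) = rest.split_once('/').unwrap_or((rest, ""));
            Location::S3 {
                bucket: bucket.to_string(),
                prefix: prefix.to_string(),
            }
        }
        // `file://` URLs are reduced to their path; bare paths pass through.
        None => Location::Local(url.strip_prefix("file://").unwrap_or(url).to_string()),
    }
}
```

Under these assumptions, `classify_location("s3://my-bucket/olap")` yields the `S3` variant, and a bare path such as `/var/lib/rhei` yields `Local`.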
§Streaming queries
rhei_core::OlapEngine::query_stream on DataFusionEngine returns a
rhei_core::RecordBatchBoxStream backed by DataFusion’s own
DataFrame::execute_stream(). A thin StreamAdapter newtype adapts
SendableRecordBatchStream to the common RecordBatchBoxStream type
without buffering, so results flow to the caller one record batch at a
time.
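The adapter pattern described here (a newtype that forwards each item as it is produced instead of collecting first) can be illustrated without any async machinery. The sketch below uses `std::iter::Iterator` as a stand-in for an async stream and `Vec<i64>` as a stand-in for an Arrow record batch. All names (`Adapter`, `BackendError`, `CommonError`, `into_boxed`) are hypothetical; the real `StreamAdapter` presumably implements the async `Stream` trait instead.

```rust
/// Stand-in for an Arrow `RecordBatch` (assumption: one column of i64).
type Batch = Vec<i64>;

/// Stand-in for the backend-specific error type.
#[derive(Debug)]
struct BackendError(String);

/// Stand-in for the error type shared by all backends.
#[derive(Debug, PartialEq)]
struct CommonError(String);

/// Stand-in for the common boxed stream type handed to callers.
type BoxedBatches = Box<dyn Iterator<Item = Result<Batch, CommonError>>>;

/// Newtype over a backend-specific source of batches.
struct Adapter<I>(I);

impl<I> Iterator for Adapter<I>
where
    I: Iterator<Item = Result<Batch, BackendError>>,
{
    type Item = Result<Batch, CommonError>;

    /// Forward one batch per call, converting only the error type.
    /// Nothing is collected, so at most one batch is held at a time.
    fn next(&mut self) -> Option<Self::Item> {
        self.0.next().map(|r| r.map_err(|e| CommonError(e.0)))
    }
}

/// Erase the concrete adapter type behind the common boxed type.
fn into_boxed<I>(inner: I) -> BoxedBatches
where
    I: Iterator<Item = Result<Batch, BackendError>> + 'static,
{
    Box::new(Adapter(inner))
}
```

The design point is that the adapter owns the inner source and does no work until the caller pulls the next item, so back-pressure stays with the consumer.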
§No transactions
DataFusion does not support SQL transactions.
rhei_core::OlapEngine::supports_transactions returns false for this
backend. The Rhei sync engine handles partial-failure recovery via CDC
sequence numbers instead of relying on BEGIN/COMMIT.
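To make the recovery idea concrete, here is a minimal sketch of sequence-number-based resumption. It assumes that CDC events carry monotonically increasing sequence numbers and that the OLAP side records the highest sequence it has applied. `CdcEvent`, `Replica`, and `apply_batch` are hypothetical names for illustration, not the Rhei sync engine's API.

```rust
use std::collections::BTreeMap;

/// Hypothetical CDC event: an increasing sequence number plus an upsert.
#[derive(Clone, Debug)]
struct CdcEvent {
    seq: u64,
    key: u32,
    value: String,
}

/// Hypothetical OLAP-side replica that remembers the last sequence applied.
#[derive(Default)]
struct Replica {
    rows: BTreeMap<u32, String>,
    last_applied_seq: u64,
}

impl Replica {
    /// Apply events in order, skipping any that were already applied.
    /// `fail_at` simulates a failure just before the event with that sequence.
    fn apply_batch(&mut self, events: &[CdcEvent], fail_at: Option<u64>) -> Result<(), String> {
        for ev in events {
            if ev.seq <= self.last_applied_seq {
                continue; // applied before an earlier failure; safe to skip
            }
            if fail_at == Some(ev.seq) {
                return Err(format!("simulated failure at seq {}", ev.seq));
            }
            self.rows.insert(ev.key, ev.value.clone());
            self.last_applied_seq = ev.seq; // watermark advances per event
        }
        Ok(())
    }
}
```

In this sketch, a batch that fails partway can simply be replayed from the start: events at or below `last_applied_seq` are skipped, so no BEGIN/COMMIT is needed to avoid applying them twice.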
§DML strategy
InMemory: mutations update an in-memory HashMap of Vec<RecordBatch>
and re-register a MemTable with DataFusion after each change.
Vortex: INSERT routes through DataFusion’s VortexFormatFactory sink (SQL
INSERT INTO … SELECT * FROM tmp). UPDATE/DELETE use a read-modify-write
cycle: read all data via SELECT *, apply mutations in-memory, clear the
table directory, and re-insert.
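The read-modify-write cycle used for UPDATE/DELETE in Vortex mode can be sketched with an in-memory stand-in for the table directory. This is an illustration under stated assumptions, not the crate's code: `Row`, `TableDir`, `delete_where`, and `update_where` are hypothetical, and real code would operate on Arrow record batches and `.vortex` files.

```rust
/// Stand-in for one row; real code would work on Arrow record batches.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    id: u32,
    name: String,
}

/// Stand-in for a table directory: each inner Vec models one data file.
#[derive(Default)]
struct TableDir {
    files: Vec<Vec<Row>>,
}

impl TableDir {
    /// Append rows as a new file (models the INSERT path).
    fn insert(&mut self, rows: Vec<Row>) {
        if !rows.is_empty() {
            self.files.push(rows);
        }
    }

    /// Read every row from every file (models `SELECT *`).
    fn read_all(&self) -> Vec<Row> {
        self.files.iter().flatten().cloned().collect()
    }

    /// DELETE as read-modify-write; returns the number of rows removed.
    fn delete_where(&mut self, pred: impl Fn(&Row) -> bool) -> usize {
        let all = self.read_all(); // 1. read all data
        let before = all.len();
        let kept: Vec<Row> = all.into_iter().filter(|r| !pred(r)).collect(); // 2. mutate in memory
        self.files.clear(); // 3. clear the table directory
        let removed = before - kept.len();
        self.insert(kept); // 4. re-insert the surviving rows
        removed
    }

    /// UPDATE as read-modify-write; returns the number of rows changed.
    fn update_where(&mut self, pred: impl Fn(&Row) -> bool, change: impl Fn(&mut Row)) -> usize {
        let mut all = self.read_all();
        let mut changed = 0;
        for row in all.iter_mut() {
            if pred(row) {
                change(row);
                changed += 1;
            }
        }
        self.files.clear();
        self.insert(all);
        changed
    }
}
```

Note that in this sketch every UPDATE or DELETE rewrites the whole table, regardless of how many rows match the predicate.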
§Cloud storage
S3-compatible backends are gated behind the cloud-storage workspace feature.
When it is enabled, the object_store crate (with its aws feature) is pulled in.
Credentials are resolved from the environment at engine construction time
following object_store conventions (e.g. AWS_ACCESS_KEY_ID).
S3-compatible services (MinIO, Cloudflare R2, Wasabi, Ceph RGW) work via
AWS_ENDPOINT_URL.
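Because credentials are read from the environment when the engine is constructed, it can help to check the relevant variables up front. The sketch below is a hypothetical preflight helper, not part of rhei-datafusion or object_store. `S3Env` and `read_s3_env` are invented names, and the sketch assumes static access keys are in use (object_store can also obtain credentials in other ways, which this check does not cover).

```rust
/// Hypothetical summary of the S3 settings found in the environment.
#[derive(Debug, PartialEq)]
struct S3Env {
    access_key_id: String,
    secret_access_key: String,
    /// Custom endpoint for S3-compatible services; `None` means the default.
    endpoint_url: Option<String>,
}

/// Read the variables through `get` so the check can be tested without
/// touching the process environment. In real use, pass
/// `|k| std::env::var(k).ok()`.
fn read_s3_env(get: impl Fn(&str) -> Option<String>) -> Result<S3Env, String> {
    // Assumption: static keys are required; other credential sources are ignored.
    let require = |key: &str| get(key).ok_or_else(|| format!("missing {}", key));
    Ok(S3Env {
        access_key_id: require("AWS_ACCESS_KEY_ID")?,
        secret_access_key: require("AWS_SECRET_ACCESS_KEY")?,
        endpoint_url: get("AWS_ENDPOINT_URL"),
    })
}
```

For a MinIO-style setup, `AWS_ENDPOINT_URL` would be set to the service's address, and `endpoint_url` would then be `Some(..)` in the returned value.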
§Feature flags
cloud-storage: enables StorageMode::Vortex with s3:// URLs via the object_store crate.
§Re-exports
pub use engine::DataFusionEngine;
pub use error::DfOlapError;
pub use storage::StorageMode;