rhei-duckdb 1.5.0

DuckDB OLAP backend for the Rhei HTAP engine.

This crate provides [DuckDbEngine], an implementation of [rhei_core::OlapEngine] backed by DuckDB. It is gated at the workspace level behind the duckdb-backend feature.

Position in the HTAP stack

HtapEngine (rhei)
  └── OlapBackend (rhei-olap)
        └── DuckDbEngine  ← this crate

Thread-safety model and unsafe impl Send + Sync

duckdb::Connection is marked !Send by the Rust binding because the binding author did not audit the thread-safety of every internal pointer held by the underlying DuckDB C++ object. However, DuckDB itself is safe to access from multiple threads as long as each connection is accessed by only one thread at a time. [DuckDbEngine] enforces this invariant by wrapping every connection in a std::sync::Mutex; no code in this crate ever touches a connection outside a Mutex guard.

Because the Mutex provides the required exclusivity guarantee and all connection access is confined to tokio::task::spawn_blocking closures (which run on a thread-pool thread, not the async executor thread), DuckDbEngine and SharedDuckDbEngine implement Send and Sync via unsafe impl. This is the only place in the Rhei workspace that uses unsafe impl for a trait — every other crate is #![forbid(unsafe_code)].
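The pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's actual code: FakeConn is a hypothetical stand-in for a connection handle that is neither Send nor Sync, Engine is a hypothetical wrapper, and plain OS threads take the place of tokio::task::spawn_blocking.

```rust
use std::marker::PhantomData;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a connection handle. The raw-pointer marker
/// removes the `Send` and `Sync` auto traits, as a binding might.
struct FakeConn {
    rows_written: u64,
    _not_send: PhantomData<*const ()>,
}

/// Hypothetical engine that only reaches the connection through a Mutex guard.
struct Engine {
    conn: Mutex<FakeConn>,
}

// SAFETY (for this sketch only): every access to `conn` goes through the
// Mutex, so at most one thread touches the connection at a time, and
// `FakeConn` holds no thread-affine state.
unsafe impl Send for Engine {}
unsafe impl Sync for Engine {}

impl Engine {
    fn new() -> Self {
        Engine {
            conn: Mutex::new(FakeConn {
                rows_written: 0,
                _not_send: PhantomData,
            }),
        }
    }

    /// Simulates a write; the guard is held only for the duration of the call.
    fn write_row(&self) {
        let mut guard = self.conn.lock().unwrap();
        guard.rows_written += 1;
    }

    fn rows_written(&self) -> u64 {
        let guard = self.conn.lock().unwrap();
        guard.rows_written
    }
}

/// Shares one engine across `threads` OS threads and returns the row total.
fn write_from_threads(threads: usize, rows_per_thread: u64) -> u64 {
    let engine = Arc::new(Engine::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let engine = Arc::clone(&engine);
            // Compiles only because `Engine` is declared Send + Sync above.
            thread::spawn(move || {
                for _ in 0..rows_per_thread {
                    engine.write_row();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let total = engine.rows_written();
    total
}
```

Without the two unsafe impl lines, Arc<Engine> could not be moved into thread::spawn, because Mutex<T> is Send and Sync only when T is Send. The unsafe impl shifts the burden of proof from the compiler to the SAFETY comment.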

Connection pool layout

Connection   Count           Purpose
write_conn   1               DDL + DML (INSERT/UPDATE/DELETE/CREATE TABLE …)
read_pool    N (default 4)   Concurrent SELECT via round-robin

All connections in the read pool are obtained with Connection::try_clone() from the initial write connection, so they share the same underlying DuckDB database and benefit from DuckDB's MVCC for concurrent reads.
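A minimal sketch of this layout follows, assuming an atomic counter drives the round-robin selection (the source does not say how the rotation is implemented). ConnPool, next_read_index, reader, and writer are hypothetical names, and C: Clone stands in for Connection::try_clone(); in this sketch a clone copies the value, whereas real cloned connections share one database.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical pool: one write connection plus N read connections.
struct ConnPool<C> {
    write_conn: Mutex<C>,
    read_pool: Vec<Mutex<C>>,
    next_reader: AtomicUsize,
}

impl<C: Clone> ConnPool<C> {
    /// Builds the read pool by cloning the initial (write) connection.
    fn new(initial: C, readers: usize) -> Self {
        assert!(readers > 0, "read pool needs at least one connection");
        let read_pool = (0..readers)
            .map(|_| Mutex::new(initial.clone()))
            .collect();
        ConnPool {
            write_conn: Mutex::new(initial),
            read_pool,
            next_reader: AtomicUsize::new(0),
        }
    }

    /// Picks the next read slot in round-robin order.
    fn next_read_index(&self) -> usize {
        self.next_reader.fetch_add(1, Ordering::Relaxed) % self.read_pool.len()
    }

    /// Locks the next read connection; callers on different slots do not block each other.
    fn reader(&self) -> MutexGuard<'_, C> {
        let idx = self.next_read_index();
        self.read_pool[idx].lock().unwrap()
    }

    /// Locks the single write connection; all DDL and DML serialize here.
    fn writer(&self) -> MutexGuard<'_, C> {
        self.write_conn.lock().unwrap()
    }
}
```

With four read slots, successive calls to next_read_index yield 0, 1, 2, 3 and then wrap back to 0, so up to N SELECTs can hold distinct connections while writes queue on the single write_conn.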

Arrow ingestion path

DuckDbEngine::load_arrow uses DuckDB's native Appender API (Appender::append_record_batch). This is a zero-copy path that preserves all Arrow types (Boolean, integers, floats, Utf8/LargeUtf8, Binary/LargeBinary, Date32/Date64, Timestamp, Decimal128, …) without serializing values to SQL literals.

On the query side, query_stream does not stream incrementally: the DuckDB engine falls back to collect-then-stream, whereas the DataFusionEngine in rhei-datafusion implements true streaming.
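The collect-then-stream fallback for query_stream can be pictured with a small standard-library sketch. It is not the crate's implementation: Batch is a hypothetical stand-in for an Arrow record batch, and collect_then_stream is an illustrative name.

```rust
/// Hypothetical stand-in for an Arrow record batch.
#[derive(Debug, Clone, PartialEq)]
struct Batch(Vec<i64>);

/// Collect-then-stream: drain the whole result set into memory first,
/// then hand the buffered batches out one at a time.
fn collect_then_stream<I>(source: I) -> std::vec::IntoIter<Batch>
where
    I: Iterator<Item = Batch>,
{
    // The query runs to completion here, before the caller sees any batch.
    let buffered: Vec<Batch> = source.collect();
    buffered.into_iter()
}
```

The caller still consumes an iterator of batches, but every batch has already been produced by the time the first one is returned, so peak memory grows with the full result set rather than with a single batch.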

Key types

  • [DuckDbEngine] — the core engine (single write conn + read pool).
  • [SharedDuckDbEngine] — Arc-wrapped newtype for shared ownership.
  • [DuckDbError] — unified error type for this crate.