// rhei_duckdb/lib.rs

//! DuckDB OLAP backend for the Rhei HTAP engine.
//!
//! This crate provides [`DuckDbEngine`], an implementation of
//! [`rhei_core::OlapEngine`] backed by DuckDB. It is feature-gated at the
//! workspace level by `duckdb-backend`.
//!
//! # Position in the HTAP stack
//!
//! ```text
//! HtapEngine (rhei)
//!   └── OlapBackend (rhei-olap)
//!         └── DuckDbEngine  ← this crate
//! ```
//!
//! # Thread-safety model and `unsafe impl Send + Sync`
//!
//! `duckdb::Connection` is marked `!Send` by the Rust binding because the
//! binding author did not audit the thread-safety of every internal pointer
//! held by the underlying DuckDB C++ object. However, DuckDB itself is safe
//! to access from multiple threads **as long as each connection is accessed by
//! only one thread at a time**. [`DuckDbEngine`] enforces this invariant by
//! wrapping every connection in a `std::sync::Mutex`; no code in this crate
//! ever touches a connection outside a `Mutex` guard.
//!
//! Because the `Mutex` provides the required exclusivity guarantee and all
//! connection access is confined to `tokio::task::spawn_blocking` closures
//! (which run on a thread-pool thread, not the async executor thread),
//! `DuckDbEngine` and `SharedDuckDbEngine` implement `Send` and `Sync` via
//! `unsafe impl`. This is the **only place in the Rhei workspace** that uses
//! `unsafe impl` for a trait; every other crate declares
//! `#![forbid(unsafe_code)]`.
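//!
//! As a rough illustration of this pattern (not the crate's actual code), the
//! sketch below uses a hypothetical `FakeConn` that is neither `Send` nor
//! `Sync`, a hypothetical `Engine` wrapper, and plain `std::thread` in place
//! of `tokio::task::spawn_blocking`. It assumes, as described above, that the
//! wrapped value has no thread affinity and only needs exclusive access.
//!
//! ```rust
//! use std::marker::PhantomData;
//! use std::sync::{Arc, Mutex};
//! use std::thread;
//!
//! // Hypothetical stand-in for a connection type that is neither `Send` nor
//! // `Sync`; the raw-pointer marker opts it out of both auto traits.
//! struct FakeConn {
//!     executed: u64,
//!     _not_send: PhantomData<*const ()>,
//! }
//!
//! impl FakeConn {
//!     fn execute(&mut self) -> u64 {
//!         self.executed += 1;
//!         self.executed
//!     }
//! }
//!
//! // Hypothetical engine: the connection is reachable only through the mutex.
//! struct Engine {
//!     conn: Mutex<FakeConn>,
//! }
//!
//! // SAFETY (for this sketch only): `FakeConn` has no thread affinity, and
//! // every access goes through `conn.lock()`, so at most one thread touches
//! // it at any time.
//! unsafe impl Send for Engine {}
//! unsafe impl Sync for Engine {}
//!
//! impl Engine {
//!     fn new() -> Self {
//!         let conn = FakeConn { executed: 0, _not_send: PhantomData };
//!         Self { conn: Mutex::new(conn) }
//!     }
//!
//!     fn execute(&self) -> u64 {
//!         self.conn.lock().expect("connection mutex poisoned").execute()
//!     }
//! }
//!
//! // Runs `per_thread` statements on each of `threads` workers and returns
//! // the total number of statements the shared connection has seen.
//! fn run_concurrently(threads: usize, per_thread: usize) -> u64 {
//!     let engine = Arc::new(Engine::new());
//!     let handles: Vec<_> = (0..threads)
//!         .map(|_| {
//!             let engine = Arc::clone(&engine);
//!             thread::spawn(move || {
//!                 for _ in 0..per_thread {
//!                     engine.execute();
//!                 }
//!             })
//!         })
//!         .collect();
//!     for handle in handles {
//!         handle.join().expect("worker thread panicked");
//!     }
//!     let total = engine.conn.lock().expect("connection mutex poisoned").executed;
//!     total
//! }
//!
//! fn main() {
//!     assert_eq!(run_concurrently(4, 100), 400);
//! }
//! ```
//!
//! Without the two `unsafe impl` lines, `Arc<Engine>` could not be moved into
//! the spawned threads, because `Mutex<T>` is `Send` and `Sync` only when
//! `T: Send`.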
//!
//! # Connection pool layout
//!
//! | Connection | Count | Purpose |
//! |---|---|---|
//! | `write_conn` | 1 | DDL + DML (INSERT/UPDATE/DELETE/CREATE TABLE …) |
//! | `read_pool` | N (default 4) | Concurrent SELECT via round-robin |
//!
//! All connections in the read pool are obtained with `Connection::try_clone()`
//! from the initial write connection, so they share the same underlying DuckDB
//! database and benefit from DuckDB's MVCC for concurrent reads.
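//!
//! The round-robin selection described above could look roughly like the
//! following. `ReadConn`, `ReadPool`, and `with_conn` are hypothetical names
//! used for illustration; the real pool holds cloned `duckdb::Connection`s
//! and may differ in detail.
//!
//! ```rust
//! use std::sync::atomic::{AtomicUsize, Ordering};
//! use std::sync::Mutex;
//!
//! // Hypothetical stand-in for a pooled read connection.
//! struct ReadConn {
//!     id: usize,
//! }
//!
//! // Hypothetical pool: one mutex per connection plus a shared cursor.
//! struct ReadPool {
//!     conns: Vec<Mutex<ReadConn>>,
//!     next: AtomicUsize,
//! }
//!
//! impl ReadPool {
//!     fn new(size: usize) -> Self {
//!         assert!(size > 0, "pool needs at least one connection");
//!         let conns = (0..size).map(|id| Mutex::new(ReadConn { id })).collect();
//!         Self { conns, next: AtomicUsize::new(0) }
//!     }
//!
//!     // Picks the next slot in rotation and runs `f` while holding its lock.
//!     fn with_conn<R>(&self, f: impl FnOnce(&mut ReadConn) -> R) -> R {
//!         let slot = self.next.fetch_add(1, Ordering::Relaxed) % self.conns.len();
//!         let mut guard = self.conns[slot]
//!             .lock()
//!             .expect("read connection mutex poisoned");
//!         f(&mut *guard)
//!     }
//! }
//!
//! fn main() {
//!     let pool = ReadPool::new(4);
//!     let picked: Vec<usize> = (0..6).map(|_| pool.with_conn(|c| c.id)).collect();
//!     assert_eq!(picked, vec![0, 1, 2, 3, 0, 1]);
//! }
//! ```
//!
//! Keeping one mutex per connection, rather than one around the whole pool,
//! lets up to N reads proceed at once while still giving each connection a
//! single user at a time.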
//!
//! # Arrow ingestion path
//!
//! `DuckDbEngine::load_arrow` uses DuckDB's native `Appender` API
//! (`Appender::append_record_batch`). This is a zero-copy path that preserves
//! all Arrow types (`Boolean`, integers, floats, `Utf8`/`LargeUtf8`,
//! `Binary`/`LargeBinary`, `Date32`/`Date64`, `Timestamp`, `Decimal128`, …)
//! without serializing values to SQL literals.
//!
//! On the query side, the `DataFusionEngine` in `rhei-datafusion` implements
//! true streaming, whereas the DuckDB engine falls back to collect-then-stream
//! for `query_stream`: the full result is collected first and then exposed as
//! a stream.
//!
//! # Key types
//!
//! * [`DuckDbEngine`]: the core engine (single write conn + read pool).
//! * [`SharedDuckDbEngine`]: `Arc`-wrapped newtype for shared ownership.
//! * [`DuckDbError`]: unified error type for this crate.

pub mod engine;
pub mod error;

pub use engine::{DuckDbEngine, SharedDuckDbEngine};
pub use error::DuckDbError;