rhei-sync 1.5.0

CDC sync engine and query router for Rhei
//! CDC sync engine, SQL query router, and CDC-to-DML converter for Rhei.
//!
//! `rhei-sync` is the **glue crate** that ties the OLTP and OLAP halves of
//! Rhei's HTAP pipeline together.  It has three core responsibilities:
//!
//! ## 1. Query routing ([`router`])
//!
//! [`SqlParserRouter`] parses every incoming SQL statement with
//! `sqlparser-rs` (SQLite dialect) and directs it to the right backend:
//!
//! - **OLTP** — writes (INSERT / UPDATE / DELETE), DDL, transactions, and
//!   simple point-lookup SELECTs.
//! - **OLAP** — aggregates, GROUP BY / HAVING, JOINs, window functions, CTEs,
//!   set operations (UNION / INTERSECT / EXCEPT), and subqueries.
//! - **Heuristic fallback** — when the parser cannot handle the SQL (e.g.
//!   SQLite `PRAGMA`), a keyword-scan fallback is applied.  The default is
//!   OLTP (safety-first).
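//!
//! The keyword-scan idea can be illustrated with a minimal, std-only sketch.
//! `Route`, `OLAP_HINTS`, and `route_by_keywords` are hypothetical names used
//! only for this example, and the keyword list is an assumption; the actual
//! fallback in [`router`] may look for different markers.
//!
//! ```rust
//! /// Hypothetical routing target; not the crate's real type.
//! #[derive(Debug, PartialEq)]
//! enum Route {
//!     Oltp,
//!     Olap,
//! }
//!
//! /// Assumed analytical markers (illustrative, not taken from the crate).
//! const OLAP_HINTS: [&str; 8] = [
//!     "GROUP", "HAVING", "JOIN", "UNION", "INTERSECT", "EXCEPT", "OVER", "WITH",
//! ];
//!
//! /// Route a statement by scanning its keywords instead of parsing it.
//! fn route_by_keywords(sql: &str) -> Route {
//!     let upper = sql.to_ascii_uppercase();
//!     // Split on anything that is not an identifier character.
//!     let tokens: Vec<&str> = upper
//!         .split(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
//!         .filter(|t| !t.is_empty())
//!         .collect();
//!     // Only read statements are candidates for OLAP.
//!     let is_read = tokens
//!         .first()
//!         .map_or(false, |t| *t == "SELECT" || *t == "WITH");
//!     let analytical = tokens.iter().any(|t| OLAP_HINTS.contains(t));
//!     if is_read && analytical {
//!         Route::Olap
//!     } else {
//!         // Safety-first default: anything unrecognized goes to OLTP.
//!         Route::Oltp
//!     }
//! }
//! ```
//!
//! Under these assumptions, `PRAGMA journal_mode` and any write statement
//! fall through to `Route::Oltp`, while a `SELECT … GROUP BY` goes to
//! `Route::Olap`.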
//!
//! ## 2. CDC-to-DML conversion ([`converter`], [`temporal_converter`])
//!
//! CDC events (`INSERT` / `UPDATE` / `DELETE`) captured by OLTP triggers are
//! converted into OLAP DML in one of two modes:
//!
//! - **Destructive mode** — mirror semantics: UPDATE overwrites the existing
//!   row, DELETE removes it.
//! - **Temporal mode (SCD Type 2)** — append-only with validity intervals.
//!   Every change produces an INSERT; UPDATEs and DELETEs additionally close
//!   the previous row version by setting `_rhei_valid_to`.  Point-in-time
//!   queries use:
//!   `WHERE _rhei_valid_from <= T AND (_rhei_valid_to IS NULL OR _rhei_valid_to > T)`
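//!
//! The point-in-time predicate above can be restated as a small Rust
//! function. This is a sketch only: `visible_at` is a hypothetical helper,
//! and representing timestamps as `i64` is an assumption (the actual column
//! type is not specified here).
//!
//! ```rust
//! /// Returns true if a row version is visible at time `t`.
//! /// `valid_to == None` models `_rhei_valid_to IS NULL` (still open).
//! fn visible_at(valid_from: i64, valid_to: Option<i64>, t: i64) -> bool {
//!     // The interval is half-open: [valid_from, valid_to).
//!     valid_from <= t && valid_to.map_or(true, |end| end > t)
//! }
//! ```
//!
//! Because the interval is half-open, a version closed at `T` is no longer
//! visible at `T`, so the old and new versions never overlap.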
//!
//! ### Arrow-native bulk INSERT path
//!
//! Consecutive INSERT events for the same table are grouped into a
//! `SyncOp::BatchInsert` (see [`sync_engine`]).  The engine first attempts to
//! build a typed Arrow `RecordBatch` via [`converter::cdc_events_to_batch`] (or
//! [`temporal_converter::cdc_events_to_temporal_batch`] in temporal mode) and
//! calls `OlapEngine::load_arrow`, bypassing SQL parsing entirely.  If the
//! schema contains a type that the Arrow builder does not handle (Date,
//! Timestamp, Decimal, List, Struct, …) the engine falls back to a SQL
//! `INSERT … VALUES` statement.
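//!
//! The grouping step can be sketched without any Arrow dependency. The types
//! below (`Change`, `Op`) and the function `group_inserts` are hypothetical
//! stand-ins for illustration; the crate's own `SyncOp` and event types
//! carry more information, and rows are reduced to a `u32` id here.
//!
//! ```rust
//! /// Hypothetical, simplified CDC event.
//! #[derive(Debug, Clone, PartialEq)]
//! enum Change {
//!     Insert { table: String, row: u32 },
//!     Other { table: String },
//! }
//!
//! /// Hypothetical sync operation, loosely modeled on `SyncOp::BatchInsert`.
//! #[derive(Debug, PartialEq)]
//! enum Op {
//!     BatchInsert { table: String, rows: Vec<u32> },
//!     Single(Change),
//! }
//!
//! /// Merge runs of consecutive INSERTs on the same table into one batch,
//! /// preserving the overall event order.
//! fn group_inserts(events: Vec<Change>) -> Vec<Op> {
//!     let mut ops: Vec<Op> = Vec::new();
//!     for ev in events {
//!         match ev {
//!             Change::Insert { table, row } => {
//!                 // Extend the previous batch if it targets the same table.
//!                 if let Some(Op::BatchInsert { table: t, rows }) = ops.last_mut() {
//!                     if *t == table {
//!                         rows.push(row);
//!                         continue;
//!                     }
//!                 }
//!                 ops.push(Op::BatchInsert { table, rows: vec![row] });
//!             }
//!             // Any non-INSERT event ends the current run.
//!             other => ops.push(Op::Single(other)),
//!         }
//!     }
//!     ops
//! }
//! ```
//!
//! Only adjacent INSERTs are merged in this sketch, so an UPDATE or DELETE
//! between two INSERTs on the same table yields two separate batches and
//! the relative order of events is kept.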
//!
//! ## 3. Background sync loop ([`background`])
//!
//! [`spawn_sync_loop`] starts a Tokio task that calls
//! [`rhei_core::SyncEngine::sync_once`] on a configurable interval.  Shutdown
//! is cooperative: the caller holds a
//! [`tokio_util::sync::CancellationToken`] and cancels it to stop the loop
//! cleanly.
//!
//! A DDL read-lock is acquired before every cycle so that schema mutations
//! (which take the write lock) never interleave with an in-progress sync.
//!
//! ## `SyncMode` semantics
//!
//! | Mode | UPDATE | DELETE |
//! |------|--------|--------|
//! | `Destructive` | overwrites the row | removes the row |
//! | `Temporal` | closes previous version + inserts new version | closes previous version + inserts tombstone |
//!
//! ## SQL injection prevention
//!
//! All table and column identifiers must consist solely of characters from
//! `[A-Za-z0-9_]`; this is checked at `register_table` time.  The converter
//! re-validates them at conversion time for defense-in-depth.
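//!
//! A check of this kind might look like the sketch below.
//! `is_safe_identifier` is a hypothetical name, and rejecting the empty
//! string is an assumption of this example rather than documented behavior.
//!
//! ```rust
//! /// Returns true if `name` contains only ASCII letters, digits, or `_`.
//! fn is_safe_identifier(name: &str) -> bool {
//!     // Assumption: an empty identifier is never valid.
//!     !name.is_empty()
//!         && name
//!             .bytes()
//!             .all(|b| b.is_ascii_alphanumeric() || b == b'_')
//! }
//! ```
//!
//! An allowlist like this rejects quotes, whitespace, semicolons, and
//! non-ASCII characters outright, so no escaping logic is needed when an
//! identifier is later interpolated into generated SQL.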

pub mod background;
pub mod converter;
pub mod error;
pub mod router;
pub mod sync_engine;
pub mod temporal_converter;

pub use background::spawn_sync_loop;
pub use error::SyncError;
pub use router::{HeuristicRouter, SqlParserRouter};
pub use sync_engine::CdcSyncEngine;
pub use temporal_converter::temporalize_schema;