1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
//! CDC sync engine, SQL query router, and CDC-to-DML converter for Rhei.
//!
//! `rhei-sync` is the **glue crate** that ties the OLTP and OLAP halves of
//! Rhei's HTAP pipeline together. It has three core responsibilities:
//!
//! ## 1. Query routing ([`router`])
//!
//! [`SqlParserRouter`] parses every incoming SQL statement with
//! `sqlparser-rs` (SQLite dialect) and directs it to the right backend:
//!
//! - **OLTP** — writes (INSERT / UPDATE / DELETE), DDL, transactions, and
//! simple point-lookup SELECTs.
//! - **OLAP** — aggregates, GROUP BY / HAVING, JOINs, window functions, CTEs,
//! set operations (UNION / INTERSECT / EXCEPT), and subqueries.
//! - **Heuristic fallback** — when the parser cannot handle the SQL (e.g.
//! SQLite `PRAGMA`), a keyword-scan fallback is applied. The default is
//! OLTP (safety-first).
//!
//! ## 2. CDC-to-DML conversion ([`converter`], [`temporal_converter`])
//!
//! CDC events (`INSERT` / `UPDATE` / `DELETE`) captured from OLTP triggers are
//! converted into OLAP DML:
//!
//! - **Destructive mode** — mirror semantics: UPDATE overwrites the existing
//! row, DELETE removes it.
//! - **Temporal mode (SCD Type 2)** — append-only with validity intervals.
//! Every change produces an INSERT; UPDATEs and DELETEs additionally close
//! the previous row version by setting `_rhei_valid_to`. Point-in-time
//! queries use:
//! `WHERE _rhei_valid_from <= T AND (_rhei_valid_to IS NULL OR _rhei_valid_to > T)`
//!
//! ### Arrow-native bulk INSERT path
//!
//! Consecutive INSERT events for the same table are grouped into a
//! `SyncOp::BatchInsert` (see [`sync_engine`]). The engine first attempts to
//! build a typed Arrow `RecordBatch` via [`converter::cdc_events_to_batch`] (or
//! [`temporal_converter::cdc_events_to_temporal_batch`] in temporal mode) and
//! calls `OlapEngine::load_arrow`, bypassing SQL parsing entirely. If the
//! schema contains a type that the Arrow builder does not handle (Date,
//! Timestamp, Decimal, List, Struct, …) the engine falls back to a SQL
//! `INSERT … VALUES` statement.
//!
//! ## 3. Background sync loop ([`background`])
//!
//! [`spawn_sync_loop`] starts a Tokio task that calls
//! [`rhei_core::SyncEngine::sync_once`] on a configurable interval. Shutdown
//! is cooperative: the caller holds a
//! [`tokio_util::sync::CancellationToken`] and cancels it to stop the loop
//! cleanly.
//!
//! A DDL read-lock is acquired before every cycle so that schema mutations
//! (which take the write lock) never interleave with an in-progress sync.
//!
//! ## `SyncMode` semantics
//!
//! | Mode | UPDATE | DELETE |
//! |------|--------|--------|
//! | `Destructive` | overwrites the row | removes the row |
//! | `Temporal` | closes previous version + inserts new version | closes previous version + inserts tombstone |
//!
//! ## SQL injection prevention
//!
//! All table and column identifiers are validated against `[A-Za-z0-9_]` at
//! `register_table` time. The converter re-validates them at conversion time
//! for defense-in-depth.
pub use spawn_sync_loop;
pub use SyncError;
pub use ;
pub use CdcSyncEngine;
pub use temporalize_schema;