Shared safety gate for the raw-SQL endpoint (POST /api/v1/sql).
Raw SQL is a much larger attack surface than the structured /query
endpoint, so every statement is parsed and validated before it is
handed to a backend engine. The same gate runs for DuckDB and
DataFusion, giving both backends identical safety semantics — and
keeping the “which tables may this query touch?” policy in one place.
Guarantees enforced by validate:
- exactly one statement, and it is a read-only SELECT/WITH … SELECT,
- every referenced table is a registered dataset: no file-reading table functions (read_parquet, read_csv, …), no unknown tables,
- no file-reading scalar functions (read_text, read_blob, …),
- at most max_datasets distinct datasets are referenced. Phase 1 passes 1, enforcing the single-dataset rule; raising this bound is all that’s needed to allow cross-dataset joins later.
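The allowlist and max_datasets guarantees can be illustrated with a minimal sketch. It is not the crate's implementation: GateError and check_tables are hypothetical names, the table names are assumed to have been extracted from a parsed statement already, and the case-insensitive match is an assumption modeled on the DuckDB-style matching described for canonicalize_identifiers. The statement-count, read-only, and function checks are not covered here.

```rust
use std::collections::BTreeSet;

/// Hypothetical error type for this sketch; the crate's real error type
/// is not shown in these docs.
#[derive(Debug, PartialEq)]
pub enum GateError {
    UnknownTable(String),
    TooManyDatasets { found: usize, max: usize },
}

/// Check table names (assumed to be already extracted from a parsed
/// statement) against the registered datasets and the `max_datasets` bound.
/// Returns the distinct registered datasets the query touches.
pub fn check_tables(
    referenced: &[&str],
    registered: &[&str],
    max_datasets: usize,
) -> Result<BTreeSet<String>, GateError> {
    let mut seen = BTreeSet::new();
    for &name in referenced {
        // Assumption: names match case-insensitively, and the registered
        // spelling is treated as the canonical one.
        match registered.iter().find(|r| r.eq_ignore_ascii_case(name)) {
            Some(canonical) => {
                seen.insert(canonical.to_string());
            }
            None => return Err(GateError::UnknownTable(name.to_string())),
        }
    }
    // Count distinct datasets, so repeating the same table is not penalized.
    if seen.len() > max_datasets {
        return Err(GateError::TooManyDatasets {
            found: seen.len(),
            max: max_datasets,
        });
    }
    Ok(seen)
}
```

With max_datasets set to 1, a query that references events twice passes, while one that references both events and users is rejected; raising the bound to 2 would admit the join without any other change to the check.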
CTE-defined names are tracked per query scope and excluded from the
dataset allowlist check, so WITH t AS (SELECT … FROM events) SELECT …
is accepted (it still only touches events).
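Per-scope CTE tracking can be sketched as follows. This is a toy model under stated assumptions, not the crate's code: Query is a hypothetical stand-in for a real parser's AST, collect_datasets is an invented helper, CTEs are assumed to be non-recursive, and names are compared in lowercase.

```rust
use std::collections::HashSet;

/// Toy query shape for illustration only; a real gate would walk the
/// AST produced by a SQL parser.
pub struct Query {
    /// CTEs in declaration order: (name, body).
    pub ctes: Vec<(String, Query)>,
    /// Names that appear in FROM/JOIN position.
    pub from: Vec<String>,
}

/// Collect the names that must pass the dataset allowlist, skipping any
/// name bound by a CTE that is visible in the current scope.
pub fn collect_datasets(q: &Query, outer: &HashSet<String>, out: &mut Vec<String>) {
    // Each query starts from its parent's scope and extends a private copy,
    // so CTE names never leak out to sibling or enclosing queries.
    let mut scope = outer.clone();
    for (name, body) in &q.ctes {
        // A CTE body sees earlier CTEs but not itself (non-recursive assumption).
        collect_datasets(body, &scope, out);
        scope.insert(name.to_lowercase());
    }
    for table in &q.from {
        if !scope.contains(&table.to_lowercase()) {
            out.push(table.clone());
        }
    }
}
```

For WITH t AS (SELECT … FROM events) SELECT … FROM t, the walk reports only events: t is bound in the outer query's scope, so it never reaches the allowlist check.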
Structs
- ValidatedSql: A validated, ready-to-execute SQL query.
Functions
- canonicalize_identifiers: Rewrite references to registered tables and their columns so they match case-insensitively, the way DuckDB does.
- validate: Validate sql for the raw-SQL endpoint.