polyglot-sql-function-catalogs
Optional dialect function-catalog data for polyglot-sql semantic validation.
This crate intentionally does not depend on polyglot-sql. It exposes feature-gated
function lists through a small sink interface so the core crate can pull them in
at compile time without dependency cycles.
Features
- dialect-clickhouse: include ClickHouse function signatures.
- dialect-duckdb: include DuckDB function signatures.
- all-dialects: currently aliases all available dialect features.
Performance Model
- This crate emits static function lists into a caller-provided sink.
- No runtime globals are created in this crate.
- Compile-time feature flags decide which dialect lists are built.
- In polyglot-sql, enabled lists are loaded once into a LazyLock catalog.
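A minimal sketch of this model, assuming a simplified sink trait and entry type. The method `register`, the `EmbeddedCatalog` type, the `demo` dialect, and the sample entries are hypothetical stand-ins, not this crate's actual definitions. It shows a static list being pushed into a caller-provided sink and the consumer building its catalog once behind a `std::sync::LazyLock`:

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

/// Hypothetical shape of one catalog entry; the real type may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FunctionSignature {
    pub min_arity: usize,
    pub max_arity: Option<usize>, // None means variadic
}

/// Hypothetical sink trait: the catalog crate only writes into it.
pub trait CatalogSink {
    fn register(&mut self, dialect: &'static str, name: &'static str, sig: FunctionSignature);
}

/// Stand-in for a feature-gated static function list.
static DEMO_FUNCTIONS: &[(&str, FunctionSignature)] = &[
    ("lower", FunctionSignature { min_arity: 1, max_arity: Some(1) }),
    ("concat", FunctionSignature { min_arity: 1, max_arity: None }),
];

/// Stand-in for `register_enabled_catalogs`: no globals, only sink writes.
pub fn register_enabled_catalogs<S: CatalogSink>(sink: &mut S) {
    for &(name, sig) in DEMO_FUNCTIONS {
        sink.register("demo", name, sig);
    }
}

/// Consumer-side storage that implements the sink.
#[derive(Default)]
pub struct EmbeddedCatalog {
    entries: HashMap<&'static str, HashMap<&'static str, Vec<FunctionSignature>>>,
}

impl CatalogSink for EmbeddedCatalog {
    fn register(&mut self, dialect: &'static str, name: &'static str, sig: FunctionSignature) {
        self.entries
            .entry(dialect)
            .or_default()
            .entry(name)
            .or_default()
            .push(sig);
    }
}

impl EmbeddedCatalog {
    /// All overloads registered for `name` in `dialect`, if any.
    pub fn overloads(&self, dialect: &str, name: &str) -> Option<&[FunctionSignature]> {
        self.entries.get(dialect)?.get(name).map(|v| v.as_slice())
    }
}

/// Built on first access, then reused for every later lookup.
static CATALOG: LazyLock<EmbeddedCatalog> = LazyLock::new(|| {
    let mut catalog = EmbeddedCatalog::default();
    register_enabled_catalogs(&mut catalog);
    catalog
});

pub fn embedded_catalog() -> &'static EmbeddedCatalog {
    &CATALOG
}
```

Because the global lives on the consumer side, the catalog crate itself stays free of runtime state, and the registration cost is paid at most once per process.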
Usage
With polyglot-sql compile-time wiring (recommended):
[dependencies]
polyglot-sql = { version = "...", features = ["function-catalog-clickhouse"] }
Then run schema validation with type checks:
use ;
let schema = ValidationSchema ;
let options = SchemaValidationOptions ;
let result = validate_with_schema;
assert!;
The core crate auto-injects embedded catalogs when:
- check_types is enabled, and
- no custom function_catalog is provided in options.
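The selection rule above can be sketched as a small function. The types here are simplified stand-ins, not the core crate's real definitions: `FunctionCatalog`, its `label` field, and `effective_catalog` are hypothetical. The sketch also assumes that a custom catalog is ignored when check_types is off.

```rust
/// Hypothetical stand-in for a function catalog.
#[derive(Debug, PartialEq)]
pub struct FunctionCatalog {
    pub label: &'static str,
}

/// Simplified stand-in for the options type named in this README.
#[derive(Default)]
pub struct SchemaValidationOptions {
    pub check_types: bool,
    pub function_catalog: Option<FunctionCatalog>,
}

/// Picks the catalog used for validation:
/// - none when type checks are disabled (assumption),
/// - the caller's catalog when one is provided,
/// - otherwise the embedded catalog, if one was compiled in.
pub fn effective_catalog<'a>(
    options: &'a SchemaValidationOptions,
    embedded: Option<&'a FunctionCatalog>,
) -> Option<&'a FunctionCatalog> {
    if !options.check_types {
        return None;
    }
    options.function_catalog.as_ref().or(embedded)
}
```

Passing `None` for `embedded` models a build without any function-catalog feature enabled.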
What Catalog Validation Checks
Function catalogs are used during schema/type validation (not parser syntax validation).
When check_types is enabled in SchemaValidationOptions, core validation uses the catalog to check:
- Function name presence per dialect:
  - Unknown function name -> E202 (E_UNKNOWN_FUNCTION)
- Function arity / overloads:
  - Name exists but argument count matches no signature -> E203 (E_INVALID_FUNCTION_ARITY)
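The two checks can be sketched as follows, assuming each signature carries a minimum arity and an optional maximum. `CatalogError`, `check_call`, `demo_catalog`, and the sample entries are hypothetical names for illustration; only the error codes come from this README.

```rust
use std::collections::HashMap;

/// Hypothetical entry shape: arity bounds only.
#[derive(Debug, Clone, Copy)]
pub struct FunctionSignature {
    pub min_arity: usize,
    pub max_arity: Option<usize>, // None means variadic
}

impl FunctionSignature {
    /// True when `argc` falls inside this overload's arity range.
    pub fn accepts(&self, argc: usize) -> bool {
        match self.max_arity {
            Some(max) => argc >= self.min_arity && argc <= max,
            None => argc >= self.min_arity,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum CatalogError {
    UnknownFunction,
    InvalidFunctionArity,
}

impl CatalogError {
    /// Error codes as listed in this README.
    pub fn code(&self) -> &'static str {
        match self {
            CatalogError::UnknownFunction => "E202",
            CatalogError::InvalidFunctionArity => "E203",
        }
    }
}

/// Name lookup first, then a scan over all overloads for a matching arity.
pub fn check_call(
    catalog: &HashMap<&str, Vec<FunctionSignature>>,
    name: &str,
    argc: usize,
) -> Result<(), CatalogError> {
    let overloads = catalog.get(name).ok_or(CatalogError::UnknownFunction)?;
    if overloads.iter().any(|sig| sig.accepts(argc)) {
        Ok(())
    } else {
        Err(CatalogError::InvalidFunctionArity)
    }
}

/// Illustrative data only; not taken from any real dialect catalog.
pub fn demo_catalog() -> HashMap<&'static str, Vec<FunctionSignature>> {
    let mut catalog = HashMap::new();
    catalog.insert(
        "round",
        vec![
            FunctionSignature { min_arity: 1, max_arity: Some(1) },
            FunctionSignature { min_arity: 2, max_arity: Some(2) },
        ],
    );
    catalog.insert("concat", vec![FunctionSignature { min_arity: 1, max_arity: None }]);
    catalog
}
```

A call passes as soon as any one overload accepts the argument count, which is why a name can exist and still yield E203.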
Catalog entries currently define:
- min_arity
- max_arity (None means variadic)
- multiple overloads per function name
- dialect-level casing behavior + optional per-function casing override
Catalog entries do not currently define:
- per-argument data types
- return types
- coercion rules
- named/optional parameter semantics
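As an illustration of the casing rule listed above (a dialect-level default with an optional per-function override), here is a hedged sketch. The variant names `Sensitive` and `Insensitive` and both helper functions are assumptions; the real FunctionNameCase type may be shaped differently.

```rust
/// Hypothetical variants; the crate's actual enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FunctionNameCase {
    Sensitive,
    Insensitive,
}

/// A per-function override wins; otherwise the dialect default applies.
pub fn effective_case(
    dialect_default: FunctionNameCase,
    per_function: Option<FunctionNameCase>,
) -> FunctionNameCase {
    per_function.unwrap_or(dialect_default)
}

/// Compares a catalog name with a name from the query under the given rule.
/// Insensitive matching here is ASCII-only, which is an assumption.
pub fn names_match(catalog_name: &str, query_name: &str, case: FunctionNameCase) -> bool {
    match case {
        FunctionNameCase::Sensitive => catalog_name == query_name,
        FunctionNameCase::Insensitive => catalog_name.eq_ignore_ascii_case(query_name),
    }
}
```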
Advanced manual integration (custom sink) is also available via:
- CatalogSink
- register_enabled_catalogs
- FunctionSignature
- FunctionNameCase
Feature Mapping Across Crates
This crate's features are consumed by polyglot-sql compile-time features:
- polyglot-sql/function-catalog-clickhouse -> polyglot-sql-function-catalogs/dialect-clickhouse
- polyglot-sql/function-catalog-duckdb -> polyglot-sql-function-catalogs/dialect-duckdb
- polyglot-sql/function-catalog-all-dialects -> polyglot-sql-function-catalogs/all-dialects
Bindings forward those same core features:
- polyglot-sql-wasm/function-catalog-clickhouse
- polyglot-sql-wasm/function-catalog-duckdb
- polyglot-sql-wasm/function-catalog-all-dialects
- polyglot-sql-ffi/function-catalog-clickhouse
- polyglot-sql-ffi/function-catalog-duckdb
- polyglot-sql-ffi/function-catalog-all-dialects
- polyglot-sql-python/function-catalog-clickhouse
- polyglot-sql-python/function-catalog-duckdb
- polyglot-sql-python/function-catalog-all-dialects
Build examples:
Adding A New Dialect Function List
This crate is intentionally feature-gated so large function datasets do not inflate every binary.
1. Add a new dialect source file
Create src/<dialect>.rs and expose register<S: CatalogSink>(sink: &mut S).
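use crate::CatalogSink;
pub fn register<S: CatalogSink>(sink: &mut S) { /* push this dialect's function signatures into `sink` */ }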
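A fuller sketch of what such a dialect file might contain, under stated assumptions: the sink method `register`, the `FunctionSignature` fields, the `exampledb` dialect, and the listed functions are all hypothetical and only illustrate the shape of a static list being pushed into a generic sink.

```rust
/// Hypothetical entry shape; in the real crate this comes from the crate root.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FunctionSignature {
    pub min_arity: usize,
    pub max_arity: Option<usize>, // None means variadic
}

/// Hypothetical sink trait; in the real crate this comes from the crate root.
pub trait CatalogSink {
    fn register(&mut self, dialect: &'static str, name: &'static str, sig: FunctionSignature);
}

// ---- contents of a hypothetical src/exampledb.rs ----

const DIALECT: &str = "exampledb";

/// (name, min_arity, max_arity); a repeated name is an extra overload.
static FUNCTIONS: &[(&str, usize, Option<usize>)] = &[
    ("abs", 1, Some(1)),
    ("round", 1, Some(1)),
    ("round", 2, Some(2)),
    ("coalesce", 1, None),
];

/// Pushes every entry of the static list into the caller's sink.
pub fn register<S: CatalogSink>(sink: &mut S) {
    for &(name, min_arity, max_arity) in FUNCTIONS {
        sink.register(DIALECT, name, FunctionSignature { min_arity, max_arity });
    }
}

// ---- a tiny sink used only to observe what was registered ----

#[derive(Default)]
pub struct RecordingSink {
    pub seen: Vec<(&'static str, &'static str, FunctionSignature)>,
}

impl CatalogSink for RecordingSink {
    fn register(&mut self, dialect: &'static str, name: &'static str, sig: FunctionSignature) {
        self.seen.push((dialect, name, sig));
    }
}
```

Keeping the data in a `static` slice means the list lives in the binary's read-only data and is only walked when a consumer asks for it.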
2. Wire the module in src/lib.rs
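One way this wiring could look, shown as a hedged sketch rather than the crate's actual src/lib.rs: each dialect module and its registration call sit behind the matching feature flag, so disabled dialects are never compiled. The module bodies, the sink method, and the placeholder function names are hypothetical.

```rust
/// Hypothetical sink trait, reduced to the minimum for this sketch.
pub trait CatalogSink {
    fn register(&mut self, dialect: &'static str, name: &'static str);
}

// Each dialect module exists only when its feature is enabled.
#[cfg(feature = "dialect-clickhouse")]
mod clickhouse {
    use super::CatalogSink;
    pub fn register<S: CatalogSink>(sink: &mut S) {
        sink.register("clickhouse", "placeholder_fn"); // illustrative entry
    }
}

#[cfg(feature = "dialect-duckdb")]
mod duckdb {
    use super::CatalogSink;
    pub fn register<S: CatalogSink>(sink: &mut S) {
        sink.register("duckdb", "placeholder_fn"); // illustrative entry
    }
}

/// Calls the register function of every dialect compiled into this build.
pub fn register_enabled_catalogs<S: CatalogSink>(sink: &mut S) {
    #[cfg(feature = "dialect-clickhouse")]
    clickhouse::register(sink);
    #[cfg(feature = "dialect-duckdb")]
    duckdb::register(sink);
    // Keeps the parameter "used" when no dialect feature is enabled.
    let _ = sink;
}

/// Small sink that records (dialect, name) pairs, for demonstration.
#[derive(Default)]
pub struct VecSink(pub Vec<(&'static str, &'static str)>);

impl CatalogSink for VecSink {
    fn register(&mut self, dialect: &'static str, name: &'static str) {
        self.0.push((dialect, name));
    }
}
```

With no dialect feature enabled, `register_enabled_catalogs` compiles to a no-op and the sink stays empty.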
3. Add feature flags in Cargo.toml
[features]
default = []
dialect-clickhouse = []
dialect-duckdb = []
all-dialects = ["dialect-clickhouse", "dialect-duckdb"]
4. (Optional) Add extraction tooling
Put source-specific extraction scripts in:
tools/<dialect>/extract_functions.py
Current example paths:
- tools/clickhouse/extract_functions.py
- tools/duckdb/extract_functions.py
ClickHouse extraction command (uses chdb via uv run, no local install required):
DuckDB extraction command (requires Python package duckdb, installed on demand):
Useful optional flags:
- --exclude-internal: drop rows where internal = true
- --function-type <type> (repeatable): filter to specific function_type values
How This Crate Is Wired To Other Crates
polyglot-sql (core)
- Core has an optional dependency on this crate.
- Core features:
  - function-catalog-clickhouse
  - function-catalog-all-dialects
- When one of these features is enabled, core builds an embedded catalog once and auto-uses it for schema validation type checks (unless caller provides a custom catalog).
- Runtime override hook still exists via SchemaValidationOptions.function_catalog.
polyglot-sql-wasm
- WASM forwards function-catalog features to polyglot-sql:
  - function-catalog-clickhouse
  - function-catalog-all-dialects
- Uses core schema validation, so catalog checks apply when check_types is enabled.
- If disabled, behavior is unchanged.
Other bindings (ffi, python, sdk)
- ffi and python can enable core catalog features at compile time via pass-through features.
- Today, their exposed validate APIs are syntax-only, so catalog checks are effectively dormant until schema-aware validation APIs are exposed there.
- sdk consumes WASM and therefore follows WASM feature behavior.
- No direct dependency on this crate is required in those bindings.