SQL Parser – recursive-descent parser that converts a token stream into an AST.
The central type is Parser, which consumes tokens produced by the
Tokenizer and builds a tree of Expression
nodes covering the full SQL grammar: queries, DML, DDL, set operations,
window functions, CTEs, and dialect-specific extensions for 30+ databases.
The simplest entry point is Parser::parse_sql, which tokenizes and
parses a SQL string in one call.
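The tokenize-then-parse flow can be sketched with a toy recursive-descent parser. The `Token`, `Expression`, and `Parser` types below are illustrative stand-ins for a tiny `SELECT`-only grammar, not this crate's actual API:

```rust
// Hypothetical miniature of the tokenize -> parse pipeline; the real
// Parser/Expression types cover the full SQL grammar.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Select,
    Number(i64),
    Comma,
}

#[derive(Debug, PartialEq)]
enum Expression {
    Select { projections: Vec<Expression> },
    Literal(i64),
}

// Toy tokenizer: the real Tokenizer handles quoting, comments, dialects, ...
fn tokenize(sql: &str) -> Result<Vec<Token>, String> {
    sql.replace(',', " , ")
        .split_whitespace()
        .map(|w| match w {
            "SELECT" => Ok(Token::Select),
            "," => Ok(Token::Comma),
            n => n
                .parse::<i64>()
                .map(Token::Number)
                .map_err(|_| format!("unexpected token: {n}")),
        })
        .collect()
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    // One-call entry point, mirroring the tokenize-and-parse convenience.
    fn parse_sql(sql: &str) -> Result<Expression, String> {
        let mut p = Parser { tokens: tokenize(sql)?, pos: 0 };
        p.parse_select()
    }

    fn next(&mut self) -> Option<Token> {
        let t = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        t
    }

    fn parse_select(&mut self) -> Result<Expression, String> {
        if self.next() != Some(Token::Select) {
            return Err("expected SELECT".into());
        }
        // Recursive descent: each grammar rule is a method that consumes tokens.
        let mut projections = vec![self.parse_literal()?];
        while self.tokens.get(self.pos) == Some(&Token::Comma) {
            self.pos += 1;
            projections.push(self.parse_literal()?);
        }
        Ok(Expression::Select { projections })
    }

    fn parse_literal(&mut self) -> Result<Expression, String> {
        match self.next() {
            Some(Token::Number(n)) => Ok(Expression::Literal(n)),
            other => Err(format!("expected literal, got {other:?}")),
        }
    }
}

fn main() {
    let ast = Parser::parse_sql("SELECT 1, 2").unwrap();
    println!("{ast:?}");
}
```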
§Static configuration maps
This module also exports several LazyLock<HashSet<TokenType>> constants
(ported from Python sqlglot’s parser.py) that classify token types:
- TYPE_TOKENS – all tokens that represent SQL data types
- NESTED_TYPE_TOKENS – parametric types like ARRAY, MAP, STRUCT
- RESERVED_TOKENS – tokens that cannot be used as unquoted identifiers
- NO_PAREN_FUNCTIONS / NO_PAREN_FUNCTION_NAMES – zero-argument functions that may be written without parentheses (e.g. CURRENT_DATE)
- DB_CREATABLES – object kinds valid after CREATE (TABLE, VIEW, etc.)
- SUBQUERY_PREDICATES – tokens introducing subquery predicates (ANY, ALL, EXISTS)
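The classification pattern behind these constants can be illustrated as follows. The `TokenType` enum here is a small mock (the crate's real enum is far larger); only the `LazyLock<HashSet<_>>` shape matches the exported statics:

```rust
use std::collections::HashSet;
use std::sync::LazyLock;

// Mock token-type enum for illustration; the real TokenType covers the
// full SQL grammar.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum TokenType {
    Bit,
    Boolean,
    Array,
    Map,
    Select,
}

// Same shape as the exported constants: a lazily initialized set the
// parser consults to classify the current token.
static TYPE_TOKENS: LazyLock<HashSet<TokenType>> = LazyLock::new(|| {
    HashSet::from([
        TokenType::Bit,
        TokenType::Boolean,
        TokenType::Array,
        TokenType::Map,
    ])
});

fn is_type_token(t: TokenType) -> bool {
    TYPE_TOKENS.contains(&t)
}

fn main() {
    assert!(is_type_token(TokenType::Array));
    assert!(!is_type_token(TokenType::Select));
    println!("ok");
}
```

`LazyLock` (stable since Rust 1.80) builds each set once, on first access, mirroring the module-level set literals in Python sqlglot's parser.py.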
Structs§
- Parser – Recursive-descent SQL parser that converts a token stream into an AST.
- ParserConfig – Configuration for the SQL Parser.
Statics§
- AGGREGATE_TYPE_TOKENS – Tokens for aggregate function types (ClickHouse). Python: AGGREGATE_TYPE_TOKENS = {TokenType.AGGREGATEFUNCTION, …}
- DB_CREATABLES – Object types that can be created with CREATE. Python: DB_CREATABLES = {TokenType.DATABASE, TokenType.SCHEMA, …}
- ENUM_TYPE_TOKENS – Tokens that represent enum types. Python: ENUM_TYPE_TOKENS = {TokenType.DYNAMIC, TokenType.ENUM, …}
- NESTED_TYPE_TOKENS – Tokens that can have nested type parameters. Python: NESTED_TYPE_TOKENS = {TokenType.ARRAY, TokenType.LIST, …}
- NO_PAREN_FUNCTIONS – Functions that can be called without parentheses; maps a TokenType to the function name used during generation. Python: NO_PAREN_FUNCTIONS = {TokenType.CURRENT_DATE: exp.CurrentDate, …}
- NO_PAREN_FUNCTION_NAMES – String names that can act as no-paren functions; these are often tokenized as Var/Identifier rather than specific TokenTypes.
- RESERVED_TOKENS – Tokens that cannot be used as identifiers without quoting; typically structural keywords that affect query parsing.
- SIGNED_TO_UNSIGNED_TYPE_TOKEN – Maps signed types to their unsigned counterparts. Python: SIGNED_TO_UNSIGNED_TYPE_TOKEN = {TokenType.BIGINT: TokenType.UBIGINT, …}
- STRUCT_TYPE_TOKENS – Tokens that represent struct-like types. Python: STRUCT_TYPE_TOKENS = {TokenType.FILE, TokenType.NESTED, TokenType.OBJECT, …}
- SUBQUERY_PREDICATES – Tokens that introduce subquery predicates. Python: SUBQUERY_PREDICATES = {TokenType.ANY: exp.Any, …}
- TYPE_TOKENS – All tokens that represent data types. Python: TYPE_TOKENS = {TokenType.BIT, TokenType.BOOLEAN, …}
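The map-shaped statics (NO_PAREN_FUNCTIONS, SIGNED_TO_UNSIGNED_TYPE_TOKEN) follow the same lazy-initialization pattern but as lookup tables rather than sets. A sketch with a mock `TokenType` (the variant names and table contents here are illustrative, not the crate's full tables):

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

// Mock token-type enum; the real TokenType has many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum TokenType {
    BigInt,
    UBigInt,
    Int,
    UInt,
}

// Same shape as SIGNED_TO_UNSIGNED_TYPE_TOKEN: a lazily built lookup
// table consulted when parsing e.g. `BIGINT UNSIGNED`.
static SIGNED_TO_UNSIGNED_TYPE_TOKEN: LazyLock<HashMap<TokenType, TokenType>> =
    LazyLock::new(|| {
        HashMap::from([
            (TokenType::BigInt, TokenType::UBigInt),
            (TokenType::Int, TokenType::UInt),
        ])
    });

// Returns the unsigned counterpart of a signed type token, if one exists.
fn to_unsigned(t: TokenType) -> Option<TokenType> {
    SIGNED_TO_UNSIGNED_TYPE_TOKEN.get(&t).copied()
}

fn main() {
    assert_eq!(to_unsigned(TokenType::BigInt), Some(TokenType::UBigInt));
    assert_eq!(to_unsigned(TokenType::UInt), None);
    println!("ok");
}
```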