Skip to main content

Module lexer

Module lexer 

Source
Expand description

Lexer for the PG-dialect subset that SPG accepts.

v0.2 token stream is value-only — no source spans yet. Errors do report the byte offset where the offending construct started. Identifiers are ASCII case-folded to lower-case (matches PG when un-quoted). Quoted identifiers ("...") preserve case; "" is an embedded quote. String literals ('...') follow PG single-quote convention with '' as the embedded quote. The lexer accepts but does not interpret E-strings or dollar-quoted strings — those land in a later milestone.

Structs§

LexError

Enums§

LexErrorKind
Token

Functions§

tokenize
Tokenize input into a Vec<Token> ending in Token::Eof, with PG string semantics (backslash is a literal byte inside '…'; '' is the only escape).
tokenize_with
v7.22 (round-13 T3) — dialect-aware tokenizer entry. With backslash_escapes = true, plain '…' strings honour MySQL / pre-9.1-PG backslash escapes (\' \\ \n …, the same decode the E'…' form uses). mysqldump ALWAYS emits \'-escaped data sections, and pg_dump ALWAYS announces PG semantics via SET standard_conforming_strings = on — the engine flips this flag off/on from those deterministic session signals.