tree-sitter-postgres
A tree-sitter grammar for PostgreSQL, generated directly from PostgreSQL's Bison grammar (gram.y) and keyword list (kwlist.h).
Features
- Current as of PostgreSQL 18 (generated from REL_18_3)
- 727 grammar rules covering the full PostgreSQL SQL syntax
- 494 case-insensitive keywords across all four PG keyword categories
- Correct operator precedence: `1 + 2 * 3` parses as `1 + (2 * 3)`
- PL/pgSQL support via a separate grammar with language injection
- Generated, not hand-written — regenerate for any PostgreSQL version
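Case-insensitive keyword matching is often done in tree-sitter grammars by expanding each letter into a two-character class. The sketch below illustrates that idiom only; the helper name `caseInsensitive` is hypothetical, and whether the generator uses this exact approach is an assumption.

```javascript
// Hypothetical helper (not taken from this repository): build a pattern that
// matches one keyword regardless of letter case.
function caseInsensitive(keyword) {
  const source = keyword
    .split('')
    .map((ch) => {
      const lower = ch.toLowerCase();
      const upper = ch.toUpperCase();
      // Characters without case (for example "_") are kept, escaped if needed.
      return lower === upper
        ? ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
        : `[${lower}${upper}]`;
    })
    .join('');
  return new RegExp(source);
}

console.log(caseInsensitive('select').source); // [sS][eE][lL][eE][cC][tT]
```

In a grammar file, a pattern like this would typically be passed wherever a keyword token is expected, so that `SELECT`, `select`, and `Select` all produce the same token.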
Quick start
&& &&
Regenerating from PostgreSQL source
The grammar is generated from a local PostgreSQL checkout. Set PG_SOURCE_DIR to point at your PostgreSQL source tree:
# Using just (recommended)
# Or run the script directly
&&
Input files
| File | Source |
|---|---|
| `src/backend/parser/gram.y` | Bison grammar (733 rules, 3236 alternatives) |
| `src/include/parser/kwlist.h` | Keyword definitions (494 keywords) |
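As a rough illustration of how the two input files could be located from `PG_SOURCE_DIR`, here is a minimal sketch. The function name `resolveInputs` and the error message are hypothetical; only the environment variable and the two relative paths come from this README.

```javascript
const path = require('node:path');

// Relative locations of the two inputs inside a PostgreSQL source tree.
const INPUT_FILES = {
  grammar: 'src/backend/parser/gram.y',
  keywords: 'src/include/parser/kwlist.h',
};

// Hypothetical helper: resolve absolute input paths from PG_SOURCE_DIR.
function resolveInputs(env = process.env) {
  const root = env.PG_SOURCE_DIR;
  if (!root) {
    throw new Error('PG_SOURCE_DIR is not set; point it at a PostgreSQL source tree');
  }
  return Object.fromEntries(
    Object.entries(INPUT_FILES).map(([name, rel]) => [
      name,
      path.join(root, ...rel.split('/')),
    ]),
  );
}

// Example with an explicit environment object instead of process.env.
console.log(resolveInputs({ PG_SOURCE_DIR: '/tmp/postgres' }));
```

Failing early when the variable is missing avoids a confusing "file not found" error later in the pipeline.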
Generator scripts
| Script | Purpose |
|---|---|
| `script/generate-grammar.js` | Orchestrator: reads PG source, writes `postgres/grammar.js` |
| `script/parse-gram-y.js` | Parses Bison grammar: rules, terminals, precedence, `%prec` annotations |
| `script/parse-kwlist.js` | Parses keyword list into categories |
| `script/codegen.js` | Generates tree-sitter grammar with precedence and optional-rule handling |
| `postgres/harvest-conflicts.sh` | Iteratively discovers GLR conflicts needed by tree-sitter |
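To give a feel for the keyword-parsing step, the sketch below groups `PG_KEYWORD(...)` entries by category. It is not the code in `script/parse-kwlist.js`; the function name, the regular expression, and the four sample entries are illustrative assumptions based on the macro format used in recent versions of `kwlist.h`.

```javascript
// Sample input in the PG_KEYWORD(name, token, category, label) format.
const sample = `
PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("between", BETWEEN, COL_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("left", LEFT, TYPE_FUNC_NAME_KEYWORD, BARE_LABEL)
PG_KEYWORD("select", SELECT, RESERVED_KEYWORD, BARE_LABEL)
`;

// Captures: keyword text, token name, category, label kind.
const KEYWORD_LINE =
  /PG_KEYWORD\(\s*"([^"]+)"\s*,\s*(\w+)\s*,\s*(\w+)\s*,\s*(\w+)\s*\)/g;

// Hypothetical helper: return { CATEGORY: [{ name, token }, ...] }.
function parseKeywordList(text) {
  const byCategory = {};
  for (const [, name, token, category] of text.matchAll(KEYWORD_LINE)) {
    (byCategory[category] ??= []).push({ name, token });
  }
  return byCategory;
}

const keywords = parseKeywordList(sample);
for (const [category, entries] of Object.entries(keywords)) {
  console.log(category, entries.map((entry) => entry.name).join(', '));
}
```

Grouping by category up front lets later stages decide per category where a keyword may double as an identifier.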
Repository structure
    postgres/                 PostgreSQL SQL grammar
      grammar.js              Generated tree-sitter grammar
      src/                    Generated parser (C)
      test/corpus/            Test cases (35 tests)
      known-conflicts.json    GLR conflict pairs
    plpgsql/                  PL/pgSQL grammar
      grammar.js              Hand-written tree-sitter grammar
      src/scanner.c           External scanner for dollar-quoting and keywords
      test/corpus/            Test cases
      queries/                Highlights and injection queries
    script/                   Shared generator code
      generate-grammar.js     SQL grammar orchestrator
      parse-gram-y.js         Bison parser
      parse-kwlist.js         Keyword parser
      codegen.js              Grammar code generator
    bindings/                 Language bindings (Node, Rust, Python, Go, Swift, C)
Design notes
Empty rule handling
Bison's /* EMPTY */ alternatives cannot be directly translated — tree-sitter forbids non-start rules that match the empty string. The generator propagates optionality upward via a fixpoint loop and wraps references with optional() at call sites.
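The idea can be sketched on a toy grammar. Everything below is illustrative: the rule names, the data model (a rule is a list of alternatives, `[]` standing for `/* EMPTY */`), and the rendering are assumptions, not the generator's actual code. Alternatives made up entirely of nullable symbols are expanded into one variant per first required symbol so that no emitted rule matches the empty string.

```javascript
// Toy grammar: rule name -> alternatives; [] is Bison's /* EMPTY */.
const rules = {
  select_stmt: [['SELECT', 'opt_distinct', 'target_list', 'opt_where']],
  opt_distinct: [['DISTINCT'], []],
  opt_where: [['WHERE', 'expr'], []],
  opt_clauses: [['opt_distinct', 'opt_where']], // nullable only via its children
  target_list: [['expr']],
  expr: [['IDENT']],
};

// Fixpoint: a rule is nullable if some alternative contains only nullable symbols.
function findNullable(rules) {
  const nullable = new Set();
  let changed = true;
  while (changed) {
    changed = false;
    for (const [name, alts] of Object.entries(rules)) {
      if (nullable.has(name)) continue;
      if (alts.some((alt) => alt.every((sym) => nullable.has(sym)))) {
        nullable.add(name);
        changed = true;
      }
    }
  }
  return nullable;
}

// Render one alternative as zero or more non-empty tree-sitter expressions.
function renderAlternative(alt, rules, nullable) {
  const show = (sym, required) => {
    const ref = sym in rules ? `$.${sym}` : JSON.stringify(sym);
    return nullable.has(sym) && !required ? `optional(${ref})` : ref;
  };
  const seq = (parts) => (parts.length === 1 ? parts[0] : `seq(${parts.join(', ')})`);
  if (alt.length === 0) return []; // dropped: call sites wrap the reference instead
  if (!alt.every((sym) => nullable.has(sym))) {
    return [seq(alt.map((sym) => show(sym, false)))];
  }
  // All symbols nullable: one variant per choice of first required symbol.
  return alt.map((sym, i) =>
    seq([show(sym, true), ...alt.slice(i + 1).map((s) => show(s, false))]),
  );
}

function renderRule(name, rules, nullable) {
  const variants = rules[name].flatMap((alt) => renderAlternative(alt, rules, nullable));
  return variants.length === 1 ? variants[0] : `choice(${variants.join(', ')})`;
}

const nullable = findNullable(rules);
for (const name of Object.keys(rules)) {
  console.log(`${name}: ${renderRule(name, rules, nullable)}`);
}
```

Here `opt_clauses` has no empty alternative of its own, yet it is found nullable in the fixpoint because both of its children are, which is the upward propagation described above.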
Operator precedence
Binary operators are split into a separate a_expr_prec rule that is resolved entirely by static precedence, without GLR. More complex patterns (IS, IN, BETWEEN, LIKE, subquery operators) stay in a_expr and rely on GLR conflict resolution. The generator emits both prec.left/prec.right, which resolve conflicts when the parser is generated, and prec.dynamic, which chooses between competing parses at runtime.
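A minimal sketch of how Bison associativity declarations might be turned into static precedence levels is shown below. The declaration excerpt is simplified, and the function names, the level numbering, and the emitted text are assumptions for illustration; the only tree-sitter DSL calls referenced are `prec.left`, `prec.right`, and `seq`.

```javascript
// Simplified excerpt in the style of gram.y: later lines bind more tightly.
const declarations = `
%left OR
%left AND
%right NOT
%left '+' '-'
%left '*' '/' '%'
%left '^'
`;

// Hypothetical helper: map each token to { level, assoc }.
function parsePrecedence(text) {
  const table = new Map();
  let level = 0;
  for (const line of text.split('\n')) {
    const match = line.match(/^%(left|right)\s+(.+)$/);
    if (!match) continue;
    level += 1; // each declaration line is one precedence level
    for (const token of match[2].trim().split(/\s+/)) {
      // Strip the quotes around single-character tokens such as '+'.
      table.set(token.replace(/^'(.*)'$/, '$1'), { level, assoc: match[1] });
    }
  }
  return table;
}

// Hypothetical helper: emit the source text of one binary-operator alternative.
function binaryRule(op, table) {
  const { level, assoc } = table.get(op);
  return `prec.${assoc}(${level}, seq($.a_expr_prec, '${op}', $.a_expr_prec))`;
}

const table = parsePrecedence(declarations);
console.log(binaryRule('+', table)); // prec.left(4, seq($.a_expr_prec, '+', $.a_expr_prec))
console.log(binaryRule('*', table)); // prec.left(5, seq($.a_expr_prec, '*', $.a_expr_prec))
```

Because `*` receives a higher level than `+`, a parser built from these alternatives groups `1 + 2 * 3` as `1 + (2 * 3)` without needing a GLR conflict entry.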
PL/pgSQL
PL/pgSQL is implemented as a separate hand-written grammar in plpgsql/ with an external scanner for dollar-quoting and context-sensitive keywords. SQL expressions and statements within PL/pgSQL blocks are delegated to the postgres grammar via tree-sitter language injection (plpgsql/queries/injections.scm).
License
BSD 3-Clause