Parser infrastructure for the query language.
§Architecture
This parser produces a lossless concrete syntax tree (CST) via Rowan’s green tree builder. Key design decisions are borrowed from rust-analyzer, rnix-parser, and taplo:
- Zero-copy parsing: tokens carry spans, text sliced only when building tree nodes
- Trivia buffering: whitespace/comments collected, then attached as leading trivia
- Checkpoint-based wrapping: retroactively wrap nodes for quantifiers `*+?`
- Explicit recovery sets: per-production sets determine when to bail vs consume diagnostics
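Checkpoint-based wrapping can be illustrated with a small, std-only sketch. The `Builder`, `Tree`, and `parse_items` names below are hypothetical stand-ins rather than this crate's or Rowan's types; only the `checkpoint` / `start_node_at` method names are chosen to mirror Rowan's `GreenNodeBuilder`. The idea is to record a position before parsing an item, and only wrap it in a quantifier node once a trailing `*`, `+`, or `?` is actually seen.

```rust
// Hypothetical, simplified tree type; the real parser builds Rowan green nodes.
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Node(&'static str, Vec<Tree>),
    Token(String),
}

/// Index into the current parent's child list, taken before parsing an item.
#[derive(Clone, Copy)]
struct Checkpoint(usize);

struct Builder {
    // Stack of open nodes: (kind, children collected so far).
    stack: Vec<(&'static str, Vec<Tree>)>,
}

impl Builder {
    fn new(root: &'static str) -> Self {
        Builder { stack: vec![(root, Vec::new())] }
    }

    fn token(&mut self, text: &str) {
        self.stack.last_mut().unwrap().1.push(Tree::Token(text.to_string()));
    }

    fn checkpoint(&self) -> Checkpoint {
        Checkpoint(self.stack.last().unwrap().1.len())
    }

    /// Opens a node that retroactively adopts every child added since `cp`.
    fn start_node_at(&mut self, cp: Checkpoint, kind: &'static str) {
        let adopted = self.stack.last_mut().unwrap().1.split_off(cp.0);
        self.stack.push((kind, adopted));
    }

    /// Closes the innermost open (non-root) node and attaches it to its parent.
    fn finish_node(&mut self) {
        let (kind, children) = self.stack.pop().unwrap();
        self.stack.last_mut().unwrap().1.push(Tree::Node(kind, children));
    }

    fn finish(mut self) -> Tree {
        let (kind, children) = self.stack.pop().unwrap();
        Tree::Node(kind, children)
    }
}

/// Parses a flat run of atoms, each optionally followed by `*`, `+`, or `?`.
fn parse_items(tokens: &[&str]) -> Tree {
    let mut b = Builder::new("Root");
    let mut i = 0;
    while i < tokens.len() {
        let cp = b.checkpoint(); // remember where this atom starts
        b.token(tokens[i]);
        i += 1;
        if i < tokens.len() && matches!(tokens[i], "*" | "+" | "?") {
            // The quantifier is only known after the atom, so wrap it now.
            b.start_node_at(cp, "Quantified");
            b.token(tokens[i]);
            i += 1;
            b.finish_node();
        }
    }
    b.finish()
}
```

Calling `parse_items(&["a", "b", "*"])` leaves `a` as a direct child of the root and wraps only `b` and `*` in a `Quantified` node, without any lookahead before `b` is parsed.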
§Recovery Strategy
The parser is resilient — it always produces a tree. Recovery follows these rules:
- Unknown tokens get wrapped in `SyntaxKind::Error` nodes and consumed
- Missing expected tokens emit a diagnostic but don’t consume (parent may handle)
- Recovery sets define “synchronization points” per production
- On recursion limit, remaining input goes into a single Error node
However, fuel exhaustion (exec_fuel, recursion_fuel) returns an actual error immediately.
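The first three recovery rules can be sketched with a toy grammar, `list := '(' Ident* ')'`. Everything here (`Kind`, `Cst`, `ToyParser`, the grammar itself) is a hypothetical illustration and not this crate's `Parser`; fuel limits and the recursion-limit rule are not modeled.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { LParen, RParen, Ident, Garbage }

#[derive(Debug, PartialEq)]
enum Cst {
    Node(&'static str, Vec<Cst>),
    Token(Kind),
}

struct ToyParser<'a> {
    tokens: &'a [Kind],
    pos: usize,
    diagnostics: Vec<String>,
}

impl<'a> ToyParser<'a> {
    fn new(tokens: &'a [Kind]) -> Self {
        ToyParser { tokens, pos: 0, diagnostics: Vec::new() }
    }

    fn peek(&self) -> Option<Kind> {
        self.tokens.get(self.pos).copied()
    }

    fn bump(&mut self) -> Cst {
        let kind = self.tokens[self.pos];
        self.pos += 1;
        Cst::Token(kind)
    }

    /// Unknown token: report it, wrap it in an Error node, and consume it.
    fn error_node(&mut self) -> Cst {
        self.diagnostics.push(format!("unexpected token at {}", self.pos));
        let token = self.bump();
        Cst::Node("Error", vec![token])
    }

    /// Missing expected token: report it, but leave the input untouched
    /// so the parent production gets a chance to handle it.
    fn expect(&mut self, kind: Kind, out: &mut Vec<Cst>) {
        if self.peek() == Some(kind) {
            out.push(self.bump());
        } else {
            self.diagnostics.push(format!("expected {:?} at {}", kind, self.pos));
        }
    }

    /// list := '(' Ident* ')'; `recovery` holds this production's sync points.
    fn parse_list(&mut self, recovery: &[Kind]) -> Cst {
        let mut children = Vec::new();
        self.expect(Kind::LParen, &mut children);
        while let Some(kind) = self.peek() {
            match kind {
                Kind::Ident => children.push(self.bump()),
                Kind::RParen => break,
                k if recovery.contains(&k) => break, // bail out to the parent
                _ => children.push(self.error_node()),
            }
        }
        self.expect(Kind::RParen, &mut children);
        Cst::Node("List", children)
    }

    /// root := list*; always returns a tree, whatever the input.
    fn parse_root(&mut self) -> Cst {
        let mut children = Vec::new();
        while let Some(kind) = self.peek() {
            if kind == Kind::LParen {
                // A new '(' is a synchronization point for an unclosed list.
                children.push(self.parse_list(&[Kind::LParen]));
            } else {
                children.push(self.error_node());
            }
        }
        Cst::Node("Root", children)
    }
}
```

For the input `( id <garbage> ( id )`, this sketch wraps the garbage token in an `Error` node inside the first list, bails out of that list at the second `(`, records a missing `)` without consuming anything, and then parses the second list normally.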
Modules§
- ast
- Typed AST wrappers over CST nodes.
Structs§
- AltExpr
- Anchor
- AnonymousNode
- Anonymous node: string literal (`"+"`) or wildcard (`_`). Maps from CST `Str` or `Wildcard`.
- Branch
- CapturedExpr
- Def
- FieldExpr
- NamedNode
- NegatedField
- NodePredicate
- ParseResult
- Parser
- Trivia tokens are buffered and flushed when starting a new node.
- QuantifiedExpr
- Ref
- RegexLiteral
- Root
- SeqExpr
- Token
- Zero-copy token: kind + span, text retrieved via `token_text` when needed.
- Type
Enums§
- AltKind
- Whether an alternation uses tagged or untagged branches.
- Expr
- Expression: any pattern that can appear in the tree.
- PredicateOp
- Predicate operator for node text filtering.
- PredicateValue
- Predicate value: either a string or a regex pattern.
- SeqItem
- Either an expression or an anchor in a sequence.
- SyntaxKind
- All token and node kinds. Tokens first, then nodes, then `__LAST` sentinel. `#[repr(u16)]` enables safe transmute in `kind_from_raw`.
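The `__LAST` sentinel pattern described for `SyntaxKind` can be sketched as follows. The variant list is abbreviated and mostly hypothetical (only `Error`, `Root`, and `__LAST` are named in this documentation), and the signature of `kind_from_raw` is an assumption: the real function may well take Rowan's raw kind type rather than a bare `u16`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
#[allow(non_camel_case_types, dead_code)]
enum SyntaxKind {
    // Tokens first (hypothetical subset).
    ParenOpen = 0,
    ParenClose,
    Id,
    // Then nodes.
    Error,
    Root,
    // Sentinel: must stay last and never appears in a tree.
    __LAST,
}

/// Converts a raw discriminant back into a `SyntaxKind`.
/// Panics if `raw` does not name a real kind.
fn kind_from_raw(raw: u16) -> SyntaxKind {
    assert!(raw < SyntaxKind::__LAST as u16, "raw kind out of range: {}", raw);
    // SAFETY: `SyntaxKind` is `#[repr(u16)]` with contiguous discriminants
    // starting at 0, and `raw` was just checked to be below the sentinel.
    unsafe { std::mem::transmute::<u16, SyntaxKind>(raw) }
}
```

The bounds check against the sentinel is what makes the transmute sound: without it, an arbitrary `u16` could produce an invalid enum value, which is undefined behavior.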
Functions§
- is_truly_empty_scope
- Checks if expression is a truly empty scope (sequence/alternation with no children). Used to distinguish `{ } @x` (empty struct) from `{(expr) @_} @x` (Node capture).
- lex
- Tokenizes source into a vector of span-based tokens.
- token_src
- Extracts token text with source lifetime.
- token_text
- Retrieves the text slice for a token. O(1) slice into source.
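A minimal sketch of how span-based tokens and an O(1) text lookup might fit together is shown below. The signatures of `lex` and `token_text`, the `Kind` enum, and the character classes are all assumptions made for illustration; the crate's actual definitions are not shown in this documentation.

```rust
use std::ops::Range;

// Hypothetical token classes; the real lexer emits `SyntaxKind` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind { Ident, Punct, Whitespace }

/// Zero-copy token: a kind plus a byte span, with no owned text.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Token {
    kind: Kind,
    span: Range<usize>,
}

fn classify(c: char) -> Kind {
    if c.is_whitespace() {
        Kind::Whitespace
    } else if c.is_alphanumeric() || c == '_' {
        Kind::Ident
    } else {
        Kind::Punct
    }
}

/// Tokenizes `source` into span-based tokens that cover every byte.
fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.char_indices().peekable();
    while let Some((start, c)) = chars.next() {
        let kind = classify(c);
        let mut end = start + c.len_utf8();
        // Identifiers and whitespace extend over runs; punctuation is one char.
        if kind != Kind::Punct {
            while let Some(&(i, next)) = chars.peek() {
                if classify(next) != kind {
                    break;
                }
                end = i + next.len_utf8();
                chars.next();
            }
        }
        tokens.push(Token { kind, span: start..end });
    }
    tokens
}

/// O(1) slice into the source; the result borrows from `source`, not the token.
fn token_text<'src>(source: &'src str, token: &Token) -> &'src str {
    &source[token.span.clone()]
}
```

Because whitespace is kept as tokens and every byte belongs to exactly one span, concatenating `token_text` over all tokens reproduces the input, which is the property a lossless CST relies on.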
Type Aliases§
- SyntaxNode
- Type aliases for Rowan types parameterized by our language.
- SyntaxToken