Module cst

Lossless concrete syntax tree (CST) for Beancount.

This module is phase 1 of the parser-CST migration tracked in #1262. It lives inside rustledger-parser rather than in a new crate. Phases 2-5 will move the existing AST-style parser internals to delegate to this module and eventually delete the old code paths.

§Phase 1 surface

§Trivia attachment policy (phase 2.0)

Phase 1 emits a flat tree, where trivia attachment is a non-question. Phase 2.1+ introduces structural nodes (DIRECTIVE, then POSTING / AMOUNT / COST_SPEC / META_ENTRY / …) that wrap token runs. Phase 2.0 pins the Directive-Terminator Rule: every directive owns its content tokens PLUS its terminating NEWLINE.

Short version:

  • Same-line trailing trivia (whitespace + EOL comment before the terminator) lives INSIDE the directive.
  • Inter-directive leading trivia (blank lines, mid-file comment blocks) lives INSIDE the NEXT directive.
  • File-leading trivia (before the first content token) is a direct child of SOURCE_FILE.
  • File-trailing trivia (after the file-final directive’s terminator) is also a direct child of SOURCE_FILE.

Fully symmetric: every directive has the same children shape (optional leading + content + optional same-line trailing + terminator NEWLINE). No EOF special case.
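The four rules above can be illustrated with a small, self-contained sketch. It is not the crate's implementation (phase 2.0 ships no production helper), and `Kind`, `Item`, and `group` are hypothetical names invented for this example. It also assumes a toy model in which every directive fits on one line, which is not true of real Beancount transactions with postings.

```rust
/// Toy token kinds (hypothetical); the real `SyntaxKind` has many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Content,
    Whitespace,
    Comment,
    Newline,
}

/// One child of a hypothetical SOURCE_FILE root.
#[derive(Debug, PartialEq, Eq)]
enum Item<'a> {
    /// File-leading or file-trailing trivia, attached directly to the root.
    Trivia(&'a str),
    /// Leading trivia + content + same-line trailing trivia + terminating NEWLINE.
    Directive(Vec<&'a str>),
}

/// Groups a flat token stream according to the Directive-Terminator Rule.
/// Assumption: each line that contains a `Content` token is one directive.
fn group<'a>(tokens: &[(Kind, &'a str)]) -> Vec<Item<'a>> {
    fn is_content(t: &(Kind, &str)) -> bool {
        t.0 == Kind::Content
    }

    // No content at all: every token is root-level trivia.
    let Some(first) = tokens.iter().position(is_content) else {
        return tokens.iter().map(|t| Item::Trivia(t.1)).collect();
    };
    let last = tokens.iter().rposition(is_content).unwrap();

    // The final directive ends just past its terminating NEWLINE.
    // If the input has no such NEWLINE, this toy falls back to end of input.
    let end = tokens[last..]
        .iter()
        .position(|t| t.0 == Kind::Newline)
        .map_or(tokens.len(), |i| last + i + 1);

    // File-leading trivia stays at the root.
    let mut items: Vec<Item<'a>> =
        tokens[..first].iter().map(|t| Item::Trivia(t.1)).collect();

    let mut current = Vec::new();
    let mut has_content = false;
    for &(kind, text) in &tokens[first..end] {
        // Blank lines and comments seen before any content accumulate here,
        // so they become leading trivia of the NEXT directive.
        current.push(text);
        has_content |= kind == Kind::Content;
        if kind == Kind::Newline && has_content {
            items.push(Item::Directive(std::mem::take(&mut current)));
            has_content = false;
        }
    }
    if !current.is_empty() {
        items.push(Item::Directive(current));
    }

    // File-trailing trivia stays at the root.
    items.extend(tokens[end..].iter().map(|t| Item::Trivia(t.1)));
    items
}
```

Feeding it a header comment, two single-line directives separated by a blank line and a comment block, and a final blank line yields root-level trivia, two `Directive` items with identical child shape, and one trailing root-level `Trivia`. Concatenating all token texts in order reproduces the input, which is the lossless property the tree shapes are meant to preserve.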

Phase 2.0 ships NO production helper — the policy is enforced via tree-shape regression tests in cst::trivia (private submodule). Phase 2.1’s structured parser writes its own streaming, state-aware predicate that produces trees matching those shapes. If the parser drifts, the regression tests fire. See the trivia module rustdoc for the full spec, rationale, and recursive-application notes for phase 2.1’s grammar.

Modules§

ast
Typed AST wrappers over the lossless CST.

Enums§

BeancountLanguage
Tag enum for rowan::Language. Zero variants — only used as a type-level marker.
SyntaxKind
Every kind of token or node that can appear in a Beancount CST.

Functions§

lossless_kind_tokens
Tokenize source losslessly and emit (SyntaxKind, Range) entries covering every byte exactly once.
parse_flat
Parse source to a flat lossless CST.
parse_structured
Parse source to a structured lossless CST.
parse_via_cst
Parse Beancount source via the CST and produce the AST-shaped ParseResult. This is the implementation behind crate::parse; the public entry delegates here unconditionally.
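
The "every byte exactly once" guarantee described for lossless_kind_tokens can be checked with a small tiling predicate. The sketch below is illustrative only: `covers_every_byte_once` is a hypothetical helper, and it assumes the entries arrive in source order with `std::ops::Range<usize>` byte offsets, which this page does not confirm. The kind type is left generic so the check does not depend on `SyntaxKind`.

```rust
use std::ops::Range;

/// Returns true when `entries`, taken in order, tile `0..len`
/// with no gaps, no overlaps, and no inverted ranges.
/// Hypothetical helper; not part of the crate's API.
fn covers_every_byte_once<K>(len: usize, entries: &[(K, Range<usize>)]) -> bool {
    let mut cursor = 0;
    for (_, range) in entries {
        // Each range must begin exactly where the previous one ended.
        if range.start != cursor || range.end < range.start {
            return false;
        }
        cursor = range.end;
    }
    // The last range must end at the end of the source.
    cursor == len
}
```

A caller could pass `source.len()` together with the tokenizer's output and assert the result in a test; a gap, an overlap, or a short final range all make the predicate return `false`.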

Type Aliases§

SyntaxElement
rowan::SyntaxElement (token-or-node) specialized to BeancountLanguage.
SyntaxNode
rowan::SyntaxNode specialized to BeancountLanguage.
SyntaxToken
rowan::SyntaxToken specialized to BeancountLanguage.