Struct ParseResult 

#[non_exhaustive]
pub struct ParseResult {
    pub directives: Vec<Spanned<Directive>>,
    pub options: Vec<(String, String, Span)>,
    pub includes: Vec<(String, Span)>,
    pub plugins: Vec<(String, Option<String>, Span)>,
    pub comments: Vec<Spanned<String>>,
    pub errors: Vec<ParseError>,
    pub warnings: Vec<ParseWarning>,
    pub currency_occurrences: Vec<Spanned<Currency>>,
    pub account_occurrences: Vec<Spanned<Account>>,
    pub has_leading_bom: bool,
    pub syntax_root: GreenNode,
    pub alignment: PostingAlignment,
}

Result of parsing a beancount file.

Marked #[non_exhaustive] so external consumers must go through parse rather than constructing the struct by literal. Future field additions (e.g., diagnostic metadata, source-map back-references) then land as non-breaking changes.

Fields (Non-exhaustive)§

This struct is marked as non-exhaustive
Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
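
As a rough illustration of the wildcard rule, the sketch below destructures a stand-in struct with a trailing `..`. The two-field `ParseResult` here and the `summarize` helper are hypothetical simplifications, not the crate's real definitions; the restriction itself only bites in external crates, so this single-file version merely shows the pattern that keeps compiling when fields are added.

```rust
// Hypothetical stand-in: two fields with simplified types, not the real struct.
#[non_exhaustive]
#[derive(Debug)]
pub struct ParseResult {
    pub directives: Vec<String>,
    pub errors: Vec<String>,
}

/// Bind only the fields that are needed. External crates must write the
/// trailing `..` for a `#[non_exhaustive]` struct; it also means this code
/// is unaffected when new fields are added later.
pub fn summarize(result: &ParseResult) -> (usize, usize) {
    let ParseResult { directives, errors, .. } = result;
    (directives.len(), errors.len())
}
```

Reading fields through a reference (`result.directives`) is equally unaffected; only literal construction, exhaustive patterns, and struct update syntax are ruled out for external crates.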
§directives: Vec<Spanned<Directive>>

Successfully parsed directives.

§options: Vec<(String, String, Span)>

Options found in the file.

§includes: Vec<(String, Span)>

Include directives found.

§plugins: Vec<(String, Option<String>, Span)>

Plugin directives found.

§comments: Vec<Spanned<String>>

Standalone comments found in the file.

§errors: Vec<ParseError>

Parse errors encountered.

§warnings: Vec<ParseWarning>

Deprecation warnings.

§currency_occurrences: Vec<Spanned<Currency>>

Every Currency token the parser consumed, paired with its interned value and source-byte range.

Source-position-aware tooling (LSP rename / references / document-highlight) walks this list to produce edits, locations, and highlights without resorting to string search of the source, which produces false positives in comments, payee strings, account-name segments, etc. The order matches source order because the parser fills it as tokens are consumed (and the parser is strictly forward-advancing, including on error recovery).
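
A minimal sketch of how a rename might consume such a list, assuming simplified stand-ins: `Span`, `Spanned` (with `value` and `span` fields), and `rename_currency` are hypothetical names, and a plain `String` stands in for the interned `Currency`. Because occurrences are in source order, applying edits back to front keeps earlier byte offsets valid.

```rust
// Hypothetical stand-ins for the crate's span types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Spanned<T> {
    pub value: T,
    pub span: Span,
}

/// Rewrite every tracked occurrence equal to `old`, using exact spans
/// instead of searching the source text.
pub fn rename_currency(
    source: &str,
    occurrences: &[Spanned<String>],
    old: &str,
    new: &str,
) -> String {
    let mut out = source.to_string();
    // Iterate in reverse so that a replacement never shifts a span that
    // has not been applied yet.
    for occ in occurrences.iter().rev() {
        if occ.value == old {
            out.replace_range(occ.span.start..occ.span.end, new);
        }
    }
    out
}
```

A comment that happens to contain the same text is left alone, since it never appears in the occurrence list.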

Error-recovery contract. Tokens consumed during a directive that ultimately fails to parse remain in this list. Rationale: the lexer’s classification of a token as a Currency is independent of whether the surrounding syntax is valid, and tooling that wants to rename or highlight a currency the user typed should follow that classification. Do not “clean up” partially-consumed entries after a parse failure - that would hide real currency identifiers from downstream tooling while the user is mid-edit.

file_id is always 0 in parser output. The parser processes one file at a time and doesn’t know its own file id. The loader sets the correct id on each entry via .with_file_id(n) when assembling a multi-file SourceMap, the same way it does for directives. Per-file consumers (today: every LSP handler) can ignore file_id; future multi-file consumers must remember to thread it through.
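
The loader step described above could look roughly like the following. Everything here is an assumption made for illustration: the flattened `Spanned` layout, the `u32` id type, and the `assemble` function are hypothetical; only the `.with_file_id(n)` method name comes from the text.

```rust
// Hypothetical stand-in; the real Spanned and its file-id type may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Spanned<T> {
    pub value: T,
    pub start: usize,
    pub end: usize,
    pub file_id: u32,
}

impl<T> Spanned<T> {
    /// Return the same entry tagged with the given file id.
    pub fn with_file_id(mut self, id: u32) -> Self {
        self.file_id = id;
        self
    }
}

/// Merge per-file occurrence lists (each with file_id == 0, as the parser
/// emits them) into one list whose entries carry the index of their file.
pub fn assemble(files: Vec<Vec<Spanned<String>>>) -> Vec<Spanned<String>> {
    files
        .into_iter()
        .enumerate()
        .flat_map(|(id, entries)| {
            entries.into_iter().map(move |e| e.with_file_id(id as u32))
        })
        .collect()
}
```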

§account_occurrences: Vec<Spanned<Account>>

Every Account token the parser consumed, paired with its interned value and source-byte range.

Mirrors Self::currency_occurrences for the account shape. The CST conversion (walk_descendants_once) tracks every ACCOUNT token whose ancestors do NOT include an ERROR_NODE. The LSP rename handler (phase 5.4) walks this list to emit exact-span edits without resorting to per-directive substring search, which used to produce false positives wherever an account-name fragment appeared inside a payee string, a STRING-typed metadata value, or a comment. ACCOUNT-typed metadata values (e.g. counterparty: Assets:Bank) DO produce an ACCOUNT token at the lexer level and ARE included in this list - so a rename of Assets:Bank correctly rewrites that metadata value too.

Migration status (#1262 phase 5.4). Only the LSP rename handler currently consumes this index. The sibling handlers references, document_highlight, and linked_editing still walk the typed AST with substring search for accounts (see those modules’ rustdoc); migrating them to consume account_occurrences is tracked as a phase 5.5+ follow-up.

Error-recovery contract. Two notions of “failing directive” need to be distinguished:

  • A directive that PARSES SYNTACTICALLY but whose typed-AST conversion errors (e.g., crate::ParseErrorKind::InvalidBookingMethod on an open Assets:Bank "GARBAGE"). The ACCOUNT node is intact in the CST and NOT inside an ERROR_NODE. The token IS tracked - tooling can still rename it during the mid-edit state.
  • A directive so garbled that the CST wraps the region in an ERROR_NODE. The ACCOUNT token is inside an ERROR_NODE and is NOT tracked. This is deliberate - the recovery boundary is fuzzy and including such tokens would surface as confusing rename hits inside garbage source.
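
The two cases can be sketched with a toy tree. The `Kind` enum, the `Node` type, and `tracked_accounts` are hypothetical; the real conversion walks a rowan CST via walk_descendants_once, which this sketch does not reproduce. It only illustrates the rule that an ACCOUNT token is tracked unless one of its ancestors is an error node.

```rust
// Hypothetical miniature CST; not the crate's syntax kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Root,
    Directive,
    ErrorNode,
    Account,
    Other,
}

#[derive(Debug, Clone)]
pub struct Node {
    pub kind: Kind,
    pub text: String,
    pub children: Vec<Node>,
}

impl Node {
    pub fn leaf(kind: Kind, text: &str) -> Node {
        Node { kind, text: text.to_string(), children: Vec::new() }
    }

    pub fn branch(kind: Kind, children: Vec<Node>) -> Node {
        Node { kind, text: String::new(), children }
    }
}

/// Collect ACCOUNT token texts in source order, skipping every subtree
/// rooted at an error node.
pub fn tracked_accounts(node: &Node, out: &mut Vec<String>) {
    match node.kind {
        // Never descend into an error node: its tokens are not tracked.
        Kind::ErrorNode => {}
        Kind::Account => out.push(node.text.clone()),
        _ => {
            for child in &node.children {
                tracked_accounts(child, out);
            }
        }
    }
}
```

A directive whose typed conversion fails but whose CST is intact corresponds to an ordinary `Directive` branch here, so its account is still collected.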

§Limitations

The list is undifferentiated: declarations (from open/close/balance/pad/note/document) and references (from posting accounts and ACCOUNT-typed metadata) are mixed together. There is no equivalent of the commodity_declaration_spans helper used for currencies (the account case has six declaration directive shapes vs. the single Commodity shape, so no symmetric helper exists yet). A future go-to-definition migration will need either a re-walk over directives or an additional account_declarations: Vec<Span> field.

file_id is always 0 in parser output - same loader contract as currency_occurrences.

§has_leading_bom: bool

true iff the parsed source began with a UTF-8 BOM (strict byte 0).

This is the single source of truth for downstream consumers that need to know whether to preserve a leading BOM on output (notably format_source). Do NOT inspect the source bytes directly; the parser already handled the strip/detect logic in one place (crate::bom::strip_leading) and stored the result here. Reproducing the check elsewhere is exactly the contract-drift class of bug this field was introduced to eliminate.

Span coordinates in this ParseResult are in the original source frame - i.e., if has_leading_bom is true, spans already include the 3-byte BOM offset and index directly into the caller’s source.
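
A sketch of the division of labour, under stated assumptions: `ParsedStub`, `parse_stub`, and `with_original_bom` are hypothetical, and the stub only records the first whitespace-delimited token. The producer detects the BOM once and reports spans in the original frame; the consumer reads the flag and never looks at the source bytes itself.

```rust
const BOM: &str = "\u{FEFF}"; // encodes as the 3 bytes EF BB BF in UTF-8

// Hypothetical stand-in for the parts of ParseResult discussed here.
pub struct ParsedStub {
    pub has_leading_bom: bool,
    /// Byte range of the first token, in the ORIGINAL source frame.
    pub first_token: (usize, usize),
}

/// Producer side: the only place that inspects the leading bytes.
pub fn parse_stub(source: &str) -> ParsedStub {
    let (body, has_leading_bom) = match source.strip_prefix(BOM) {
        Some(rest) => (rest, true),
        None => (source, false),
    };
    // Shift spans back into the original frame (3 bytes when a BOM was present).
    let shift = source.len() - body.len();
    let end = body.find(char::is_whitespace).unwrap_or(body.len());
    ParsedStub { has_leading_bom, first_token: (shift, shift + end) }
}

/// Consumer side: preserve the BOM on output by consulting the flag only.
pub fn with_original_bom(formatted_body: &str, parsed: &ParsedStub) -> String {
    if parsed.has_leading_bom {
        format!("{}{}", BOM, formatted_body)
    } else {
        formatted_body.to_string()
    }
}
```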

§syntax_root: GreenNode

The lossless CST root the converter walked to produce everything above. Stored as a rowan::GreenNode, which is Send + Sync and reference-counted internally, so an Arc<ParseResult> (the shape the LSP caches per document) shares this handle across handler invocations without re-parsing.

Prefer Self::syntax_node over reading this field directly. The method is the supported entry point: it returns a SyntaxNode (the cursor-API view), keeps the rowan::GreenNode type name out of consumer code, and shields callers from minor rowan upgrades that touch the GreenNode shape. The field is public for two reasons — the exhaustive destructure in [__baseline_canonical_payload] needs to bind it, and Arc::clone-style sharing patterns benefit from direct access — but downstream code should reach for the method.

Byte-offset frame: post-BOM. The CST is built from the BOM-stripped source: the parser strips a strict-byte-0 UTF-8 BOM (see crate::bom::strip_leading) and feeds the stripped slice to parse_structured. So every TextRange / TextSize reachable through this tree is in the post-BOM byte frame: an offset of 0 here corresponds to byte BOM_LEN == 3 of the original source when Self::has_leading_bom is true.

This differs from the typed-AST fields above (Self::directives, Self::currency_occurrences, Self::account_occurrences, Self::errors, …), whose spans the converter pre-shifts back into the original-source frame so downstream consumers can index directly into the caller’s source bytes.

CST-walking consumers must apply the equivalent shift themselves: subtract BOM_LEN when translating an original-source offset down to a CST offset (e.g., cst.token_at_offset(orig - BOM_LEN)), and add BOM_LEN back when emitting an original-source position from a TextRange. The LSP selection_range handler does this; see its rustdoc and the bom_prefixed_source_does_not_shift_ranges regression test.
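
The frame translation amounts to a pair of small helpers. The function names `orig_to_cst` and `cst_to_orig` are hypothetical, and plain `usize` offsets stand in for rowan's TextSize; the value BOM_LEN == 3 is taken from the text above.

```rust
/// Length in bytes of the UTF-8 BOM (EF BB BF).
pub const BOM_LEN: usize = 3;

/// Translate an original-source byte offset into the post-BOM CST frame.
/// Returns None for an offset that falls inside the BOM itself, which has
/// no counterpart in the CST.
pub fn orig_to_cst(orig: usize, has_leading_bom: bool) -> Option<usize> {
    if has_leading_bom {
        orig.checked_sub(BOM_LEN)
    } else {
        Some(orig)
    }
}

/// Translate a CST byte offset back into the original-source frame.
pub fn cst_to_orig(cst: usize, has_leading_bom: bool) -> usize {
    if has_leading_bom {
        cst + BOM_LEN
    } else {
        cst
    }
}
```

Using `checked_sub` rather than a bare subtraction is a design choice of this sketch: it avoids an underflow panic when a client sends a position inside the first three bytes.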

Canonical-payload exclusion. This field is deliberately NOT fed into [__baseline_canonical_payload]. The green node is a redundant cache of the source bytes; the existing directives / currency_occurrences / account_occurrences / errors fields already capture everything downstream consumers track for drift detection. Adding the green node’s Debug output would multiply the fingerprint size without surfacing any new drift signal. The corresponding assert_field_in_hash arm is also intentionally absent in tests/corpus_baseline.rs. A negative-form test (__canonical_payload_excludes_syntax_root in this file) pins the exclusion: it confirms that mutating syntax_root while every other field is equal does NOT change the canonical payload bytes.
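
The shape of such an exclusion, and of a negative-form test for it, can be sketched as follows. `ParsedStub` and `canonical_payload` are hypothetical simplifications; the real payload function and its encoding are not shown in this page.

```rust
// Hypothetical stand-in with one cached field (`syntax_root`) that is
// derived from the same source as the other fields.
#[derive(Debug, Clone)]
pub struct ParsedStub {
    pub directives: Vec<String>,
    pub errors: Vec<String>,
    pub syntax_root: String,
}

/// Build the bytes used for drift detection. The destructure names every
/// field, so adding a field forces an explicit include-or-exclude decision.
pub fn canonical_payload(parsed: &ParsedStub) -> String {
    let ParsedStub { directives, errors, syntax_root: _ } = parsed;
    format!("{:?}|{:?}", directives, errors)
}
```

Mutating only `syntax_root` leaves the payload unchanged, while mutating `directives` changes it; that is the property the exclusion test pins.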

§alignment: PostingAlignment

File-wide alignment columns the formatter would use for this source — pre-computed at parse time so hot formatting paths skip the O(N_postings) per-call walk.

PostingAlignment is Copy; pass it directly into the _with_alignment variants of the formatter (crate::format::format_node_with_alignment, crate::format::format_node_range_with_alignment, crate::format::format_source_with_parsed) to reuse this cached value. The LSP format_document / range_formatting fallback handlers, the FFI format.source endpoint, and the WASM ParsedLedger::format bridge all consume the cache to skip both the redundant parse and the redundant alignment walk.

Producer-only cache invariant. This field is populated exactly once by parse_via_cst; the value is consistent with the directives / syntax_root fields at parse time. ParseResult exposes every cache input (directives, syntax_root) as pub, so technically a consumer with a &mut ParseResult can mutate an input without refreshing the cache, leaving alignment stale. That is OUT-OF-CONTRACT for this cache. Callers that mutate ParseResult directly must either (a) refresh alignment by calling crate::format::compute_alignment(&SourceFile::cast(self.syntax_node()).unwrap()), (b) avoid the _with_alignment formatter variants and use the bare ones (which re-compute), or (c) treat the ParseResult as immutable after construction (the common case: the LSP wraps it in Arc<ParseResult>).

Equivalence pinned. parse_result_alignment_cache::* (7 fixtures) assert that parse(s).alignment equals compute_alignment(&SourceFile::cast(parse(s).syntax_node()).unwrap()) across representative fixtures, so any future divergence (a converter change that forgets to refresh the cache, a compute_alignment change that breaks the contract) fails CI.
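
A toy version of the invariant and of the equivalence check, with loud assumptions: the single `amount_column` field, the width rule inside `compute_alignment`, and `parse_stub` are all invented for illustration and say nothing about how the real formatter picks its columns.

```rust
// Hypothetical stand-in; the real PostingAlignment's fields are not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PostingAlignment {
    pub amount_column: usize,
}

pub struct ParsedStub {
    pub postings: Vec<String>,
    pub alignment: PostingAlignment,
}

/// Invented rule: align two columns past the widest posting text.
pub fn compute_alignment(postings: &[String]) -> PostingAlignment {
    let widest = postings.iter().map(|p| p.len()).max().unwrap_or(0);
    PostingAlignment { amount_column: widest + 2 }
}

/// Producer: fills the cache exactly once, consistent with its inputs.
pub fn parse_stub(source: &str) -> ParsedStub {
    let postings: Vec<String> = source
        .lines()
        .filter(|line| line.starts_with("  "))
        .map(|line| line.trim().to_string())
        .collect();
    let alignment = compute_alignment(&postings);
    ParsedStub { postings, alignment }
}
```

Right after `parse_stub`, the cached value equals a fresh `compute_alignment` call. Pushing a longer posting into `postings` without refreshing `alignment` breaks that equality, which is the stale-cache situation the contract rules out.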

Canonical-payload exclusion. Excluded from [__baseline_canonical_payload] for the same reason as syntax_root: it’s a redundant derivation of directives content. If it were included, mutating it without changing directives would silently flip the corpus hash, and the hash would change for every source with a non-default alignment (i.e. essentially every real Beancount file). The exclusion is pinned by canonical_payload_excludes_alignment.

Implementations§

impl ParseResult

pub fn syntax_node(&self) -> SyntaxNode

Cursor-API view of the lossless CST that produced this ParseResult. Equivalent to SyntaxNode::new_root(self.syntax_root.clone()).

Construction is an Arc bump (the green node’s internal refcount); cheap enough to call per request. This is the supported entry point for CST consumers — prefer it over reading Self::syntax_root directly, so the rowan dependency stays an implementation detail.
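
Why the call is cheap can be sketched with a reference-counted stand-in. `GreenData`, the wrapper types, and `shared_handles` are hypothetical and only mimic the shape described above; rowan's real GreenNode and SyntaxNode are richer.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the green tree and its cursor view.
#[derive(Debug)]
pub struct GreenData {
    pub text: String,
}

#[derive(Debug, Clone)]
pub struct GreenNode(Arc<GreenData>);

pub struct SyntaxNode {
    green: GreenNode,
}

impl SyntaxNode {
    pub fn new_root(green: GreenNode) -> Self {
        SyntaxNode { green }
    }

    pub fn text(&self) -> &str {
        &self.green.0.text
    }
}

pub struct ParsedStub {
    pub syntax_root: GreenNode,
}

impl ParsedStub {
    pub fn new(source: &str) -> Self {
        ParsedStub {
            syntax_root: GreenNode(Arc::new(GreenData { text: source.to_string() })),
        }
    }

    /// Cloning the handle bumps a refcount; the tree data is not copied.
    pub fn syntax_node(&self) -> SyntaxNode {
        SyntaxNode::new_root(self.syntax_root.clone())
    }

    /// Number of live handles to the shared tree (for illustration only).
    pub fn shared_handles(&self) -> usize {
        Arc::strong_count(&self.syntax_root.0)
    }
}
```

Two calls to `syntax_node` yield two views over the same allocation, so a handler can request a fresh view per request without re-parsing.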

Trait Implementations§

impl Debug for ParseResult

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> ArchivePointee for T

type ArchivedMetadata = ()

The archived version of the pointer metadata for this type.

fn pointer_metadata(_: &<T as ArchivePointee>::ArchivedMetadata) -> <T as Pointee>::Metadata

Converts some archived metadata to the pointer metadata for itself.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> LayoutRaw for T

fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>

Returns the layout of the type.

impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
where T: SharedNiching<N1, N2>, N1: Niching<T>, N2: Niching<T>,

unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool

Returns whether the given value has been niched.

fn resolve_niched(out: Place<NichedOption<T, N1>>)

Writes data to out indicating that a T is niched.

impl<T> Pointee for T

type Metadata = ()

The metadata type for pointers and references to this type.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.