//! Parsing module for the lex format
//!
//! This module provides the complete processing pipeline from source text to AST:
//! 1. Lexing: Tokenization of source text. See [lexing](crate::lex::lexing) module.
//! 2. Analysis: Syntactic analysis to produce IR nodes. See [engine](engine) module.
//! 3. Building: Construction of AST from IR nodes. See [building](crate::lex::building) module.
//! 4. Inline Parsing: Parse inline elements in text content. See [inlines](crate::lex::inlines) module.
//! 5. Assembling: Post-parsing transformations. See [assembling](crate::lex::assembling) module.
//!
//! Parsing End To End
//!
//! The complete pipeline transforms a string of Lex source into the final AST through
//! these stages:
//!
//! Lexing (5.1):
//! Tokenization and transformations that group tokens into lines. At the end of
//! lexing, we have a TokenStream of Line tokens + indent/dedent tokens.
//!
//! Parsing - Semantic Analysis (5.2):
//! At the very beginning of parsing we group line tokens into a tree of
//! LineContainers, which lets us parse each level in isolation. Because we only
//! need to know that something is a LineContainer, not what it contains, each
//! level can be parsed with an ordinary regex: we print the token names and
//! match the grammar patterns against them.
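//!
//! The grouping step can be sketched as follows. This is a minimal illustration,
//! not the crate's implementation: `Tok`, `Node` and `group` are hypothetical
//! names, and the real token and container types may differ.
//!
//! ```rust
//! /// Hypothetical token kinds; the real lexer's types may differ.
//! #[derive(Debug, Clone, PartialEq)]
//! enum Tok {
//!     Line(String),
//!     Indent,
//!     Dedent,
//! }
//!
//! /// Hypothetical tree node: a single line or a nested container of lines.
//! #[derive(Debug, PartialEq)]
//! enum Node {
//!     Line(String),
//!     Container(Vec<Node>),
//! }
//!
//! /// Groups a flat token stream into nested containers.
//! fn group(tokens: &[Tok]) -> Vec<Node> {
//!     // Stack of open levels; the bottom entry is the root level.
//!     let mut stack: Vec<Vec<Node>> = vec![Vec::new()];
//!     for tok in tokens {
//!         match tok {
//!             Tok::Line(text) => stack.last_mut().unwrap().push(Node::Line(text.clone())),
//!             Tok::Indent => stack.push(Vec::new()),
//!             Tok::Dedent => {
//!                 // Ignore an unbalanced dedent at the root level.
//!                 if stack.len() > 1 {
//!                     let children = stack.pop().unwrap();
//!                     stack.last_mut().unwrap().push(Node::Container(children));
//!                 }
//!             }
//!         }
//!     }
//!     // Close any levels left open at end of input.
//!     while stack.len() > 1 {
//!         let children = stack.pop().unwrap();
//!         stack.last_mut().unwrap().push(Node::Container(children));
//!     }
//!     stack.pop().unwrap()
//! }
//!
//! fn main() {
//!     let tokens = vec![
//!         Tok::Line("a".into()),
//!         Tok::Indent,
//!         Tok::Line("b".into()),
//!         Tok::Dedent,
//!         Tok::Line("c".into()),
//!     ];
//!     let tree = group(&tokens);
//!     assert_eq!(tree.len(), 3);
//!     assert_eq!(tree[1], Node::Container(vec![Node::Line("b".into())]));
//! }
//! ```
//!
//! Each level of the resulting tree sees a nested container as a single opaque
//! item, which is what makes per-level matching possible.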
//!
//! When tokens are matched, we create intermediate representation (IR) nodes, which
//! carry only two pieces of information: the node matched and which tokens it uses.
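//!
//! A rough sketch of matching printed token names and emitting IR nodes. Everything
//! here is an assumption made for illustration: `LineKind`, `IrNode`, `RULES` and
//! `analyze` are hypothetical, the two rules are not the real Lex grammar, and plain
//! prefix matching stands in for the regex engine to keep the example dependency-free.
//!
//! ```rust
//! use std::ops::Range;
//!
//! /// Hypothetical line-level token kinds for one container level.
//! #[derive(Debug, Clone, Copy)]
//! enum LineKind {
//!     Subject,
//!     Blank,
//!     Paragraph,
//!     Container,
//! }
//!
//! /// Hypothetical IR node: which rule matched and which tokens it covers.
//! #[derive(Debug, PartialEq)]
//! struct IrNode {
//!     rule: &'static str,
//!     tokens: Range<usize>,
//! }
//!
//! /// Illustrative rules: (rule name, sequence of token names, each followed by a space).
//! const RULES: &[(&str, &str)] = &[
//!     ("session", "SUBJECT BLANK CONTAINER "),
//!     ("paragraph", "PARAGRAPH "),
//! ];
//!
//! /// Name under which a token kind appears in the printed form.
//! fn name(kind: LineKind) -> &'static str {
//!     match kind {
//!         LineKind::Subject => "SUBJECT",
//!         LineKind::Blank => "BLANK",
//!         LineKind::Paragraph => "PARAGRAPH",
//!         LineKind::Container => "CONTAINER",
//!     }
//! }
//!
//! /// Prints the token names, then matches rules left to right.
//! fn analyze(kinds: &[LineKind]) -> Vec<IrNode> {
//!     let rendered: String = kinds.iter().map(|k| format!("{} ", name(*k))).collect();
//!     let mut nodes = Vec::new();
//!     let (mut byte_pos, mut tok_pos) = (0usize, 0usize);
//!     while tok_pos < kinds.len() {
//!         let rest = &rendered[byte_pos..];
//!         match RULES.iter().copied().find(|(_, pat)| rest.starts_with(*pat)) {
//!             Some((rule, pat)) => {
//!                 // The IR node records only the rule and the token range.
//!                 let count = pat.split_whitespace().count();
//!                 nodes.push(IrNode { rule, tokens: tok_pos..tok_pos + count });
//!                 byte_pos += pat.len();
//!                 tok_pos += count;
//!             }
//!             None => {
//!                 // No rule matches: skip one token (a real parser would report an error).
//!                 byte_pos += name(kinds[tok_pos]).len() + 1;
//!                 tok_pos += 1;
//!             }
//!         }
//!     }
//!     nodes
//! }
//!
//! fn main() {
//!     use LineKind::*;
//!     let nodes = analyze(&[Subject, Blank, Container, Paragraph]);
//!     assert_eq!(nodes[0], IrNode { rule: "session", tokens: 0..3 });
//!     assert_eq!(nodes[1], IrNode { rule: "paragraph", tokens: 3..4 });
//! }
//! ```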
//!
//! This separates the semantic analysis from the AST building. The separation is
//! useful in general, and it was instrumental during development: we ran multiple
//! parsers in parallel, and the AST building had to be unified (correct parsing
//! would result in the same node types + tokens).
//!
//! AST Building (5.3):
//! From the IR nodes, we build the actual AST nodes. Three things happen during
//! this step:
//! 1. We unroll source tokens so that AST nodes have access to token values.
//! 2. The token locations are used to calculate the location of the AST node.
//! 3. The location is transformed from a byte range into a dual representation:
//!    byte range + line:column position.
//!
//! At this stage we create the root session node; it will be attached to the
//! [`Document`] during assembling.
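//!
//! The byte range to line:column conversion in step 3 could look roughly like this.
//! It is a sketch under stated assumptions: `Position`, `line_starts` and
//! `position_at` are hypothetical names, and lines and columns are assumed to be
//! 0-based with columns counted in bytes, which may not match the crate's types.
//!
//! ```rust
//! /// Hypothetical position type: 0-based line and byte column (an assumption).
//! #[derive(Debug, Clone, Copy, PartialEq)]
//! struct Position {
//!     line: usize,
//!     column: usize,
//! }
//!
//! /// Byte offsets at which each line of `source` starts.
//! fn line_starts(source: &str) -> Vec<usize> {
//!     let mut starts = vec![0];
//!     for (i, b) in source.bytes().enumerate() {
//!         if b == b'\n' {
//!             starts.push(i + 1);
//!         }
//!     }
//!     starts
//! }
//!
//! /// Converts a byte offset to a line:column position via binary search.
//! fn position_at(starts: &[usize], offset: usize) -> Position {
//!     // Number of line starts at or before `offset`, minus one, is the line index.
//!     let line = starts.partition_point(|&s| s <= offset) - 1;
//!     Position { line, column: offset - starts[line] }
//! }
//!
//! fn main() {
//!     let source = "Hello\nworld\n";
//!     let starts = line_starts(source);
//!     // "world" occupies the byte range 6..11.
//!     assert_eq!(position_at(&starts, 6), Position { line: 1, column: 0 });
//!     assert_eq!(position_at(&starts, 11), Position { line: 1, column: 5 });
//! }
//! ```
//!
//! Computing the line-start table once per source keeps each lookup logarithmic
//! in the number of lines.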
//!
//! Inline Parsing (5.4):
//! Before assembling the document (while annotations are still part of the content
//! tree), we parse the TextContent nodes for inline elements. This parsing is much
//! simpler, as it has formal start/end tokens and has no structural elements.
//!
//! Document Assembly (5.5):
//! The assembling stage wraps the root session into a document node and performs
//! metadata attachment. Annotations, which are metadata, are always attached to AST
//! nodes, so they can be very targeted. Only with the full document in place can we
//! attach annotations to their correct target nodes. This is harder than it seems:
//! in keeping with the Lex ethos of not enforcing structure, it has to deal with
//! several ambiguous cases, including some complex logic for calculating the "human
//! understanding" distance between elements.
//!
//! Terminology
//!
//! - parse: Colloquial term for the entire process (lexing + analysis + building)
//! - analyze/analysis: The syntactic analysis phase specifically
//! - build: The AST construction phase specifically
//!
//! Testing
//!
//! All parser tests must follow strict guidelines. See the [testing module](crate::lex::testing)
//! for comprehensive documentation on using verified lex sources and AST assertions.
// Parser implementations
// Re-export common parser interfaces
pub use ;
// Re-export AST types and utilities from the ast module
pub use crate;
pub use crate;
/// Type alias for processing results returned by helper APIs.
type ProcessResult = ;
/// Process source text through the complete pipeline: lex, analyze, and build.
///
/// "Parse" here is colloquial — the name covers the whole pipeline
/// (lexing + analysis + building), not just the syntactic phase.
///
/// This is the primary entry point for processing lex documents. It performs:
/// 1. Lexing: Tokenizes the source text
/// 2. Analysis: Performs syntactic analysis to produce IR nodes
/// 3. Building: Constructs the root session tree from IR nodes (assembling wraps it in a
/// `Document` and attaches metadata)
///
/// # Arguments
///
/// * `source` - The source text to process
///
/// # Returns
///
/// A `Document` containing the complete AST, or parsing errors.
///
/// # Example
///
/// ```rust,ignore
/// use lex_core::lex::parsing::parse_document;
///
/// let source = "Hello world\n";
/// let document = parse_document(source)?;
/// ```
/// Same as [`parse_document`] but runs `NormalizeLabels` in permissive
/// mode, so labels that strict mode would reject (`doc.*`,
/// unrecognised `lex.*`) flow through into the AST instead of failing
/// the parse. Intended for hosts that want to surface label-policy
/// violations as in-place diagnostics rather than as a parse failure.
/// `lex-lsp` is the primary consumer: PR 4 of #584 added the entry
/// point so the analysis stage can emit a diagnostic on the offending
/// label while the rest of the document keeps providing semantic
/// tokens, hover, completion, etc. Re-classify with
/// [`crate::lex::assembling::stages::normalize_labels::classify_label`]
/// to determine which sites would have errored in strict mode.
///
/// Routes through [`crate::lex::transforms::standard::run_string_to_ast`]
/// with `Mode::Permissive` so strict + permissive parses share a
/// single pipeline definition. This avoids the risk of the LSP parsing
/// differently from `lexd format` when the pipeline grows new stages.