//! Parsing module for the lex format
//!
//! This module provides the complete processing pipeline from source text to AST:
//! 1. Lexing: Tokenization of source text. See [lexing](crate::lex::lexing) module.
//! 2. Analysis: Syntactic analysis to produce IR nodes. See [engine](engine) module.
//! 3. Building: Construction of AST from IR nodes. See [building](crate::lex::building) module.
//! 4. Inline Parsing: Parse inline elements in text content. See [inlines](crate::lex::inlines) module.
//! 5. Assembling: Post-parsing transformations. See [assembling](crate::lex::assembling) module.
//!
//! # Parsing End to End
//!
//! The complete pipeline transforms a string of Lex source into the final AST
//! through these stages:
//!
//! Lexing (5.1):
//! Tokenization and transformations that group tokens into lines. At the end
//! of lexing we have a `TokenStream` of `Line` tokens plus indent/dedent tokens.
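//!
//! As a sketch, with illustrative names only (not this crate's actual API),
//! the lexer's output has this shape:
//!
//! ```rust
//! /// Hypothetical token kinds: a flat stream of line tokens interleaved
//! /// with indent/dedent markers.
//! #[derive(Debug, Clone, PartialEq)]
//! pub enum Token {
//!     /// One logical line of source (its sub-tokens elided to a string here).
//!     Line(String),
//!     /// Indentation level increased.
//!     Indent,
//!     /// Indentation level decreased.
//!     Dedent,
//! }
//!
//! /// Toy lexer: emits Indent/Dedent whenever the leading-space count
//! /// changes (4 spaces = one level).
//! pub fn lex(source: &str) -> Vec<Token> {
//!     let (mut tokens, mut level) = (Vec::new(), 0usize);
//!     for raw in source.lines() {
//!         let new_level = (raw.len() - raw.trim_start().len()) / 4;
//!         while level < new_level { tokens.push(Token::Indent); level += 1; }
//!         while level > new_level { tokens.push(Token::Dedent); level -= 1; }
//!         tokens.push(Token::Line(raw.trim_start().to_string()));
//!     }
//!     while level > 0 { tokens.push(Token::Dedent); level -= 1; }
//!     tokens
//! }
//!
//! let tokens = lex("a\n    b\nc\n");
//! assert_eq!(tokens[1], Token::Indent);
//! ```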
//!
//! Parsing - Semantic Analysis (5.2):
//! At the beginning of parsing, line tokens are grouped into a tree of
//! `LineContainer`s. This lets us parse each nesting level in isolation:
//! because we only need to know that a node is a line container, not what it
//! contains, each level can be parsed with a regular expression. We simply
//! print the token names and match the grammar patterns against the result.
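//!
//! A minimal sketch of that idea (the real engine matches regexes over the
//! printed names; this dependency-free version checks a literal prefix, and
//! all token names here are hypothetical):
//!
//! ```rust
//! /// Print token names as one string, e.g. ["DASH", "SPACE"] -> "DASH SPACE".
//! fn print_names(tokens: &[&str]) -> String {
//!     tokens.join(" ")
//! }
//!
//! /// True when the printed stream starts with the given pattern,
//! /// e.g. "DASH SPACE" recognizing a list-item opener.
//! fn matches_pattern(tokens: &[&str], pattern: &str) -> bool {
//!     print_names(tokens).starts_with(pattern)
//! }
//!
//! assert!(matches_pattern(&["DASH", "SPACE", "TEXT"], "DASH SPACE"));
//! assert!(!matches_pattern(&["TEXT"], "DASH SPACE"));
//! ```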
//!
//! When tokens are matched, we create intermediate representation (IR) nodes,
//! which carry only two pieces of information: which node was matched and
//! which tokens it consumed.
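//!
//! An IR node can therefore be as small as this (field names hypothetical):
//!
//! ```rust
//! use std::ops::Range;
//!
//! /// Intermediate-representation node: only the matched node kind and the
//! /// span of tokens it consumed, plus children from deeper LineContainers.
//! #[derive(Debug, PartialEq)]
//! struct IrNode {
//!     kind: &'static str,   // e.g. "Paragraph", "ListItem"
//!     tokens: Range<usize>, // indices into the token stream
//!     children: Vec<IrNode>,
//! }
//!
//! let node = IrNode { kind: "Paragraph", tokens: 0..3, children: Vec::new() };
//! assert_eq!(node.tokens.len(), 3);
//! ```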
//!
//! This separates semantic analysis from AST building. That separation is a
//! good thing overall, but it was instrumental during development: we ran
//! multiple parsers in parallel, and the AST building had to be unified, so
//! correct parsing would produce the same node types and tokens from every
//! parser.
//!
//! AST Building (5.3):
//! From the IR nodes, we build the actual AST nodes. During this step,
//! several important things happen:
//! 1. Source tokens are unrolled so that AST nodes have access to token values.
//! 2. Token locations are used to calculate the location of each AST node.
//! 3. Locations are transformed from a byte range into a dual byte range plus
//!    line:column position.
//! At this stage we create the root session node; it will be attached to the
//! [`Document`] during assembling.
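//!
//! The byte-offset-to-line:column conversion can be sketched as follows
//! (1-based, counting columns in bytes; the real implementation may differ):
//!
//! ```rust
//! /// Convert a byte offset into a (line, column) position, both 1-based.
//! fn line_col(source: &str, byte_offset: usize) -> (usize, usize) {
//!     let prefix = &source[..byte_offset];
//!     // Line = number of newlines before the offset, plus one.
//!     let line = prefix.bytes().filter(|&b| b == b'\n').count() + 1;
//!     // Column = bytes since the last newline, plus one.
//!     let col = byte_offset - prefix.rfind('\n').map_or(0, |i| i + 1) + 1;
//!     (line, col)
//! }
//!
//! assert_eq!(line_col("ab\ncd", 0), (1, 1));
//! assert_eq!(line_col("ab\ncd", 4), (2, 2));
//! ```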
//!
//! Inline Parsing (5.4):
//! Before assembling the document (while annotations are still part of the content
//! tree), we parse the TextContent nodes for inline elements. This parsing is
//! much simpler, as inline elements have formal start/end tokens and contain
//! no structural elements.
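//!
//! Because inline elements have explicit delimiters, a single forward scan is
//! enough. A toy version using `*` as a stand-in emphasis delimiter (the real
//! inline grammar is richer):
//!
//! ```rust
//! #[derive(Debug, PartialEq)]
//! enum Inline {
//!     Plain(String),
//!     Emphasis(String),
//! }
//!
//! fn parse_inlines(text: &str) -> Vec<Inline> {
//!     let (mut out, mut rest) = (Vec::new(), text);
//!     while let Some(start) = rest.find('*') {
//!         match rest[start + 1..].find('*') {
//!             Some(len) => {
//!                 if start > 0 {
//!                     out.push(Inline::Plain(rest[..start].to_string()));
//!                 }
//!                 out.push(Inline::Emphasis(rest[start + 1..start + 1 + len].to_string()));
//!                 rest = &rest[start + len + 2..];
//!             }
//!             // Unmatched delimiter: treat the remainder as plain text.
//!             None => break,
//!         }
//!     }
//!     if !rest.is_empty() {
//!         out.push(Inline::Plain(rest.to_string()));
//!     }
//!     out
//! }
//!
//! assert_eq!(
//!     parse_inlines("a *b* c"),
//!     vec![
//!         Inline::Plain("a ".to_string()),
//!         Inline::Emphasis("b".to_string()),
//!         Inline::Plain(" c".to_string()),
//!     ],
//! );
//! ```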
//!
//! Document Assembly (5.5):
//! The assembling stage wraps the root session in a document node and attaches
//! metadata. Annotations, which are metadata, are always attached to specific
//! AST nodes, so they can be very targeted. Only with the full document in
//! place can we attach annotations to their correct target nodes, and this is
//! harder than it seems. In keeping with Lex's ethos of not enforcing
//! structure, it must handle several ambiguous cases, including some complex
//! logic for calculating a "human understanding" distance between elements.
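//!
//! As a greatly simplified stand-in for that distance logic, one could pick
//! the candidate node nearest to the annotation by line number (the real
//! heuristic weighs structure, not just line distance):
//!
//! ```rust
//! /// Return the candidate line closest to the annotation's line.
//! fn nearest_target(annotation_line: usize, candidate_lines: &[usize]) -> Option<usize> {
//!     candidate_lines
//!         .iter()
//!         .copied()
//!         .min_by_key(|&line| line.abs_diff(annotation_line))
//! }
//!
//! assert_eq!(nearest_target(10, &[2, 9, 20]), Some(9));
//! assert_eq!(nearest_target(5, &[]), None);
//! ```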
//!
//! # Terminology
//!
//! - parse: Colloquial term for the entire process (lexing + analysis + building)
//! - analyze/analysis: The syntactic analysis phase specifically
//! - build: The AST construction phase specifically
//!
//! # Testing
//!
//! All parser tests must follow strict guidelines. See the [testing module](crate::lex::testing)
//! for comprehensive documentation on using verified lex sources and AST assertions.
// Parser implementations
// Re-export common parser interfaces
pub use ;
// Re-export AST types and utilities from the ast module
pub use crate;
pub use crate;
/// Type alias for processing results returned by helper APIs.
type ProcessResult = ;
/// Process source text through the complete pipeline: lex, analyze, and build.
///
/// This is the primary entry point for processing lex documents. It performs:
/// 1. Lexing: Tokenizes the source text
/// 2. Analysis: Performs syntactic analysis to produce IR nodes
/// 3. Building: Constructs the root session tree from IR nodes (assembling wraps it in a
/// `Document` and attaches metadata)
///
/// # Arguments
///
/// * `source` - The source text to process
///
/// # Returns
///
/// A `Document` containing the complete AST, or parsing errors.
///
/// # Example
///
/// ```rust,ignore
/// use lex::lex::parsing::process_full;
///
/// let source = "Hello world\n";
/// let document = process_full(source)?;
/// ```
/// Alias for `process_full` to maintain backward compatibility.
///
/// The term "parse" colloquially refers to the entire processing pipeline
/// (lexing + analysis + building), even though technically parsing is just
/// the syntactic analysis phase.