lex_core/lex/
testing.rs

//! Testing utilities for AST assertions
//!
//!     This module provides comprehensive testing tools and guidelines for the Lex parser.
//!     Testing the parser must follow strict rules to ensure reliability and maintainability.
//!
//! Why Testing is Different
//!
//!     Lex is a novel format: there is no established body of source text and no reference
//!     parser to compare against. To make matters worse, the format is still evolving, so
//!     the spec changes, and it resembles Markdown just enough to create confusion.
//!
//!     As a result, writing correct Lex source text is not trivial, and a hand-written
//!     sample is likely to be slightly off. Testing the parser against an invalid source
//!     string is wasted effort: the parser ends up tuned to the wrong thing. Worse, since
//!     each test may introduce its own slight variation, the parser becomes unpredictable,
//!     complex and wrong. And when the spec changes, we must hunt down and review hundreds
//!     of ad-hoc strings in test files.
//!
//!     This is why all testing must follow two strict rules:
//!
//!         1. Always use verified sample files from the spec (via [Lexplore](lexplore))
//!         2. Always use comprehensive AST assertions (via [assert_ast](fn@assert_ast))
//!
//! Rule 1: Always Use Lexplore for Test Content
//!
//!     Why this matters:
//!
//!         Lex is a novel format that is still evolving. People regularly get small details
//!         wrong, leading to false positives in tests. When Lex changes, we need to verify
//!         and update all source files. If Lex content is scattered across many test files,
//!         this becomes a maintenance nightmare.
//!
//!     The solution:
//!
//!         Use the `Lexplore` library to access verified, curated Lex sample files. This
//!         ensures only vetted sources are used and makes writing tests much easier.
//!
//!     Examples:
//!
//!     ```rust,ignore
//!     use crate::lex::testing::lexplore::Lexplore;
//!     use crate::lex::parsing::parse_document;
//!
//!     // CORRECT: Use verified sample files
//!     let doc = Lexplore::paragraph(1).parse().unwrap();
//!     let paragraph = doc.root.expect_paragraph();
//!
//!     // OR load source and parse separately
//!     let source = Lexplore::paragraph(1).source();
//!     let doc = parse_document(&source).unwrap();
//!
//!     // OR use tokenization
//!     let tokens = Lexplore::list(1).tokenize().unwrap();
//!
//!     // OR load documents (benchmark, trifecta)
//!     let doc = Lexplore::benchmark(10).parse().unwrap();
//!     let doc = Lexplore::trifecta(0).parse().unwrap();
//!
//!     // OR get the AST node directly
//!     let paragraph = Lexplore::get_paragraph(1);
//!     let list = Lexplore::get_list(1);
//!     let session = Lexplore::get_session(1);
//!
//!     // WRONG: Don't write Lex content directly in tests
//!     let doc = parse_document("Some paragraph\n\nAnother paragraph\n\n").unwrap();
//!     ```
//!
//!     Available sources:
//!
//!         - Elements: `Lexplore::paragraph(1)`, `Lexplore::list(1)`, etc. - Individual elements
//!         - Documents: `Lexplore::benchmark(0)`, `Lexplore::trifecta(0)` - Full documents
//!         - Direct access: `Lexplore::get_paragraph(1)` - Returns the AST node directly
//!
//!     The sample files are organized as follows:
//!
//!         - By elements:
//!             - Isolated (only the element itself): Individual test cases
//!             - In document (mixed with other elements): Integration test cases
//!         - Benchmark: full documents that are used to test the parser
//!         - Trifecta: a mix of sessions, paragraphs and lists, the structural elements
//!
//!     See the [Lexplore documentation](lexplore) for complete API details.
//!
//! Rule 2: Always Use assert_ast for AST Verification
//!
//! Why this matters:
//!
//! Every document test should verify that the AST shape is correct per the grammar
//! and that all attributes (children, content, etc.) are correct. Asserting
//! generalities such as node counts alone is uninformative. We want assurance on
//! both the AST shape and its content.
//!
//! Written by hand, such assertions are time-consuming to produce and, when the Lex
//! spec changes, hard to update.
//!
//! The solution:
//!
//! Use `assert_ast` and its fluent API. It allows testing entire hierarchies of
//! nodes at once with 10-20x less code.
//!
//! ### The Problem with Manual Testing
//!
//! Testing a nested session traditionally looks like this:
//!
//! ```rust-example
//! use crate::lex::ast::ContentItem;
//!
//! match &doc.root.children[0] {
//!     ContentItem::Session(s) => {
//!         assert_eq!(s.title, "Introduction");
//!         assert_eq!(s.children.len(), 2);
//!         match &s.children[0] {
//!             ContentItem::Paragraph(p) => {
//!                 assert_eq!(p.lines.len(), 1);
//!                 assert!(p.lines[0].starts_with("Hello"));
//!             }
//!             _ => panic!("Expected paragraph"),
//!         }
//!         // ... repeat for second child
//!     }
//!     _ => panic!("Expected session"),
//! }
//! ```
//!
//! That is 20+ lines of boilerplate once the second child is covered, and it is hard
//! to see what is actually being tested.
//!
//! ### The Solution: Fluent Assertion API
//!
//! With the `assert_ast` fluent API, the same test becomes:
//!
//! ```rust-example
//! use crate::lex::testing::assert_ast;
//!
//! assert_ast(&doc)
//!     .item(0, |item| {
//!         item.assert_session()
//!             .label("Introduction")
//!             .child_count(2)
//!             .child(0, |child| {
//!                 child.assert_paragraph()
//!                     .text_starts_with("Hello")
//!             })
//!     });
//! ```
//!
//! Concise, readable, and maintainable.
//!
//! ## Available Node Types
//!
//! The assertion API supports all AST node types:
//! - `ParagraphAssertion` - Text content nodes
//! - `SessionAssertion` - Titled container nodes
//! - `ListAssertion` / `ListItemAssertion` - List structures
//! - `DefinitionAssertion` - Subject-definition pairs
//! - `AnnotationAssertion` - Metadata with parameters
//! - `VerbatimBlockkAssertion` - Raw content blocks
//!
//! Each assertion type provides type-specific methods (e.g., `label()` for
//! sessions, `subject()` for definitions, `parameter_count()` for annotations).
//!
//! ## Extending the Assertion API
//!
//! To add support for a new container node type:
//!
//! 1. Implement the traits in `ast.rs`:
//!    ```rust-example
//!    use crate::lex::ast::{Container, ContentItem};
//!
//!    struct NewNode { content: Vec<ContentItem>, label: String }
//!
//!    impl Container for NewNode {
//!        fn label(&self) -> &str { &self.label }
//!        fn children(&self) -> &[ContentItem] { &self.content }
//!        fn children_mut(&mut self) -> &mut Vec<ContentItem> { &mut self.content }
//!    }
//!    ```
//!
//! 2. Add a variant to the `ContentItem` enum and implement its helper methods
//!
//! 3. Add the assertion type in `ast_assertions.rs`:
//!    ```rust-example
//!    pub struct NewNodeAssertion<'a> { /* ... */ }
//!
//!    impl NewNodeAssertion<'_> {
//!        pub fn custom_field(self, expected: &str) -> Self { /* ... */ }
//!        pub fn child_count(self, expected: usize) -> Self { /* ... */ }
//!    }
//!    ```
//!
//! 4. Add an entry method to `ContentItemAssertion` and export the new type in `testing.rs`:
//!    ```rust-example
//!    pub fn assert_new_node(self) -> NewNodeAssertion<'a> { /* ... */ }
//!    ```
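//!
//! The steps above can be sketched end to end as follows. This is a minimal,
//! self-contained illustration, not the crate's actual code: `NewNode`,
//! `ContentItem`, `ContentItemAssertion` and `as_new_node` are simplified,
//! hypothetical stand-ins, and only the shape of the pattern is meant to carry over.
//!
//! ```rust
//! // Hypothetical stand-ins for the crate's real types.
//! #[derive(Debug)]
//! struct NewNode {
//!     label: String,
//!     content: Vec<ContentItem>,
//! }
//!
//! #[derive(Debug)]
//! enum ContentItem {
//!     Paragraph(String),
//!     NewNode(NewNode), // step 2: the new variant
//! }
//!
//! impl ContentItem {
//!     // Step 2: helper accessor for the new variant.
//!     fn as_new_node(&self) -> Option<&NewNode> {
//!         match self {
//!             ContentItem::NewNode(node) => Some(node),
//!             _ => None,
//!         }
//!     }
//! }
//!
//! // Step 3: the assertion type borrows the node; each check returns `Self`
//! // so that calls can be chained.
//! struct NewNodeAssertion<'a> {
//!     node: &'a NewNode,
//! }
//!
//! impl<'a> NewNodeAssertion<'a> {
//!     fn label(self, expected: &str) -> Self {
//!         assert_eq!(self.node.label, expected, "unexpected label");
//!         self
//!     }
//!
//!     fn child_count(self, expected: usize) -> Self {
//!         assert_eq!(self.node.content.len(), expected, "unexpected child count");
//!         self
//!     }
//! }
//!
//! // Step 4: entry point from the generic item assertion.
//! struct ContentItemAssertion<'a> {
//!     item: &'a ContentItem,
//! }
//!
//! impl<'a> ContentItemAssertion<'a> {
//!     fn assert_new_node(self) -> NewNodeAssertion<'a> {
//!         let node = self
//!             .item
//!             .as_new_node()
//!             .unwrap_or_else(|| panic!("expected NewNode, found {:?}", self.item));
//!         NewNodeAssertion { node }
//!     }
//! }
//!
//! fn main() {
//!     let item = ContentItem::NewNode(NewNode {
//!         label: "Intro".to_string(),
//!         content: vec![ContentItem::Paragraph("Hello".to_string())],
//!     });
//!     ContentItemAssertion { item: &item }
//!         .assert_new_node()
//!         .label("Intro")
//!         .child_count(1);
//! }
//! ```
//!
//! Taking `self` by value and returning `Self` is what makes the chain read as a
//! single expression; a failed check panics at the first mismatch with its own message.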

mod ast_assertions;
pub mod lexplore;
mod matchers;
pub mod text_diff;

pub use ast_assertions::{
    assert_ast, AnnotationAssertion, ChildrenAssertion, ContentItemAssertion, DefinitionAssertion,
    DocumentAssertion, InlineAssertion, InlineExpectation, ListAssertion, ListItemAssertion,
    ParagraphAssertion, ReferenceExpectation, SessionAssertion, VerbatimBlockkAssertion,
};
pub use matchers::TextMatch;

// Public submodule path: crate::lex::testing::factories
pub mod factories {
    pub use crate::lex::token::testing::*;
}

/// Get a path relative to the crate root for testing purposes.
///
/// `CARGO_MANIFEST_DIR` points to the crate directory where specs/ lives.
///
/// # Example
/// ```rust,ignore
/// let path = workspace_path("comms/specs/elements/paragraph.docs/paragraph-01-flat-oneline.lex");
/// let content = std::fs::read_to_string(path).unwrap();
/// ```
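///
/// A note on how the join behaves, as a minimal sketch. `resolve` is a
/// hypothetical stand-in that takes the base directory as a parameter instead of
/// reading `CARGO_MANIFEST_DIR`, so it runs anywhere. `Path::join` appends a
/// relative path to the base, but an absolute argument replaces the base, so
/// callers should pass relative paths only.
///
/// ```rust
/// use std::path::{Path, PathBuf};
///
/// // Hypothetical stand-in for `workspace_path` with an explicit base directory.
/// fn resolve(base: &str, relative_path: &str) -> PathBuf {
///     Path::new(base).join(relative_path)
/// }
///
/// fn main() {
///     // A relative path is appended to the base.
///     assert_eq!(resolve("/crate", "specs/a.lex"), PathBuf::from("/crate/specs/a.lex"));
///     // An absolute path replaces the base (Unix-style paths assumed here).
///     assert_eq!(resolve("/crate", "/tmp/a.lex"), PathBuf::from("/tmp/a.lex"));
/// }
/// ```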
pub fn workspace_path(relative_path: &str) -> std::path::PathBuf {
    let manifest_dir = env!("CARGO_MANIFEST_DIR");
    std::path::Path::new(manifest_dir).join(relative_path)
}

/// Parse a Lex document without running the annotation attachment stage.
///
/// This is useful for tests that need annotations to remain in the content tree
/// rather than being attached as metadata. Common use cases:
/// - Testing annotation parsing in isolation
/// - Testing the attachment logic itself
/// - Element tests that expect annotations as content items
///
/// # Example
/// ```rust,ignore
/// use crate::lex::ast::ContentItem;
/// use crate::lex::testing::parse_without_annotation_attachment;
///
/// let source = ":: note ::\nSome paragraph\n";
/// let doc = parse_without_annotation_attachment(source).unwrap();
///
/// // Annotation is still in content tree, not attached as metadata
/// assert!(doc.root.children.iter().any(|item| matches!(item, ContentItem::Annotation(_))));
/// ```
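///
/// Before lexing, this function appends a trailing newline to non-empty input
/// that lacks one. A minimal sketch of that step in isolation, where
/// `ensure_trailing_newline` is a hypothetical helper name used only for
/// illustration (the function itself inlines this logic):
///
/// ```rust
/// // Hypothetical helper mirroring the normalization done inline by the function.
/// fn ensure_trailing_newline(source: &str) -> String {
///     if !source.is_empty() && !source.ends_with('\n') {
///         format!("{source}\n")
///     } else {
///         source.to_string()
///     }
/// }
///
/// fn main() {
///     // Missing newline is added.
///     assert_eq!(ensure_trailing_newline("a"), "a\n");
///     // Existing newline is left alone.
///     assert_eq!(ensure_trailing_newline("a\n"), "a\n");
///     // Empty input stays empty.
///     assert_eq!(ensure_trailing_newline(""), "");
/// }
/// ```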
pub fn parse_without_annotation_attachment(
    source: &str,
) -> Result<crate::lex::ast::Document, String> {
    use crate::lex::assembling::AttachRoot;
    use crate::lex::parsing::engine::parse_from_flat_tokens;
    use crate::lex::transforms::stages::ParseInlines;
    use crate::lex::transforms::standard::LEXING;
    use crate::lex::transforms::Runnable;

    let source = if !source.is_empty() && !source.ends_with('\n') {
        format!("{source}\n")
    } else {
        source.to_string()
    };
    let tokens = LEXING.run(source.clone()).map_err(|e| e.to_string())?;
    let root = parse_from_flat_tokens(tokens, &source).map_err(|e| e.to_string())?;
    let root = ParseInlines::new().run(root).map_err(|e| e.to_string())?;
    // Assemble the root session into a Document but skip metadata attachment
    AttachRoot::new().run(root).map_err(|e| e.to_string())
}