lex_core/lex/testing.rs
//! Testing utilities for AST assertions
//!
//! This module provides testing tools and guidelines for the lex parser.
//! Parser tests must follow strict rules to stay reliable and maintainable.
//!
//! ## Why Testing is Different
//!
//! Lex is a novel format: there is no established body of source text and no
//! reference parser to compare against. To make matters worse, the format is still
//! evolving, so the spec changes, and it resembles Markdown just enough to create
//! confusion.
//!
//! As a result, writing correct Lex source text is not trivial, and a made-up sample
//! is likely to be slightly off. Testing the parser against an illegal source string
//! wastes the effort: we end up with a parser tuned to the wrong thing. Worse, since
//! each test may introduce its own slight variation, the parser becomes unpredictable,
//! complex and wrong. And when the spec changes, we must hunt down and review hundreds
//! of ad-hoc strings in test files.
//!
//! This is why all testing must follow two strict rules:
//!
//! 1. Always use verified sample files from the spec (via [Lexplore](lexplore))
//! 2. Always use comprehensive AST assertions (via [assert_ast](fn@assert_ast))
//!
//! ## Rule 1: Always Use Lexplore for Test Content
//!
//! Why this matters:
//!
//! Lex is a novel format that is still evolving. People regularly get small details
//! wrong, leading to false positives in tests. When Lex changes, we need to verify
//! and update all source files. If Lex content is scattered across many test files,
//! this becomes a maintenance nightmare.
//!
//! The solution:
//!
//! Use the `Lexplore` library to access verified, curated Lex sample files. This
//! ensures only vetted sources are used and makes writing tests much easier.
//!
//! Examples:
//!
//! ```rust,ignore
//! use crate::lex::testing::lexplore::Lexplore;
//! use crate::lex::parsing::parse_document;
//!
//! // CORRECT: Use verified sample files
//! let doc = Lexplore::paragraph(1).parse().unwrap();
//! let paragraph = doc.root.expect_paragraph();
//!
//! // OR load source and parse separately
//! let source = Lexplore::paragraph(1).source();
//! let doc = parse_document(&source).unwrap();
//!
//! // OR use tokenization
//! let tokens = Lexplore::list(1).tokenize().unwrap();
//!
//! // OR load documents (benchmark, trifecta)
//! let doc = Lexplore::benchmark(10).parse().unwrap();
//! let doc = Lexplore::trifecta(0).parse().unwrap();
//!
//! // OR get the AST node directly
//! let paragraph = Lexplore::get_paragraph(1);
//! let list = Lexplore::get_list(1);
//! let session = Lexplore::get_session(1);
//!
//! // WRONG: Don't write lex content directly in tests
//! let doc = parse_document("Some paragraph\n\nAnother paragraph\n\n").unwrap();
//! ```
//!
//! Available sources:
//!
//! - Elements: `Lexplore::paragraph(1)`, `Lexplore::list(1)`, etc. - Individual elements
//! - Documents: `Lexplore::benchmark(0)`, `Lexplore::trifecta(0)` - Full documents
//! - Direct access: `Lexplore::get_paragraph(1)` - Returns the AST node directly
//!
//! The sample files are organized as follows:
//!
//! - By elements:
//!   - Isolated elements (only the element itself): individual test cases
//!   - In Document (mixed with other elements): integration test cases
//! - Benchmark: full documents that are used to test the parser
//! - Trifecta: a mix of sessions, paragraphs and lists, the structural elements
//!
//! See the [Lexplore documentation](lexplore) for complete API details.
//!
//! ## Rule 2: Always Use assert_ast for AST Verification
//!
//! Why this matters:
//!
//! Every document test should verify that the AST shape is correct per the grammar
//! and that all attributes (children, content, etc.) are correct. Asserting
//! generalities like node counts is not informative. We want assurance on both
//! the AST shape and its content.
//!
//! Writing such assertions by hand is hard and time-consuming, and when the Lex
//! spec changes they are very hard to update.
//!
//! The solution:
//!
//! Use the `assert_ast` library with its fluent API. It allows testing entire
//! hierarchies of nodes at once with 10-20x less code.
//!
//! ### The Problem with Manual Testing
//!
//! Testing a nested session traditionally looks like this:
//!
//! ```rust-example
//! use crate::lex::ast::ContentItem;
//!
//! match &doc.content[0] {
//!     ContentItem::Session(s) => {
//!         assert_eq!(s.title, "Introduction");
//!         assert_eq!(s.children.len(), 2);
//!         match &s.content[0] {
//!             ContentItem::Paragraph(p) => {
//!                 assert_eq!(p.lines.len(), 1);
//!                 assert!(p.lines[0].starts_with("Hello"));
//!             }
//!             _ => panic!("Expected paragraph"),
//!         }
//!         // ... repeat for second child
//!     }
//!     _ => panic!("Expected session"),
//! }
//! ```
//!
//! 20+ lines of boilerplate. Hard to see what's actually being tested.
//!
//! ### The Solution: Fluent Assertion API
//!
//! With the `assert_ast` fluent API, the same test becomes:
//!
//! ```rust-example
//! use crate::lex::testing::assert_ast;
//!
//! assert_ast(&doc)
//!     .item(0, |item| {
//!         item.assert_session()
//!             .label("Introduction")
//!             .child_count(2)
//!             .child(0, |child| {
//!                 child.assert_paragraph()
//!                     .text_starts_with("Hello")
//!             })
//!     });
//! ```
//!
//! Concise, readable, and maintainable.
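//!
//! To illustrate why the fluent style stays compact, here is a minimal, self-contained
//! sketch of the pattern. It does not use the crate's real types: `Item`,
//! `ItemAssertion` and `assert_item` are hypothetical stand-ins, and only the method
//! names (`assert_session`, `label`, `child_count`, `child`, `assert_paragraph`,
//! `text_starts_with`) mirror the example above. Each method checks one property and
//! returns `self`, so checks chain, and `child` hands a nested assertion to a closure.
//!
//! ```rust
//! // Toy AST for illustration only; not the crate's `ContentItem`.
//! enum Item {
//!     Paragraph(String),
//!     Session { title: String, children: Vec<Item> },
//! }
//!
//! // Hypothetical assertion wrappers, one per node kind.
//! struct ItemAssertion<'a> { item: &'a Item }
//! struct SessionAssertion<'a> { title: &'a str, children: &'a [Item] }
//! struct ParagraphAssertion<'a> { text: &'a str }
//!
//! impl<'a> ItemAssertion<'a> {
//!     // Narrow to a session, panicking on any other node kind.
//!     fn assert_session(self) -> SessionAssertion<'a> {
//!         match self.item {
//!             Item::Session { title, children } => SessionAssertion {
//!                 title: title.as_str(),
//!                 children: children.as_slice(),
//!             },
//!             _ => panic!("Expected session"),
//!         }
//!     }
//!
//!     // Narrow to a paragraph, panicking on any other node kind.
//!     fn assert_paragraph(self) -> ParagraphAssertion<'a> {
//!         match self.item {
//!             Item::Paragraph(text) => ParagraphAssertion { text: text.as_str() },
//!             _ => panic!("Expected paragraph"),
//!         }
//!     }
//! }
//!
//! impl<'a> SessionAssertion<'a> {
//!     fn label(self, expected: &str) -> Self {
//!         assert_eq!(self.title, expected, "session label mismatch");
//!         self
//!     }
//!
//!     fn child_count(self, expected: usize) -> Self {
//!         assert_eq!(self.children.len(), expected, "child count mismatch");
//!         self
//!     }
//!
//!     // Run nested checks on one child, then continue the outer chain.
//!     fn child(self, index: usize, check: impl FnOnce(ItemAssertion<'a>)) -> Self {
//!         check(ItemAssertion { item: &self.children[index] });
//!         self
//!     }
//! }
//!
//! impl<'a> ParagraphAssertion<'a> {
//!     fn text_starts_with(self, prefix: &str) -> Self {
//!         assert!(self.text.starts_with(prefix), "paragraph does not start with {prefix:?}");
//!         self
//!     }
//! }
//!
//! // Hypothetical entry point, playing the role of `assert_ast`.
//! fn assert_item(item: &Item) -> ItemAssertion<'_> {
//!     ItemAssertion { item }
//! }
//!
//! fn main() {
//!     let doc = Item::Session {
//!         title: "Introduction".to_string(),
//!         children: vec![
//!             Item::Paragraph("Hello world".to_string()),
//!             Item::Paragraph("Second paragraph".to_string()),
//!         ],
//!     };
//!
//!     assert_item(&doc)
//!         .assert_session()
//!         .label("Introduction")
//!         .child_count(2)
//!         .child(0, |child| {
//!             child.assert_paragraph().text_starts_with("Hello");
//!         })
//!         .child(1, |child| {
//!             child.assert_paragraph().text_starts_with("Second");
//!         });
//! }
//! ```
//!
//! In this sketch a wrong node kind or a mismatched value panics at the failing
//! check, so the test reads as a description of the expected tree.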
//!
//! ## Available Node Types
//!
//! The assertion API supports all AST node types:
//!
//! - `ParagraphAssertion` - Text content nodes
//! - `SessionAssertion` - Titled container nodes
//! - `ListAssertion` / `ListItemAssertion` - List structures
//! - `DefinitionAssertion` - Subject-definition pairs
//! - `AnnotationAssertion` - Metadata with parameters
//! - `VerbatimBlockkAssertion` - Raw content blocks
//!
//! Each assertion type provides type-specific methods (e.g., `label()` for
//! sessions, `subject()` for definitions, `parameter_count()` for annotations).
//!
//! ## Extending the Assertion API
//!
//! To add support for a new container node type:
//!
//! 1. Implement the traits in `ast.rs`:
//! ```rust-example
//! use crate::lex::ast::{Container, ContentItem};
//!
//! struct NewNode { content: Vec<ContentItem>, label: String }
//!
//! impl Container for NewNode {
//!     fn label(&self) -> &str { &self.label }
//!     fn children(&self) -> &[ContentItem] { &self.content }
//!     fn children_mut(&mut self) -> &mut Vec<ContentItem> { &mut self.content }
//! }
//! ```
//!
//! 2. Add the new node to the `ContentItem` enum and implement helper methods
//!
//! 3. Add an assertion type in `ast_assertions.rs`:
//! ```rust-example
//! pub struct NewNodeAssertion<'a> { /* ... */ }
//!
//! impl NewNodeAssertion<'_> {
//!     pub fn custom_field(self, expected: &str) -> Self { /* ... */ }
//!     pub fn child_count(self, expected: usize) -> Self { /* ... */ }
//! }
//! ```
//!
//! 4. Add an accessor to `ContentItemAssertion` and export the new type in `testing.rs`:
//! ```rust-example
//! pub fn assert_new_node(self) -> NewNodeAssertion<'a> { /* ... */ }
//! ```
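//!
//! The steps above can be sketched end to end in a self-contained form. This is an
//! illustration under stated assumptions, not the crate's implementation: the
//! `ContentItem` enum and `Container` trait below are reduced stand-ins (the
//! `children_mut` method is omitted), and the body of `NewNodeAssertion` is
//! hypothetical, since the real assertion types live in `ast_assertions`.
//!
//! ```rust
//! // Reduced stand-in for the crate's `ContentItem`; the payload is unused here.
//! #[allow(dead_code)]
//! enum ContentItem {
//!     Paragraph(String),
//! }
//!
//! // Reduced stand-in for the crate's `Container` trait.
//! trait Container {
//!     fn label(&self) -> &str;
//!     fn children(&self) -> &[ContentItem];
//! }
//!
//! // Step 1: the new node implements the container trait.
//! struct NewNode {
//!     content: Vec<ContentItem>,
//!     label: String,
//! }
//!
//! impl Container for NewNode {
//!     fn label(&self) -> &str { &self.label }
//!     fn children(&self) -> &[ContentItem] { &self.content }
//! }
//!
//! // Step 3: a hypothetical assertion wrapper that relies only on `Container`.
//! struct NewNodeAssertion<'a> {
//!     node: &'a NewNode,
//! }
//!
//! impl<'a> NewNodeAssertion<'a> {
//!     fn label(self, expected: &str) -> Self {
//!         assert_eq!(self.node.label(), expected, "label mismatch");
//!         self
//!     }
//!
//!     fn child_count(self, expected: usize) -> Self {
//!         assert_eq!(self.node.children().len(), expected, "child count mismatch");
//!         self
//!     }
//! }
//!
//! fn main() {
//!     let node = NewNode {
//!         label: "note".to_string(),
//!         content: vec![ContentItem::Paragraph("body".to_string())],
//!     };
//!
//!     // Each check returns the assertion, so checks chain fluently.
//!     NewNodeAssertion { node: &node }.label("note").child_count(1);
//! }
//! ```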

mod ast_assertions;
pub mod lexplore;
mod matchers;
pub mod text_diff;

pub use ast_assertions::{
    assert_ast, AnnotationAssertion, ChildrenAssertion, ContentItemAssertion, DefinitionAssertion,
    DocumentAssertion, InlineAssertion, InlineExpectation, ListAssertion, ListItemAssertion,
    ParagraphAssertion, ReferenceExpectation, SessionAssertion, VerbatimBlockkAssertion,
};
pub use matchers::TextMatch;

// Public submodule path: crate::lex::testing::factories
pub mod factories {
    pub use crate::lex::token::testing::*;
}

/// Resolve a path relative to the crate root, for use in tests.
///
/// `relative_path` is joined onto `CARGO_MANIFEST_DIR`, which is read at compile
/// time and points to the crate directory where specs/ lives.
///
/// # Example
/// ```rust,ignore
/// let path = workspace_path("comms/specs/elements/paragraph.docs/paragraph-01-flat-oneline.lex");
/// let content = std::fs::read_to_string(path).unwrap();
/// ```
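///
/// Because the function relies on `Path::join`, an argument that is already
/// absolute replaces the crate root instead of being appended to it. The sketch
/// below shows this with a hypothetical `resolve` helper that takes the root as a
/// parameter (the real function reads it from `CARGO_MANIFEST_DIR`); the paths are
/// made up for illustration.
///
/// ```rust
/// use std::path::{Path, PathBuf};
///
/// // Hypothetical stand-in for `workspace_path` with an explicit root.
/// fn resolve(root: &str, relative_path: &str) -> PathBuf {
///     Path::new(root).join(relative_path)
/// }
///
/// fn main() {
///     // A relative argument is appended to the root.
///     assert_eq!(
///         resolve("/crate", "specs/sample.lex"),
///         PathBuf::from("/crate/specs/sample.lex")
///     );
///
///     // An absolute argument replaces the root entirely.
///     assert_eq!(
///         resolve("/crate", "/tmp/sample.lex"),
///         PathBuf::from("/tmp/sample.lex")
///     );
/// }
/// ```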
pub fn workspace_path(relative_path: &str) -> std::path::PathBuf {
    let manifest_dir = env!("CARGO_MANIFEST_DIR");
    std::path::Path::new(manifest_dir).join(relative_path)
}

/// Parse a Lex document without running the annotation attachment stage.
///
/// This is useful for tests that need annotations to remain in the content tree
/// rather than being attached as metadata. Common use cases:
/// - Testing annotation parsing in isolation
/// - Testing the attachment logic itself
/// - Element tests that expect annotations as content items
///
/// # Example
/// ```rust,ignore
/// use crate::lex::ast::ContentItem;
/// use crate::lex::testing::parse_without_annotation_attachment;
///
/// let source = ":: note ::\nSome paragraph\n";
/// let doc = parse_without_annotation_attachment(source).unwrap();
///
/// // Annotation is still in content tree, not attached as metadata
/// assert!(doc.root.children.iter().any(|item| matches!(item, ContentItem::Annotation(_))));
/// ```
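///
/// Before lexing, the function normalizes its input so that any non-empty source
/// ends with a newline, while an empty source is left untouched. The sketch below
/// isolates that step in a hypothetical `ensure_trailing_newline` helper (the
/// function itself performs the check inline and has no such helper).
///
/// ```rust
/// // Hypothetical helper mirroring the normalization done at the top of the body.
/// fn ensure_trailing_newline(source: &str) -> String {
///     if !source.is_empty() && !source.ends_with('\n') {
///         format!("{source}\n")
///     } else {
///         source.to_string()
///     }
/// }
///
/// fn main() {
///     // A missing trailing newline is added.
///     assert_eq!(ensure_trailing_newline("Some paragraph"), "Some paragraph\n");
///     // An existing trailing newline is not duplicated.
///     assert_eq!(ensure_trailing_newline("Some paragraph\n"), "Some paragraph\n");
///     // Empty input stays empty.
///     assert_eq!(ensure_trailing_newline(""), "");
/// }
/// ```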
pub fn parse_without_annotation_attachment(
    source: &str,
) -> Result<crate::lex::ast::Document, String> {
    use crate::lex::assembling::AttachRoot;
    use crate::lex::parsing::engine::parse_from_flat_tokens;
    use crate::lex::transforms::stages::ParseInlines;
    use crate::lex::transforms::standard::LEXING;
    use crate::lex::transforms::Runnable;

    let source = if !source.is_empty() && !source.ends_with('\n') {
        format!("{source}\n")
    } else {
        source.to_string()
    };
    let tokens = LEXING.run(source.clone()).map_err(|e| e.to_string())?;
    let root = parse_from_flat_tokens(tokens, &source).map_err(|e| e.to_string())?;
    let root = ParseInlines::new().run(root).map_err(|e| e.to_string())?;
    // Assemble the root session into a Document but skip metadata attachment
    AttachRoot::new().run(root).map_err(|e| e.to_string())
}
265}