lex_core/lex/ast/elements/verbatim.rs
//! Verbatim block element
//!
//! A verbatim block embeds content that is not Lex formatted. This can be binary-encoded
//! data, such as images or videos, or text in another formal language, most commonly
//! source code. Since the whole point of the element is to say "hands off, do not parse
//! this, just preserve it", you'd think it would be a simple element. In reality it is by
//! far the most complex element in Lex, and it warrants some explanation.
//!
//! Note that a verbatim block can forgo content altogether (for instance, blocks for
//! binaries won't encode their content).
//!
//! Structure
//!
//! - subject: The lead item identifying what the verbatim block contains
//! - children: VerbatimLine nodes containing the actual content (can be empty)
//! - closing_data: The closing marker (format: `:: label params? ::`)
//!
//! The subject introduces what the content is, and the closing data terminates the block.
//! The data node carries the label and parameters describing the payload. By convention,
//! if the content is to be interpreted by a tool, the label should be the name of that
//! tool or language. While the Lex software will not parse the content, it will preserve
//! it exactly as it is, and the label can be used by editors and other tools to format
//! the content.
//!
//! Syntax
//!
//! <subject-line>
//! <indent> <content> ... any number of content elements
//! <dedent> <data>
//!
//! Parsing Structure:
//!
//! | Element  | Prec. Blank | Head        | Blank    | Content  | Tail            |
//! |----------|-------------|-------------|----------|----------|-----------------|
//! | Verbatim | Optional    | SubjectLine | Optional | Optional | dedent+DataLine |
//!
//! Parsing Verbatim Blocks
//!
//! The first point is that, since a verbatim block can hold non-Lex content, its content
//! can't be parsed. It can be lexed without prejudice, but not parsed. Not only would the
//! result be gibberish, but worse, if the content triggered indent and dedent events, it
//! would throw off the parsing and break the document.
//!
//! This has two consequences: verbatim parsing must come first, lest its content wreak
//! havoc on the structure, and identifying its end marker has to be very easy. That's
//! why the block ends in a data node, which is the only form that is not common in
//! regular text.
//!
//! Verbatim parsing is the only stateful parsing in the pipeline. It matches a subject
//! line, then either an indented container (in-flow) or flat lines (full-width/groups),
//! and requires the closing annotation at the same indentation as the subject.
//!
//! Verbatim blocks are tried first in the grammar pattern matching order, before any
//! other elements. This ensures that their non-Lex content doesn't interfere with the
//! parsing of the rest of the document.
//!
//! Content and the Indentation Wall
//!
//! Verbatim content can be pretty much anything, including space characters, which we
//! must neither interpret as indentation nor discard, since they are content. The way to
//! think about this is through the indentation wall.
//!
//! In-Flow Mode
//!
//! In this mode, the verbatim content is indented just like any other child content in
//! Lex, +1 level from its parent.
//!
//! Verbatim content starts at the wall (the subject's indentation + 1 level) and runs to
//! the end of the line. Whitespace characters past the wall are preserved as content.
//! Content cannot, however, start before the wall; otherwise there would be no way to
//! determine the end of the block.
//!
//! This logic allows for a neat trick: verbatim blocks do not need to quote any content.
//! Even if a line looks like a data node, the fact that it's not at the same level as
//! the subject means it's not the block's end marker.
//!
//! Example:
//!     I'm A verbatim Block Subject:
//!         |<- this is the indentation wall, that is the subject's + 1 level up
//!         I'm the first content line
//!           But content can be indented however I please
//! error ->|  as long as it's past the wall
//!     :: text ::
//!
//! Full-Width Mode
//!
//! At times, verbatim content is very wide, as in tables. In these cases, the various
//! indentation levels in the Lex document can consume valuable horizontal space, leaving
//! the content either hard to read or truncated by some tools.
//!
//! For these cases, full-width mode allows the content to take (almost) all columns. In
//! this mode, the wall is at user-facing column 2 (zero-based column 1), so content can
//! hug the left margin without looking like a closing annotation.
//!
//! Example:
//!  Here is the content.
//!  |<- this is the wall
//!
//! :: lex ::
//!
//! The block's mode is determined by the position of the first non-whitespace character
//! of the first content line. If it's at user-facing column 2, the block is full-width;
//! otherwise it's in-flow.
//!
//! The reason for column 2: column 1 would be indistinguishable from the subject's
//! indentation, while a full indent would lose horizontal space. Column 2 preserves
//! visual separation without looking like an error.
//!
//! Verbatim Groups
//!
//! Verbatim blocks support multiple subject/content pairs sharing a single closing
//! annotation. Use the `group()` iterator to access all pairs; the `subject` and
//! `children` fields expose only the first pair. See the spec for syntax and examples.
//!
//! Learn More:
//!
//! - Verbatim blocks spec: specs/v1/elements/verbatim.lex
//!

use super::super::range::{Position, Range};
use super::super::text_content::TextContent;
use super::super::traits::{AstNode, Container, Visitor, VisualStructure};
use super::annotation::Annotation;
use super::container::VerbatimContainer;
use super::content_item::ContentItem;
use super::data::Data;
use super::typed_content::VerbatimContent;
use std::fmt;
use std::slice;

/// Represents the mode of a verbatim block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VerbatimBlockMode {
    /// The block's content is indented relative to the subject line.
    Inflow,
    /// The block's content starts at a fixed, absolute column.
    Fullwidth,
}

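// A minimal sketch of the wall rules described in the module docs, not the parser's
// actual implementation. `wall_sketch`, `Mode`, `detect_mode`, `wall_column`,
// `strip_wall`, and the 4-space `INDENT_WIDTH` are all assumptions made for
// illustration; indentation is assumed to use ASCII spaces only.
#[allow(dead_code)]
mod wall_sketch {
    /// Assumed width of one indentation level, in spaces.
    pub const INDENT_WIDTH: usize = 4;

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Mode {
        Inflow,
        Fullwidth,
    }

    /// Picks the mode from the first content line: a first non-whitespace character
    /// at zero-based column 1 (user-facing column 2) means full-width.
    pub fn detect_mode(first_content_line: &str) -> Mode {
        match first_content_line.find(|c: char| !c.is_whitespace()) {
            Some(1) => Mode::Fullwidth,
            _ => Mode::Inflow,
        }
    }

    /// Zero-based column where content starts, for a subject indented by
    /// `subject_indent` spaces.
    pub fn wall_column(subject_indent: usize, mode: Mode) -> usize {
        match mode {
            Mode::Inflow => subject_indent + INDENT_WIDTH,
            Mode::Fullwidth => 1,
        }
    }

    /// Returns the part of `line` past the wall, keeping any whitespace after it.
    /// Yields `None` when non-whitespace text starts before the wall.
    pub fn strip_wall(line: &str, wall: usize) -> Option<&str> {
        if line.trim().is_empty() {
            return Some(""); // blank lines carry nothing past the wall
        }
        // `get` fails when the line is shorter than the wall, which for a
        // non-blank line means its text starts before the wall.
        let prefix = line.get(..wall)?;
        if prefix.bytes().all(|b| b == b' ') {
            Some(&line[wall..])
        } else {
            None
        }
    }
}
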
/// A verbatim block represents content from another format/system.
#[derive(Debug, Clone, PartialEq)]
pub struct Verbatim {
    /// Subject line of the first group (backwards-compatible direct access)
    pub subject: TextContent,
    /// Content lines of the first group (backwards-compatible direct access)
    pub children: VerbatimContainer,
    /// Closing data shared by all groups
    pub closing_data: Data,
    /// Annotations attached to this verbatim block
    pub annotations: Vec<Annotation>,
    /// Location spanning all groups and the closing data
    pub location: Range,
    /// The rendering mode of the verbatim block.
    pub mode: VerbatimBlockMode,
    /// Additional subject/content pairs beyond the first (for multi-group verbatims)
    additional_groups: Vec<VerbatimGroupItem>,
}

impl Verbatim {
    /// Placeholder location used until `at` assigns a real source range.
    fn default_location() -> Range {
        Range::new(0..0, Position::new(0, 0), Position::new(0, 0))
    }

    /// Creates a single-group verbatim block with the given content and mode.
    pub fn new(
        subject: TextContent,
        children: Vec<VerbatimContent>,
        closing_data: Data,
        mode: VerbatimBlockMode,
    ) -> Self {
        Self {
            subject,
            children: VerbatimContainer::from_typed(children),
            closing_data,
            annotations: Vec::new(),
            location: Self::default_location(),
            mode,
            additional_groups: Vec::new(),
        }
    }

    /// Creates an in-flow verbatim block with a subject and no content lines.
    pub fn with_subject(subject: String, closing_data: Data) -> Self {
        Self {
            subject: TextContent::from_string(subject, None),
            children: VerbatimContainer::empty(),
            closing_data,
            annotations: Vec::new(),
            location: Self::default_location(),
            mode: VerbatimBlockMode::Inflow,
            additional_groups: Vec::new(),
        }
    }

    /// Creates a content-less (marker) verbatim block. Currently builds the same value
    /// as `with_subject`.
    pub fn marker(subject: String, closing_data: Data) -> Self {
        Self {
            subject: TextContent::from_string(subject, None),
            children: VerbatimContainer::empty(),
            closing_data,
            annotations: Vec::new(),
            location: Self::default_location(),
            mode: VerbatimBlockMode::Inflow,
            additional_groups: Vec::new(),
        }
    }

    /// Sets the source location, builder style (the preferred way to attach a range).
    pub fn at(mut self, location: Range) -> Self {
        self.location = location;
        self
    }

    /// Attach additional verbatim group entries beyond the first pair.
    pub fn with_additional_groups(mut self, groups: Vec<VerbatimGroupItem>) -> Self {
        self.additional_groups = groups;
        self
    }

    /// Mutable access to the additional verbatim groups beyond the first.
    pub fn additional_groups_mut(&mut self) -> std::slice::IterMut<'_, VerbatimGroupItem> {
        self.additional_groups.iter_mut()
    }

    /// Annotations attached to this verbatim block.
    pub fn annotations(&self) -> &[Annotation] {
        &self.annotations
    }

    /// Mutable access to verbatim annotations.
    pub fn annotations_mut(&mut self) -> &mut Vec<Annotation> {
        &mut self.annotations
    }

    /// Iterate annotation blocks attached to this verbatim block.
    pub fn iter_annotations(&self) -> std::slice::Iter<'_, Annotation> {
        self.annotations.iter()
    }

    /// Iterate all content items nested inside verbatim annotations.
    pub fn iter_annotation_contents(&self) -> impl Iterator<Item = &ContentItem> {
        self.annotations
            .iter()
            .flat_map(|annotation| annotation.children())
    }

    /// Returns an iterator over each subject/content pair in the group order.
    pub fn group(&self) -> VerbatimGroupIter<'_> {
        VerbatimGroupIter {
            first_yielded: false,
            verbatim: self,
            rest: self.additional_groups.iter(),
        }
    }

    /// Returns the number of subject/content pairs held by this verbatim block.
    pub fn group_len(&self) -> usize {
        1 + self.additional_groups.len()
    }
}

impl AstNode for Verbatim {
    fn node_type(&self) -> &'static str {
        "VerbatimBlock"
    }
    fn display_label(&self) -> String {
        let subject_text = self.subject.as_string();
        // Count and cut by `char`, not by byte, so multi-byte subjects are never split.
        if subject_text.chars().count() > 50 {
            format!("{}…", subject_text.chars().take(50).collect::<String>())
        } else {
            subject_text.to_string()
        }
    }
    fn range(&self) -> &Range {
        &self.location
    }

    fn accept(&self, visitor: &mut dyn Visitor) {
        visitor.visit_verbatim_block(self);
        // Visit all groups, not just the first
        for group in self.group() {
            visitor.visit_verbatim_group(&group);
            super::super::traits::visit_children(visitor, group.children);
            visitor.leave_verbatim_group(&group);
        }
        visitor.leave_verbatim_block(self);
    }
}

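// A minimal sketch of the truncation rule used by `display_label` above, pulled out so
// it can be exercised on its own. `label_sketch` and `truncate_label` are hypothetical
// names, not part of this crate. Counting `chars()` rather than bytes keeps the cut on
// a character boundary for non-ASCII subjects.
#[allow(dead_code)]
mod label_sketch {
    /// Keeps at most `max_chars` characters of `text`, appending an ellipsis when
    /// anything was cut off.
    pub fn truncate_label(text: &str, max_chars: usize) -> String {
        if text.chars().count() > max_chars {
            format!("{}…", text.chars().take(max_chars).collect::<String>())
        } else {
            text.to_string()
        }
    }
}
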
impl VisualStructure for Verbatim {
    fn is_source_line_node(&self) -> bool {
        true
    }

    fn has_visual_header(&self) -> bool {
        true
    }
}

impl Container for Verbatim {
    fn label(&self) -> &str {
        self.subject.as_string()
    }

    fn children(&self) -> &[ContentItem] {
        &self.children
    }

    fn children_mut(&mut self) -> &mut Vec<ContentItem> {
        self.children.as_mut_vec()
    }
}

impl fmt::Display for Verbatim {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let group_count = self.group_len();
        let group_word = if group_count == 1 { "group" } else { "groups" };
        write!(
            f,
            "VerbatimBlock('{}', {} {}, closing: {})",
            self.subject.as_string(),
            group_count,
            group_word,
            self.closing_data.label.value
        )
    }
}

/// Stored representation of additional verbatim group entries
#[derive(Debug, Clone, PartialEq)]
pub struct VerbatimGroupItem {
    pub subject: TextContent,
    pub children: VerbatimContainer,
}

impl VerbatimGroupItem {
    pub fn new(subject: TextContent, children: Vec<VerbatimContent>) -> Self {
        Self {
            subject,
            children: VerbatimContainer::from_typed(children),
        }
    }
}

/// Immutable view over a verbatim group entry.
#[derive(Debug, Clone)]
pub struct VerbatimGroupItemRef<'a> {
    pub subject: &'a TextContent,
    pub children: &'a VerbatimContainer,
}

/// Iterator over all subject/content pairs inside a verbatim block.
pub struct VerbatimGroupIter<'a> {
    first_yielded: bool,
    verbatim: &'a Verbatim,
    rest: slice::Iter<'a, VerbatimGroupItem>,
}

impl<'a> Iterator for VerbatimGroupIter<'a> {
    type Item = VerbatimGroupItemRef<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        if !self.first_yielded {
            self.first_yielded = true;
            return Some(VerbatimGroupItemRef {
                subject: &self.verbatim.subject,
                children: &self.verbatim.children,
            });
        }

        self.rest.next().map(|item| VerbatimGroupItemRef {
            subject: &item.subject,
            children: &item.children,
        })
    }
}
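
// A minimal sketch of the "first pair inline, remaining pairs in a Vec" layout that
// `group()` walks, using simplified stand-in types. `group_sketch`, `Block`, and `Pair`
// are hypothetical and not part of this crate; the iterator above yields
// `VerbatimGroupItemRef` values, while this version yields plain tuples.
#[allow(dead_code)]
mod group_sketch {
    /// One additional subject/content pair.
    pub struct Pair {
        pub subject: String,
        pub lines: Vec<String>,
    }

    /// Stand-in for a verbatim block: the first pair is stored inline.
    pub struct Block {
        pub subject: String,
        pub lines: Vec<String>,
        pub additional: Vec<Pair>,
    }

    impl Block {
        /// Yields the inline first pair, then every additional pair, in order.
        pub fn groups(&self) -> impl Iterator<Item = (&str, &[String])> + '_ {
            std::iter::once((self.subject.as_str(), self.lines.as_slice())).chain(
                self.additional
                    .iter()
                    .map(|pair| (pair.subject.as_str(), pair.lines.as_slice())),
            )
        }

        /// Number of subject/content pairs, counting the inline first pair.
        pub fn group_len(&self) -> usize {
            1 + self.additional.len()
        }
    }
}
// Design note: `once(..).chain(..)` expresses the same order as the hand-written
// `first_yielded` flag; the explicit iterator type has the advantage of being nameable
// in public signatures.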