// lex_core/lex/ast/elements/verbatim.rs

//! Verbatim block element
//!
//!     A verbatim block embeds content that is not Lex-formatted. This can be any binary
//!     encoded data, such as images or videos, or text in another formal language, most
//!     commonly code in a programming language. Since the whole point of the element is to
//!     say "hands off, do not parse this, just preserve it", you'd think that it would be a
//!     simple element, but in reality this is by far the most complex element in Lex, and it
//!     warrants some explanation.
//!
//!     Note that a verbatim block can forgo content altogether (e.g. binaries won't encode
//!     content).
//!
//! Structure
//!
//!     - subject: The lead item identifying what the verbatim block contains
//!     - children: VerbatimLine nodes containing the actual content (can be empty)
//!     - closing_data: The closing marker (format: `:: label params? ::`)
//!
//!     The subject introduces what the content is, and the closing data terminates the block.
//!     The data node carries the label and parameters describing the payload. By convention,
//!     if the content is to be interpreted by a tool, the label should be the name of the
//!     tool or language. Lex will not parse the content, but it preserves it exactly as it
//!     is, and the label can be used by editors and other tools to format the content.
//!
//! Syntax
//!
//!     <subject-line>
//!     <indent> <content> ... any number of content elements
//!     <dedent> <data>
//!
//! Parsing Structure:
//!
//! | Element  | Prec. Blank | Head        | Blank    | Content  | Tail            |
//! |----------|-------------|-------------|----------|----------|-----------------|
//! | Verbatim | Optional    | SubjectLine | Optional | Optional | dedent+DataLine |
//!
//! Parsing Verbatim Blocks
//!
//!     The first point is that, since it can hold non-Lex content, its content can't be
//!     parsed. It can be lexed without prejudice, but not parsed. Not only would the result
//!     be gibberish, but worse, if it triggered indent and dedent events, it would throw off
//!     the parsing and break the document.
//!
//!     This has two consequences: verbatim parsing must come first, lest its content wreak
//!     havoc on the structure, and identifying its end marker has to be very easy. That's
//!     the reason why it ends in a data node, which is the only form that is uncommon in
//!     regular text.
//!
//!     Verbatim parsing is the only stateful parsing in the pipeline. It matches a
//!     subject line, then either an indented container (in-flow) or flat lines
//!     (full-width/groups), and requires the closing annotation at the same indentation as
//!     the subject.
//!
//!     Verbatim blocks are tried first in the grammar pattern matching order, before any
//!     other elements. This ensures that their non-Lex content doesn't interfere with the
//!     parsing of the rest of the document.
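//!
/*!
The end-marker rule described above can be illustrated with a minimal, self-contained sketch.
It is not the parser used by this crate: `indent_of`, `looks_like_data_line` and
`find_closing_line` are hypothetical helpers, indentation is assumed to use spaces only, and a
data line is recognised purely by its `:: label params? ::` shape.

```rust
/// Number of leading spaces on a line (assumption: indentation uses spaces only).
fn indent_of(line: &str) -> usize {
    line.len() - line.trim_start_matches(' ').len()
}

/// True if the line has the `:: label params? ::` shape with a non-empty label.
/// Only the shape is checked; label and parameters are not validated.
fn looks_like_data_line(line: &str) -> bool {
    line.trim()
        .strip_prefix(":: ")
        .and_then(|rest| rest.strip_suffix(" ::"))
        .map_or(false, |inner| !inner.trim().is_empty())
}

/// Index of the first line after the subject that closes the block: it must look
/// like a data line and sit at exactly the subject's indentation.
fn find_closing_line(lines: &[&str], subject_idx: usize) -> Option<usize> {
    let subject_indent = indent_of(lines.get(subject_idx)?);
    for (idx, line) in lines.iter().enumerate().skip(subject_idx + 1) {
        if indent_of(line) == subject_indent && looks_like_data_line(line) {
            return Some(idx);
        }
    }
    None
}
```

Given the lines `["Subject:", "    :: not the end ::", "    code", ":: text ::"]` with the subject
at index 0, `find_closing_line` returns `Some(3)`: the indented look-alike is treated as content
because it is not at the subject's indentation.
*/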
//!
//! Content and the Indentation Wall
//!
//!     Verbatim content can be pretty much anything, and that includes any space characters,
//!     which we must neither interpret as indentation nor discard, as they are content. The
//!     way to think about this is through the indentation wall.
//!
//! In-Flow Mode
//!
//!     In this mode, called In-Flow Mode, the verbatim content is indented just like any
//!     other child content in Lex: one level deeper than its parent.
//!
//!     Verbatim content starts at the wall (the subject's indentation + 1 level) and runs
//!     until the end of the line. Whitespace characters past the wall are preserved as
//!     content. Content cannot, however, start before the wall; otherwise we would have no
//!     way to determine the end of the block.
//!
//!     This logic allows for a neat trick: verbatim blocks do not need to quote any
//!     content. Even if a line looks like a data node, the fact that it's not at the same
//!     level as the subject means it's not the block's end marker.
//!
//!     Example:
//!         I'm A verbatim Block Subject:
//!             |<- this is the indentation wall, that is, the subject's indentation + 1 level
//!             I'm the first content line
//!             But content can be indented however I please
//!     error ->| as long as it's past the wall
//!         :: text ::
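//!
/*!
As a rough illustration of the wall, the sketch below computes the wall column for an in-flow
block and slices each content line at it. It is a simplified model under stated assumptions, not
this crate's implementation: `INDENT_WIDTH`, `inflow_wall` and `content_past_wall` are
hypothetical names, one indentation level is assumed to be four spaces, and indentation is
assumed to use spaces only.

```rust
/// Assumption: one indentation level is four spaces, as in the example above.
const INDENT_WIDTH: usize = 4;

/// Zero-based column of the wall for an in-flow block: subject indentation + 1 level.
fn inflow_wall(subject_line: &str) -> usize {
    let subject_indent = subject_line.len() - subject_line.trim_start_matches(' ').len();
    subject_indent + INDENT_WIDTH
}

/// Returns the content of `line` from the wall onward, keeping any extra whitespace.
/// Returns `None` when text starts before the wall.
fn content_past_wall(line: &str, wall: usize) -> Option<&str> {
    match line.get(..wall) {
        // Everything before the wall is indentation: the rest is content.
        Some(prefix) if prefix.bytes().all(|b| b == b' ') => Some(&line[wall..]),
        // Some text sits before the wall.
        Some(_) => None,
        // The line is shorter than the wall (or the wall would split a multi-byte
        // character): accept it only if it is blank.
        None if line.trim().is_empty() => Some(""),
        None => None,
    }
}
```

For a subject indented by four spaces the wall is at column 8, so a line with twelve leading
spaces keeps four of them as content, while a line whose text starts at column 4 is rejected.
*/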
//!
//! Full-Width Mode
//!
//!     At times, verbatim content is very wide, as in tables. In these cases, the various
//!     indentation levels in the Lex document can consume valuable horizontal space, leaving
//!     the content either hard to read or truncated by some tools.
//!
//!     For these cases, the full-width mode allows the content to take (almost) all columns.
//!     In this mode, the wall is at user-facing column 2 (zero-based column 1), so content
//!     can hug the left margin without looking like a closing annotation.
//!
//!     Example:
//!   Here is the content.
//!   |<- this is the wall
//!
//!             :: lex ::
//!
//!     The block's mode is determined by the position of the first non-whitespace character
//!     of the first content line. If it's at user-facing column 2, it's a full-width mode
//!     block; otherwise it's in-flow.
//!
//!     The reason for column 2: column 1 would be indistinguishable from the subject's
//!     indentation, while a full indent would lose horizontal space. Column 2 preserves
//!     visual separation without looking like an error.
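//!
/*!
The mode rule above can be sketched as a small function. This is an illustration only: `Mode`
is a stand-in for `VerbatimBlockMode` so that the snippet compiles on its own, and
`FULLWIDTH_WALL` and `detect_mode` are hypothetical names. Columns are counted in characters,
which matches the description as long as the leading whitespace is made of single-column
characters.

```rust
/// Stand-in for `VerbatimBlockMode`, so the sketch compiles on its own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Inflow,
    Fullwidth,
}

/// Zero-based column of the full-width wall (user-facing column 2).
const FULLWIDTH_WALL: usize = 1;

/// Picks the mode from the column of the first non-whitespace character of the
/// first content line.
fn detect_mode(first_content_line: &str) -> Mode {
    match first_content_line.chars().position(|c| !c.is_whitespace()) {
        Some(FULLWIDTH_WALL) => Mode::Fullwidth,
        _ => Mode::Inflow,
    }
}
```

A first content line with exactly one leading space is classified as full-width; anything else,
including a blank line, falls back to in-flow in this sketch.
*/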
//!
//! Verbatim Groups
//!
//!     Verbatim blocks support multiple subject/content pairs sharing a single closing
//!     annotation. This is a special-casing rule: all groups are terminated by one closing
//!     annotation marker. Use the `group()` iterator to access all pairs. See the spec for
//!     syntax and examples.
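//!
/*!
The storage layout behind groups (first pair inline, remaining pairs in a list) can be modelled
with a small stand-alone sketch. `Group`, `Block` and `groups` are hypothetical stand-ins, not
this crate's types, and the sketch uses `std::iter::once` with `chain` instead of the
hand-written `VerbatimGroupIter` defined in this file.

```rust
/// Hypothetical stand-in for one additional subject/content pair.
struct Group {
    subject: String,
    lines: Vec<String>,
}

/// Hypothetical stand-in for a block: the first pair is stored inline.
struct Block {
    subject: String,
    lines: Vec<String>,
    additional: Vec<Group>,
}

/// Yields every subject/content pair in order: the inline first pair, then the rest.
fn groups<'a>(block: &'a Block) -> impl Iterator<Item = (&'a str, &'a [String])> + 'a {
    let first = (block.subject.as_str(), block.lines.as_slice());
    let rest = block
        .additional
        .iter()
        .map(|g| (g.subject.as_str(), g.lines.as_slice()));
    std::iter::once(first).chain(rest)
}
```

Keeping the first pair inline means a single-group block needs no extra allocation, and the
number of pairs is always one plus the length of the additional list.
*/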
//!
//! Learn More:
//!
//!     - Verbatim blocks spec: specs/v1/elements/verbatim.lex
//!

use super::super::range::{Position, Range};
use super::super::text_content::TextContent;
use super::super::traits::{AstNode, Container, Visitor, VisualStructure};
use super::annotation::Annotation;
use super::container::VerbatimContainer;
use super::content_item::ContentItem;
use super::data::Data;
use super::typed_content::VerbatimContent;
use std::fmt;
use std::slice;

/// Represents the mode of a verbatim block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VerbatimBlockMode {
    /// The block's content is indented relative to the subject line.
    Inflow,
    /// The block's content starts at a fixed, absolute column.
    Fullwidth,
}

/// A verbatim block represents content from another format/system.
#[derive(Debug, Clone, PartialEq)]
pub struct Verbatim {
    /// Subject line of the first group (backwards-compatible direct access)
    pub subject: TextContent,
    /// Content lines of the first group (backwards-compatible direct access)
    pub children: VerbatimContainer,
    /// Closing data shared by all groups
    pub closing_data: Data,
    /// Annotations attached to this verbatim block
    pub annotations: Vec<Annotation>,
    /// Location spanning all groups and the closing data
    pub location: Range,
    /// The rendering mode of the verbatim block.
    pub mode: VerbatimBlockMode,
    /// Additional subject/content pairs beyond the first (for multi-group verbatims)
    additional_groups: Vec<VerbatimGroupItem>,
}

impl Verbatim {
    /// Placeholder location (an empty range at 0:0) used until `at` sets the real one.
    fn default_location() -> Range {
        Range::new(0..0, Position::new(0, 0), Position::new(0, 0))
    }

    /// Creates a verbatim block from its subject, content lines, closing data, and mode.
    /// The block starts with no annotations, no additional groups, and a placeholder location.
    pub fn new(
        subject: TextContent,
        children: Vec<VerbatimContent>,
        closing_data: Data,
        mode: VerbatimBlockMode,
    ) -> Self {
        Self {
            subject,
            children: VerbatimContainer::from_typed(children),
            closing_data,
            annotations: Vec::new(),
            location: Self::default_location(),
            mode,
            additional_groups: Vec::new(),
        }
    }

    /// Creates an in-flow verbatim block with no content lines from a plain-string subject.
    pub fn with_subject(subject: String, closing_data: Data) -> Self {
        Self {
            subject: TextContent::from_string(subject, None),
            children: VerbatimContainer::empty(),
            closing_data,
            annotations: Vec::new(),
            location: Self::default_location(),
            mode: VerbatimBlockMode::Inflow,
            additional_groups: Vec::new(),
        }
    }

    /// Creates a content-less in-flow block. Currently identical to `with_subject`.
    pub fn marker(subject: String, closing_data: Data) -> Self {
        Self {
            subject: TextContent::from_string(subject, None),
            children: VerbatimContainer::empty(),
            closing_data,
            annotations: Vec::new(),
            location: Self::default_location(),
            mode: VerbatimBlockMode::Inflow,
            additional_groups: Vec::new(),
        }
    }

    /// Preferred builder: sets the source location and returns the block.
    pub fn at(mut self, location: Range) -> Self {
        self.location = location;
        self
    }

    /// Attach additional verbatim group entries beyond the first pair.
    pub fn with_additional_groups(mut self, groups: Vec<VerbatimGroupItem>) -> Self {
        self.additional_groups = groups;
        self
    }

    /// Mutable access to the additional verbatim groups beyond the first.
    pub fn additional_groups_mut(&mut self) -> std::slice::IterMut<'_, VerbatimGroupItem> {
        self.additional_groups.iter_mut()
    }

    /// Annotations attached to this verbatim block.
    pub fn annotations(&self) -> &[Annotation] {
        &self.annotations
    }

    /// Mutable access to verbatim annotations.
    pub fn annotations_mut(&mut self) -> &mut Vec<Annotation> {
        &mut self.annotations
    }

    /// Iterate annotation blocks attached to this verbatim block.
    pub fn iter_annotations(&self) -> std::slice::Iter<'_, Annotation> {
        self.annotations.iter()
    }

    /// Iterate all content items nested inside verbatim annotations.
    pub fn iter_annotation_contents(&self) -> impl Iterator<Item = &ContentItem> {
        self.annotations
            .iter()
            .flat_map(|annotation| annotation.children())
    }

    /// Returns an iterator over each subject/content pair in the group order.
    pub fn group(&self) -> VerbatimGroupIter<'_> {
        VerbatimGroupIter {
            first_yielded: false,
            verbatim: self,
            rest: self.additional_groups.iter(),
        }
    }

    /// Returns the number of subject/content pairs held by this verbatim block.
    pub fn group_len(&self) -> usize {
        1 + self.additional_groups.len()
    }
}

impl AstNode for Verbatim {
    fn node_type(&self) -> &'static str {
        "VerbatimBlock"
    }
    fn display_label(&self) -> String {
        let subject_text = self.subject.as_string();
        // Truncate long subjects to 50 characters (counted as `char`s, not bytes).
        if subject_text.chars().count() > 50 {
            format!("{}…", subject_text.chars().take(50).collect::<String>())
        } else {
            subject_text.to_string()
        }
    }
    fn range(&self) -> &Range {
        &self.location
    }

    fn accept(&self, visitor: &mut dyn Visitor) {
        visitor.visit_verbatim_block(self);
        // Visit all groups, not just the first
        for group in self.group() {
            visitor.visit_verbatim_group(&group);
            super::super::traits::visit_children(visitor, group.children);
            visitor.leave_verbatim_group(&group);
        }
        visitor.leave_verbatim_block(self);
    }
}

impl VisualStructure for Verbatim {
    fn is_source_line_node(&self) -> bool {
        true
    }

    fn has_visual_header(&self) -> bool {
        true
    }
}

impl Container for Verbatim {
    fn label(&self) -> &str {
        self.subject.as_string()
    }

    // Only the first group's content is exposed through the `Container` trait;
    // use `group()` to reach the additional groups.
    fn children(&self) -> &[ContentItem] {
        &self.children
    }

    fn children_mut(&mut self) -> &mut Vec<ContentItem> {
        self.children.as_mut_vec()
    }
}

impl fmt::Display for Verbatim {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let group_count = self.group_len();
        let group_word = if group_count == 1 { "group" } else { "groups" };
        write!(
            f,
            "VerbatimBlock('{}', {} {}, closing: {})",
            self.subject.as_string(),
            group_count,
            group_word,
            self.closing_data.label.value
        )
    }
}

/// Stored representation of additional verbatim group entries
#[derive(Debug, Clone, PartialEq)]
pub struct VerbatimGroupItem {
    pub subject: TextContent,
    pub children: VerbatimContainer,
}

impl VerbatimGroupItem {
    pub fn new(subject: TextContent, children: Vec<VerbatimContent>) -> Self {
        Self {
            subject,
            children: VerbatimContainer::from_typed(children),
        }
    }
}

/// Immutable view over a verbatim group entry.
#[derive(Debug, Clone)]
pub struct VerbatimGroupItemRef<'a> {
    pub subject: &'a TextContent,
    pub children: &'a VerbatimContainer,
}

/// Iterator over all subject/content pairs inside a verbatim block.
pub struct VerbatimGroupIter<'a> {
    first_yielded: bool,
    verbatim: &'a Verbatim,
    rest: slice::Iter<'a, VerbatimGroupItem>,
}

impl<'a> Iterator for VerbatimGroupIter<'a> {
    type Item = VerbatimGroupItemRef<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        // The first group is stored directly on the block, so yield it once.
        if !self.first_yielded {
            self.first_yielded = true;
            return Some(VerbatimGroupItemRef {
                subject: &self.verbatim.subject,
                children: &self.verbatim.children,
            });
        }

        // Every later group comes from `additional_groups`.
        self.rest.next().map(|item| VerbatimGroupItemRef {
            subject: &item.subject,
            children: &item.children,
        })
    }
}