Inline IR for both CommonMark and Pandoc dialects.
The inline parsing pipeline runs in three passes over an intermediate representation (IR):
- Scan (build_ir): walk the source bytes once, producing a flat Vec<IrEvent>. Opaque higher-precedence constructs (escapes, code spans, autolinks, raw HTML, plus Pandoc math / native spans / inline footnotes / footnote references / citations / bracketed spans) are skipped past as a single IrEvent::Construct event whose source range is preserved for losslessness. Delimiter runs (* / _), bracket markers ([, ![, ]), soft line breaks, and plain text spans become distinct events.
- Process brackets (process_brackets), CommonMark §6.3: the bracket-stack algorithm walks ] markers left to right. For each ], the algorithm finds the nearest active opener and tries to resolve the pair as a link or image: inline [text](dest), full reference [text][label], collapsed [text][], or shortcut [text]. Under CommonMark, reference forms are validated against the document refdef map, and a successful match deactivates all earlier active openers (§6.3 “links may not contain other links”). Under Pandoc, reference forms resolve shape-only (any non-empty label) and the deactivation pass is skipped; outer-wins nested-link semantics are enforced by the emission walk’s suppress_inner_links flag instead.
- Process emphasis (process_emphasis_in_range): the classic delimiter-stack algorithm runs over the IrEvent::DelimRun events, pairing openers with closers and recording matches on the runs. It runs first scoped to each resolved bracket pair (innermost first), then as a top-level pass over the residual events. Each match consumes 1 or 2 inner-edge bytes from each side; leftover bytes fall through to literal text. Dialect gates (Pandoc flanking rules, mod-3 rejection, asymmetric (1,2)/(2,1) rejection, opener-count >= 4 rejection, triple-emph nesting flip, cascade-then-rerun) branch on the dialect parameter.
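As a rough illustration of the scan pass, the sketch below walks a string once and produces a flat, lossless event list. It is not this module's code: SketchEvent, scan, run_len, and find_closing_backticks are hypothetical names, and the only opaque constructs it recognises are backslash escapes and code spans. The real build_ir handles many more constructs and is dialect-aware.

```rust
use std::ops::Range;

/// Hypothetical, simplified stand-in for `IrEvent` (not the crate's type).
#[derive(Debug, Clone, PartialEq)]
enum SketchEvent {
    Text(Range<usize>),
    DelimRun { ch: u8, range: Range<usize> },
    OpenBracket { is_image: bool, range: Range<usize> },
    CloseBracket(Range<usize>),
    /// Opaque construct; this sketch only knows escapes and code spans.
    Construct(Range<usize>),
}

impl SketchEvent {
    /// Source byte range covered by this event.
    fn range(&self) -> &Range<usize> {
        match self {
            SketchEvent::Text(r) | SketchEvent::CloseBracket(r) | SketchEvent::Construct(r) => r,
            SketchEvent::DelimRun { range, .. } | SketchEvent::OpenBracket { range, .. } => range,
        }
    }
}

/// Length of the run of `ch` bytes starting at `at`.
fn run_len(b: &[u8], at: usize, ch: u8) -> usize {
    b[at..].iter().take_while(|&&c| c == ch).count()
}

/// Find a backtick run of exactly `len` bytes at or after `i`; return its end offset.
fn find_closing_backticks(b: &[u8], mut i: usize, len: usize) -> Option<usize> {
    while i < b.len() {
        if b[i] == b'`' {
            let n = run_len(b, i, b'`');
            if n == len {
                return Some(i + n);
            }
            i += n;
        } else {
            i += 1;
        }
    }
    None
}

/// Single left-to-right pass; every source byte ends up in exactly one event.
fn scan(src: &str) -> Vec<SketchEvent> {
    let b = src.as_bytes();
    let mut events = Vec::new();
    let mut text_start = 0; // start of the pending plain-text span
    let mut i = 0;
    while i < b.len() {
        let start = i;
        let event = match b[i] {
            b'\\' if i + 1 < b.len() && b[i + 1].is_ascii_punctuation() => {
                i += 2;
                SketchEvent::Construct(start..i)
            }
            b'`' => {
                let open_len = run_len(b, i, b'`');
                match find_closing_backticks(b, i + open_len, open_len) {
                    Some(end) => {
                        i = end;
                        SketchEvent::Construct(start..i)
                    }
                    None => {
                        // Unclosed backticks stay in the pending text span.
                        i += open_len;
                        continue;
                    }
                }
            }
            b'*' | b'_' => {
                let ch = b[start];
                i += run_len(b, i, ch);
                SketchEvent::DelimRun { ch, range: start..i }
            }
            b'!' if i + 1 < b.len() && b[i + 1] == b'[' => {
                i += 2;
                SketchEvent::OpenBracket { is_image: true, range: start..i }
            }
            b'[' => {
                i += 1;
                SketchEvent::OpenBracket { is_image: false, range: start..i }
            }
            b']' => {
                i += 1;
                SketchEvent::CloseBracket(start..i)
            }
            _ => {
                i += 1;
                continue;
            }
        };
        // Flush the text accumulated before this event, then the event itself.
        if text_start < start {
            events.push(SketchEvent::Text(text_start..start));
        }
        events.push(event);
        text_start = i;
    }
    if text_start < b.len() {
        events.push(SketchEvent::Text(text_start..b.len()));
    }
    events
}
```

Because the delimiter inside the code span is swallowed by a single Construct event, later passes never see it as a candidate emphasis delimiter, which is the point of giving opaque constructs precedence during the scan.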
The emission walk in [super::core::parse_inline_range_impl] consumes
three byte-keyed plans built by build_full_plans: an
EmphasisPlan for delim-run dispositions, a BracketPlan for
resolved link/image bracket pairs, and a ConstructPlan for
standalone Pandoc constructs (inline footnotes, native spans, footnote
references, citations, bracketed spans). Matched delim runs become
EMPHASIS / STRONG nodes; matched bracket pairs become LINK /
IMAGE nodes via the dispatcher’s try_parse_* recognizers (called
to parse a matched range, not to resolve it). Unmatched delims and
brackets fall through to plain text.
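The idea of a byte-keyed plan can be sketched as a map from byte offset to disposition that the walk consults at each position. The example below is an assumption-laden illustration, not the module's API: Kind, Dispo, and emit are hypothetical names, the plan is a plain HashMap, and the output is an HTML-like string rather than CST nodes.

```rust
use std::collections::HashMap;

/// Hypothetical emphasis kinds for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Emph,
    Strong,
}

/// Hypothetical disposition of the delimiter bytes starting at a given offset.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Dispo {
    Open { kind: Kind, len: usize },
    Close { kind: Kind, len: usize },
}

fn tag(kind: Kind) -> &'static str {
    match kind {
        Kind::Emph => "em",
        Kind::Strong => "strong",
    }
}

/// Walk the source once; bytes with no plan entry fall through as literal text.
fn emit(src: &str, plan: &HashMap<usize, Dispo>) -> String {
    let mut out = String::new();
    let mut i = 0;
    while i < src.len() {
        match plan.get(&i).copied() {
            Some(Dispo::Open { kind, len }) => {
                out.push('<');
                out.push_str(tag(kind));
                out.push('>');
                i += len; // skip the consumed delimiter bytes
            }
            Some(Dispo::Close { kind, len }) => {
                out.push_str("</");
                out.push_str(tag(kind));
                out.push('>');
                i += len;
            }
            None => {
                // Advance by one whole character so slicing stays on a char boundary.
                let ch = src[i..].chars().next().unwrap();
                out.push(ch);
                i += ch.len_utf8();
            }
        }
    }
    out
}
```

In this sketch the emission walk makes no matching decisions of its own: an unmatched delimiter simply has no entry in the plan and is copied through as text.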
Structs
- BracketPlan: A byte-keyed view of the IR’s bracket resolutions.
- BracketResolution: Successful bracket resolution: the […] pair is a link or image.
- ConstructPlan: A byte-keyed view of the IR’s standalone Pandoc constructs that the emission walk consumes directly: inline footnotes, native spans, footnote references, bracketed citations, bare citations, and bracketed spans. Recognition is authoritative in build_ir under Dialect::Pandoc; the dispatcher’s legacy branches for these constructs (^[, <span>, [^id], [@cite], @cite / -@cite, [text]{attrs}) are gated to Dialect::CommonMark only and only fire when the relevant extension is explicitly enabled.
- DelimMatch: One matched fragment within an IrEvent::DelimRun.
- EmphasisPlan: Byte-keyed disposition map for * / _ delimiter chars produced by the IR’s emphasis pass and consumed by the inline emission walk.
- InlinePlans: Bundle of plans produced by build_full_plans and consumed by the inline emission walk.
Enums
- BracketDispo: Disposition of a single bracket byte after process_brackets.
- ConstructDispo: A standalone Pandoc inline construct recognised by build_ir and dispatched directly from the emission walk. Carries the construct’s full source range so the emission walk can slice the content for the existing emit_* helpers without re-running the recognition.
- ConstructKind: Categorical tag for an IrEvent::Construct event so emission knows which parser to call to rebuild the CST subtree.
- DelimChar: Disposition of a single delimiter byte after emphasis resolution.
- EmphasisKind
- IrEvent: One event in the inline IR.
- LinkKind: What kind of link/image we resolved a bracket pair to.
Functions
- build_bracket_plan: Build a BracketPlan from the resolved IR. Each OpenBracket resolution becomes a BracketDispo::Open keyed at the opener’s start byte. Unresolved openers and unmatched closers become BracketDispo::Literal so the emission path can recognise them without re-parsing.
- build_construct_plan: Build a ConstructPlan from the resolved IR. Each Construct { kind: InlineFootnote | NativeSpan, .. } becomes one entry keyed at its start byte.
- build_emphasis_plan: Convert the IR’s delim-run match decisions into an EmphasisPlan, preserving the byte-keyed disposition shape the existing emission walk consumes.
- build_full_plans: One-shot helper: build the IR, run all passes, and return the bundled InlinePlans (emphasis dispositions, bracket resolutions, and standalone Pandoc constructs), packaged together so the inline emission path can consume them in one go for either dialect.
- build_ir: Scan text[start..end] once, producing a flat IR of events.
- process_brackets: Resolve [ / ![ / ] markers into link/image nodes per CommonMark §6.3 (with Pandoc-aware variations under Dialect::Pandoc).
- process_emphasis: Run the CommonMark §6.3 process_emphasis algorithm over the IR’s delim runs. Mutates the IR in place: matched runs gain entries in their matches vec, unmatched bytes stay implicit (the emission pass treats any byte not covered by a match as literal text).
- process_emphasis_in_range: Range-scoped variant of process_emphasis.
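The way an emphasis pass can consume 1 or 2 inner-edge bytes per match, and leave the rest as literal text, can be illustrated with a much-reduced pairing loop. This is a sketch under stated assumptions, not the module's implementation: Run, Match, and pair_runs are hypothetical names, the can_open / can_close flags are taken as precomputed (no flanking rules), and the mod-3 rule and all dialect gates are omitted.

```rust
/// Hypothetical delimiter run for this sketch (not the crate's `DelimRun`).
#[derive(Debug, Clone, PartialEq)]
struct Run {
    ch: u8,
    start: usize, // byte offset of the first delimiter byte
    len: usize,   // number of delimiter bytes in the run
    can_open: bool,
    can_close: bool,
    used_front: usize, // bytes consumed from the left edge (run acted as closer)
    used_back: usize,  // bytes consumed from the right edge (run acted as opener)
}

impl Run {
    fn new(ch: u8, start: usize, len: usize, can_open: bool, can_close: bool) -> Self {
        Run { ch, start, len, can_open, can_close, used_front: 0, used_back: 0 }
    }

    /// Delimiter bytes not yet consumed by any match.
    fn remaining(&self) -> usize {
        self.len - self.used_front - self.used_back
    }
}

/// One matched pair; `width` 1 means emphasis, 2 means strong.
#[derive(Debug, PartialEq)]
struct Match {
    open_at: usize,
    close_at: usize,
    width: usize,
}

/// Pair each closer with the nearest earlier opener of the same character.
fn pair_runs(runs: &mut [Run]) -> Vec<Match> {
    let mut matches = Vec::new();
    for c in 0..runs.len() {
        if !runs[c].can_close {
            continue;
        }
        let ch = runs[c].ch;
        while runs[c].remaining() > 0 {
            let found = (0..c)
                .rev()
                .find(|&o| runs[o].can_open && runs[o].ch == ch && runs[o].remaining() > 0);
            let o = match found {
                Some(o) => o,
                None => break,
            };
            // Strong needs two free bytes on both sides; otherwise plain emphasis.
            let width = if runs[o].remaining() >= 2 && runs[c].remaining() >= 2 { 2 } else { 1 };
            // The opener gives up bytes from its right (inner) edge...
            runs[o].used_back += width;
            let open_at = runs[o].start + runs[o].len - runs[o].used_back;
            // ...and the closer from its left (inner) edge.
            let close_at = runs[c].start + runs[c].used_front;
            runs[c].used_front += width;
            matches.push(Match { open_at, close_at, width });
            // Runs enclosed by this match can no longer open across it.
            for r in &mut runs[o + 1..c] {
                r.can_open = false;
            }
        }
    }
    matches
}
```

For an input such as ***a*, this sketch records a single width-1 match on the innermost byte of the opening run and leaves two bytes of that run unconsumed; a consumer that treats uncovered bytes as literal text would then render them verbatim.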