Parsing for Pandoc-style attributes: {#id .class key=value}
Attributes can appear after headings, fenced code blocks, fenced divs, etc. Syntax: {#identifier .class1 .class2 key1=val1 key2="val2"}
Rules:
- Surrounded by { }
- Identifier: #id (optional, only first one counts)
- Classes: .class (can have multiple)
- Key-value pairs: key=value or key="value" or key='value' (can have multiple)
- Whitespace flexible between items
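The rules above can be illustrated with a minimal, self-contained sketch. It is not this module's implementation: `SketchAttrs` and `parse_sketch` are hypothetical names, the struct layout is an assumption (the real `AttributeBlock` fields are not shown here), and returning `None` for an empty block is likewise an assumption.

```rust
/// Hypothetical stand-in for the parsed result; the real `AttributeBlock`
/// may be laid out differently.
#[derive(Debug, Default, PartialEq)]
pub struct SketchAttrs {
    pub identifier: Option<String>,
    pub classes: Vec<String>,
    pub key_values: Vec<(String, String)>,
}

/// Strip one pair of matching single or double quotes, if present.
fn unquote(value: &str) -> &str {
    for q in ['"', '\''] {
        if value.len() >= 2 && value.starts_with(q) && value.ends_with(q) {
            return &value[1..value.len() - 1];
        }
    }
    value
}

/// Parse `{#id .class key=value}`. Returns `None` when the input is not
/// wrapped in braces or contains no recognized item (assumed behavior).
pub fn parse_sketch(input: &str) -> Option<SketchAttrs> {
    let inner = input.trim().strip_prefix('{')?.strip_suffix('}')?;
    let mut attrs = SketchAttrs::default();
    let mut rest = inner.trim_start();

    while !rest.is_empty() {
        // A token ends at the first whitespace that is not inside quotes.
        let mut quote: Option<char> = None;
        let mut end = rest.len();
        for (i, c) in rest.char_indices() {
            match (quote, c) {
                (Some(q), _) if c == q => quote = None,
                (Some(_), _) => {}
                (None, '"') | (None, '\'') => quote = Some(c),
                (None, _) if c.is_whitespace() => {
                    end = i;
                    break;
                }
                (None, _) => {}
            }
        }
        let token = &rest[..end];
        rest = rest[end..].trim_start();

        if let Some(id) = token.strip_prefix('#') {
            // Only the first identifier counts; later ones are skipped.
            if attrs.identifier.is_none() && !id.is_empty() {
                attrs.identifier = Some(id.to_string());
            }
        } else if let Some(class) = token.strip_prefix('.') {
            if !class.is_empty() {
                attrs.classes.push(class.to_string());
            }
        } else if let Some((key, value)) = token.split_once('=') {
            if !key.is_empty() {
                attrs
                    .key_values
                    .push((key.to_string(), unquote(value).to_string()));
            }
        }
        // Any other token is malformed and ignored in this sketch.
    }

    if attrs == SketchAttrs::default() {
        None
    } else {
        Some(attrs)
    }
}
```

Under these assumptions, `parse_sketch("{#intro .note key=\"two words\"}")` yields the identifier `intro`, the class `note`, and the pair `key` / `two words`. Quoted values may contain whitespace because the tokenizer tracks the open quote.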
Structs
Functions
- `emit_attribute_node` - Emit a Pandoc `{...}` `ATTRIBUTE` node by STRUCTURING the raw source slice into `ATTR_*` children that wrap the original bytes (no synthesis). Markers and quotes stay inside their tokens; whitespace/newlines between components, and any bytes the scanner skips (duplicate `#id`, malformed tokens), become standalone `WHITESPACE`/`NEWLINE`/`TEXT` tokens, so `node.text()` is exactly the source slice. Non-`{...}`-shaped or unrecognized input (MMD `[#id]` header brackets, raw-inline `{=format}`, empty `{}`) falls back to a single opaque `ATTRIBUTE` token, preserving the prior shape.
- `emit_code_info_attrs` - Structure a code-block info-string region containing a `{...}` attribute block into `ATTR_*` children wrapping the source bytes, the same way `emit_attribute_node` does, but with a language carve-out: when `carve_first_class_as_language` is set, the first `.class` component is emitted as `TEXT "."` + `CODE_LANGUAGE <lang>` (Pandoc's `{.python …}` language-first shape) instead of an `ATTR_CLASS`.
- `emit_div_info_node` - Emit a fenced-div `DIV_INFO` node, structuring the Pandoc `{...}` body the same way `emit_attribute_node` does. Bare-word shorthand (`::: Warning`) and malformed/empty bodies fall back to a single opaque `TEXT` token, preserving the prior `DIV_INFO { TEXT(...) }` shape (and the bare-word class semantics the projector reads via `parse_div_info`).
- `emit_html_attrs_node` - Emit a structural `HTML_ATTRS` node, wrapping the source bytes of each recognized HTML attribute in `ATTR_ID`/`ATTR_CLASS`/`ATTR_KEY_VALUE` children (bare values, since HTML has no `#`/`.` marker). Bytes between/around components (names, `=`, quotes, whitespace, `/`) become gap tokens, so `node.text()` is exactly `attrs_text`. An unrecognized/empty body falls back to a single opaque `TEXT` token.
- `emit_html_span_attributes_node` - As `emit_html_attrs_node` but for the legacy native-span `SPAN_ATTRIBUTES` node, which carries HTML `class="..."` syntax (not Pandoc `{...}`).
- `emit_span_attributes_node` - Emit a bracketed-span `SPAN_ATTRIBUTES` node, structuring the Pandoc `{...}` body the same way `emit_attribute_node` does. Malformed/empty bodies fall back to a single opaque `TEXT` token, preserving the prior `SPAN_ATTRIBUTES { TEXT(...) }` shape.
- `parse_attribute_content` - Parse the content inside the attribute braces into owned strings. Thin wrapper over `attribute_content_spans` so detection and emission share one walk.
- `parse_html_attribute_list` - Parse a raw HTML attribute list (the bytes between a tag name and the closing `>`, exclusive). Accepts inputs like `id="x" class="a b" data-key=v` and produces an `AttributeBlock`. Returns `None` if no recognized attributes are present.
- `parse_html_tag_attributes` - Parse HTML-style attributes from a raw HTML opening tag text such as `<div id="x" class="a b" data-key="v">`, returning the same `AttributeBlock` shape as Pandoc-style brace attributes. Whitespace-separated `class="..."` is split into individual classes; `id="..."` becomes the identifier; everything else becomes a key/value pair. Returns `None` if the tag has no recognized attributes.
- `try_parse_trailing_attributes` - Try to parse an attribute block from the end of a string. Returns: (attribute_block, text_before_attributes)
- `try_parse_trailing_attributes_with_pos` - Try to parse an attribute block from the end of a string. Returns: (attribute_block, text_before_attributes, open_brace_position_in_trimmed_text)
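
As a rough illustration of the trailing-attribute idea described for the last two functions, the sketch below locates a `{...}` block at the end of a string and reports the text before it and the position of the opening brace. `split_trailing_block` is a hypothetical helper, not part of this module. It returns the raw block slice rather than a parsed attribute block, and it assumes the last `{` opens the block, so a quoted value containing `{` is not handled.

```rust
/// Hypothetical helper: split a trailing `{...}` block off the end of `text`.
/// Returns (raw_block, text_before_block, open_brace_position), where the
/// position is a byte offset into the right-trimmed input.
pub fn split_trailing_block(text: &str) -> Option<(&str, &str, usize)> {
    let trimmed = text.trim_end();
    if !trimmed.ends_with('}') {
        return None;
    }
    // Simplifying assumption: the last `{` opens the trailing block.
    let open = trimmed.rfind('{')?;
    let block = &trimmed[open..];
    let before = trimmed[..open].trim_end();
    Some((block, before, open))
}
```

For a heading line such as `# Heading {#intro .note}`, this returns the block `{#intro .note}`, the text `# Heading`, and the byte offset of `{` in the trimmed input. A fuller version would validate the block contents before accepting it.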