#[repr(u8)]
pub enum State {
Data = 0,
TagOpen = 1,
TagName = 2,
EndTagOpen = 3,
EndTagName = 4,
BeforeAttrName = 5,
AttrName = 6,
AfterAttrName = 7,
BeforeAttrValue = 8,
AttrValueQuoted = 9,
AttrValueUnquoted = 10,
SelfClosingStartTag = 11,
MarkupDecl = 12,
Comment = 13,
CommentEndDash = 14,
CommentEnd = 15,
Doctype = 16,
CData = 17,
RawText = 18,
}
Tokenizer states. Models the subset of HTML5 tokenizer states relevant to our structural-index-driven approach.
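Because the enum is `#[repr(u8)]` with explicit discriminants, each state occupies a single byte and can be cast to `u8` directly. The sketch below illustrates that property on a standalone copy of the enum. The derives and the `state_from_u8` helper are assumptions made for illustration; they are not shown in the documented API.

```rust
/// Standalone copy of the documented enum so the sketch compiles on its own.
/// The derives are an assumption; the original declaration does not show any.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Data = 0,
    TagOpen = 1,
    TagName = 2,
    EndTagOpen = 3,
    EndTagName = 4,
    BeforeAttrName = 5,
    AttrName = 6,
    AfterAttrName = 7,
    BeforeAttrValue = 8,
    AttrValueQuoted = 9,
    AttrValueUnquoted = 10,
    SelfClosingStartTag = 11,
    MarkupDecl = 12,
    Comment = 13,
    CommentEndDash = 14,
    CommentEnd = 15,
    Doctype = 16,
    CData = 17,
    RawText = 18,
}

/// Hypothetical helper: map a raw byte back to a `State`.
/// Returns `None` for bytes that are not a valid discriminant.
pub fn state_from_u8(raw: u8) -> Option<State> {
    use State::*;
    // Listed in discriminant order, so the array index equals the discriminant.
    const ALL: [State; 19] = [
        Data, TagOpen, TagName, EndTagOpen, EndTagName,
        BeforeAttrName, AttrName, AfterAttrName, BeforeAttrValue,
        AttrValueQuoted, AttrValueUnquoted, SelfClosingStartTag,
        MarkupDecl, Comment, CommentEndDash, CommentEnd,
        Doctype, CData, RawText,
    ];
    ALL.get(raw as usize).copied()
}
```

With these definitions, `State::TagOpen as u8` evaluates to `1`, and `state_from_u8(19)` returns `None` because 18 is the largest discriminant.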
Variants
Data = 0
Outside any tag — consuming text content.
TagOpen = 1
Saw < — deciding if open tag, close tag, comment, or doctype.
TagName = 2
Inside an open tag name (e.g. reading div in <div).
EndTagOpen = 3
Saw </ — expecting a close tag name.
EndTagName = 4
Inside a close tag name.
BeforeAttrName = 5
After tag name, before attribute name or >.
AttrName = 6
Inside an attribute name.
AfterAttrName = 7
After attribute name, before = or next attribute.
BeforeAttrValue = 8
Saw = after attribute name — expecting value.
AttrValueQuoted = 9
Inside a quoted attribute value.
AttrValueUnquoted = 10
Inside an unquoted attribute value.
SelfClosingStartTag = 11
Saw / inside a tag — expecting > for self-closing.
MarkupDecl = 12
Inside <! — detecting comment vs doctype vs CDATA.
Comment = 13
Inside <!-- comment body.
CommentEndDash = 14
Saw the first - of a possible comment end (-->).
CommentEnd = 15
Saw -- at end of comment.
Doctype = 16
Inside <!DOCTYPE content.
CData = 17
Inside <![CDATA[ content.
RawText = 18
Inside raw text elements (<script>, <style>).
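To make the variant descriptions concrete, here is a minimal sketch of a byte-at-a-time transition function over a standalone copy of the enum. It is a simplified illustration under stated assumptions, not the crate's actual tokenizer: `step` and `run_from` are hypothetical names, only double-quoted attribute values are recognized, and `MarkupDecl`, `Doctype`, `CData` and `RawText` are left unchanged because they need lookahead or tag-name context that a single-byte step does not have.

```rust
/// Standalone copy of the documented enum (derives are an assumption).
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Data = 0, TagOpen = 1, TagName = 2, EndTagOpen = 3, EndTagName = 4,
    BeforeAttrName = 5, AttrName = 6, AfterAttrName = 7, BeforeAttrValue = 8,
    AttrValueQuoted = 9, AttrValueUnquoted = 10, SelfClosingStartTag = 11,
    MarkupDecl = 12, Comment = 13, CommentEndDash = 14, CommentEnd = 15,
    Doctype = 16, CData = 17, RawText = 18,
}

/// Hypothetical, simplified transition: current state + one byte -> next state.
pub fn step(state: State, b: u8) -> State {
    use State::*;
    match (state, b) {
        (Data, b'<') => TagOpen,
        (Data, _) => Data,
        // After `<`: close tag, markup declaration, open tag, or plain text.
        (TagOpen, b'/') => EndTagOpen,
        (TagOpen, b'!') => MarkupDecl,
        (TagOpen, c) if c.is_ascii_alphabetic() => TagName,
        (TagOpen, _) => Data,
        (TagName, b'>') => Data,
        (TagName, b'/') => SelfClosingStartTag,
        (TagName, c) if c.is_ascii_whitespace() => BeforeAttrName,
        (TagName, _) => TagName,
        (EndTagOpen, c) if c.is_ascii_alphabetic() => EndTagName,
        (EndTagOpen, _) => Data,
        (EndTagName, b'>') => Data,
        (EndTagName, _) => EndTagName,
        // Attribute handling.
        (BeforeAttrName, b'>') | (AttrName, b'>') | (AfterAttrName, b'>') => Data,
        (BeforeAttrName, b'/') | (AttrName, b'/') | (AfterAttrName, b'/') => SelfClosingStartTag,
        (BeforeAttrName, c) if c.is_ascii_whitespace() => BeforeAttrName,
        (BeforeAttrName, _) => AttrName,
        (AttrName, b'=') | (AfterAttrName, b'=') => BeforeAttrValue,
        (AttrName, c) if c.is_ascii_whitespace() => AfterAttrName,
        (AttrName, _) => AttrName,
        (AfterAttrName, c) if c.is_ascii_whitespace() => AfterAttrName,
        (AfterAttrName, _) => AttrName,
        (BeforeAttrValue, b'"') => AttrValueQuoted, // double quotes only in this sketch
        (BeforeAttrValue, b'>') => Data,
        (BeforeAttrValue, c) if c.is_ascii_whitespace() => BeforeAttrValue,
        (BeforeAttrValue, _) => AttrValueUnquoted,
        (AttrValueQuoted, b'"') => BeforeAttrName,
        (AttrValueQuoted, _) => AttrValueQuoted,
        (AttrValueUnquoted, b'>') => Data,
        (AttrValueUnquoted, c) if c.is_ascii_whitespace() => BeforeAttrName,
        (AttrValueUnquoted, _) => AttrValueUnquoted,
        (SelfClosingStartTag, b'>') => Data,
        // Anything else after `/` is reprocessed as if before an attribute name.
        (SelfClosingStartTag, c) => step(BeforeAttrName, c),
        // Comment body and its `-->` terminator.
        (Comment, b'-') => CommentEndDash,
        (Comment, _) => Comment,
        (CommentEndDash, b'-') => CommentEnd,
        (CommentEndDash, _) => Comment,
        (CommentEnd, b'>') => Data,
        (CommentEnd, b'-') => CommentEnd,
        (CommentEnd, _) => Comment,
        // MarkupDecl, Doctype, CData, RawText: not modeled here.
        (s, _) => s,
    }
}

/// Hypothetical driver: feed every byte of `input` through `step`.
pub fn run_from(start: State, input: &str) -> State {
    input.bytes().fold(start, step)
}
```

For example, `run_from(State::Data, "<div")` ends in `State::TagName`, and `run_from(State::Comment, " x --")` ends in `State::CommentEnd`. A table indexed by `state as u8` would be an alternative design that leans on the `#[repr(u8)]` layout; the `match` form is used here only because it reads closer to the variant descriptions.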