Skip to main content

Visitor

Trait Visitor 

Source
pub trait Visitor {
    type Error;

Show 25 methods // Provided methods fn start_tag_open( &mut self, name: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn attribute_name( &mut self, name: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn attribute_value( &mut self, value: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn attribute_end(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn attribute_entity_ref( &mut self, name: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn attribute_char_ref( &mut self, value: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn start_tag_close(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn empty_element_end(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn end_tag(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error> { ... } fn characters(&mut self, text: &[u8], span: Span) -> Result<(), Self::Error> { ... } fn entity_ref(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error> { ... } fn char_ref(&mut self, value: &[u8], span: Span) -> Result<(), Self::Error> { ... } fn cdata_start(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn cdata_content( &mut self, text: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn cdata_end(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn comment_start(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn comment_content( &mut self, text: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn comment_end(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn xml_declaration( &mut self, version: &[u8], encoding: Option<&[u8]>, standalone: Option<bool>, span: Span, ) -> Result<(), Self::Error> { ... } fn pi_start(&mut self, target: &[u8], span: Span) -> Result<(), Self::Error> { ... } fn pi_content(&mut self, data: &[u8], span: Span) -> Result<(), Self::Error> { ... } fn pi_end(&mut self, span: Span) -> Result<(), Self::Error> { ... } fn doctype_start( &mut self, name: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn doctype_content( &mut self, content: &[u8], span: Span, ) -> Result<(), Self::Error> { ... } fn doctype_end(&mut self, span: Span) -> Result<(), Self::Error> { ... }
}
Expand description

Trait for receiving fine-grained XML parsing events.

All &[u8] slices are references into the caller’s buffer. Return Ok(()) to continue parsing, or Err(Self::Error) to abort.

Default implementations do nothing and return Ok(()).

§Callback sequences

The parser emits callbacks in these patterns for each XML construct. * means zero or more, + means one or more, | means alternatives.

§Start tag

start_tag_open
  (attribute_name  attribute-value-sequence  attribute_end)*
  start_tag_close | empty_element_end

<img src="a.png" alt="pic"/>:

start_tag_open("img")
attribute_name("src")
attribute_value("a.png")
attribute_end
attribute_name("alt")
attribute_value("pic")
attribute_end
empty_element_end

<p>:

start_tag_open("p")
start_tag_close

§Attribute value sequence

Between the quotes of a single attribute:

(attribute_value | attribute_entity_ref | attribute_char_ref)*

The value is segmented at entity/character reference boundaries and at buffer boundaries. Empty attribute values and values consisting solely of references produce zero attribute_value calls. attribute_end always fires exactly once per attribute, after the closing quote.

class="a&amp;b":

attribute_name("class")
attribute_value("a")
attribute_entity_ref("amp")
attribute_value("b")
attribute_end

v="&amp;" (ref-only - no attribute_value calls):

attribute_name("v")
attribute_entity_ref("amp")
attribute_end

v="" (empty - no attribute_value calls):

attribute_name("v")
attribute_end

§End tag

</div>:

end_tag("div")

§Text content

When present between markup:

(characters | entity_ref | char_ref)+

Not all elements have text content - e.g. <p></p> produces no text events between start_tag_close and end_tag. When text is present, it is segmented at entity/character reference boundaries and at buffer boundaries. There is no trailing characters call after a final reference.

hello &amp; world:

characters("hello ")
entity_ref("amp")
characters(" world")

&lt;&gt; (references only - no characters calls):

entity_ref("lt")
entity_ref("gt")

§CDATA section

cdata_start → cdata_content* → cdata_end

<![CDATA[hello]]>:

cdata_start
cdata_content("hello")
cdata_end

<![CDATA[]]> (empty - no cdata_content call):

cdata_start
cdata_end

§Comment

comment_start → comment_content* → comment_end

<!-- hi -->:

comment_start
comment_content(" hi ")
comment_end

<!----> (empty - no comment_content call):

comment_start
comment_end

§Processing instruction

pi_start → pi_content* → pi_end

Leading whitespace between the target and content is consumed by the parser and not included in pi_content.

<?pi data?>:

pi_start("pi")
pi_content("data")
pi_end

<?x?> (no content - no pi_content call):

pi_start("x")
pi_end

§DOCTYPE declaration

doctype_start → doctype_content* → doctype_end

Content is opaque (not further parsed).

<!DOCTYPE html [<!ENTITY foo "bar">]>:

doctype_start("html")
doctype_content(" [<!ENTITY foo \"bar\">]")
doctype_end

<!DOCTYPE html> (no content - no doctype_content call):

doctype_start("html")
doctype_end

§XML declaration

A single xml_declaration call (never chunked).

<?xml version="1.0" encoding="UTF-8"?>:

xml_declaration(version="1.0", encoding=Some("UTF-8"), standalone=None)

For CDATA, comments, PIs, and DOCTYPE, the content callback may fire more than once when the content spans buffer boundaries.

Required Associated Types§

Provided Methods§

Source

fn start_tag_open(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error>

Start tag opened: <name. name is the element name (may include a namespace prefix and :).

Source

fn attribute_name(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error>

Attribute name within a start tag.

Source

fn attribute_value( &mut self, value: &[u8], span: Span, ) -> Result<(), Self::Error>

Attribute value text (between entity/char ref boundaries or buffer boundaries). The surrounding quotes are not included.

Called zero or more times per attribute, segmented at entity/char reference boundaries and buffer boundaries. Not called for empty segments - an attribute whose value is empty or consists entirely of references produces zero attribute_value calls.

Source

fn attribute_end(&mut self, span: Span) -> Result<(), Self::Error>

End of an attribute value (the closing quote was consumed).

Source

fn attribute_entity_ref( &mut self, name: &[u8], span: Span, ) -> Result<(), Self::Error>

Entity reference in attribute value: &name;. name is the entity name without & and ;.

Source

fn attribute_char_ref( &mut self, value: &[u8], span: Span, ) -> Result<(), Self::Error>

Character reference in attribute value: &#NNN; or &#xHHH;. value is the raw text between &# and ; (e.g. "60" or "x3C").

Source

fn start_tag_close(&mut self, span: Span) -> Result<(), Self::Error>

Start tag closed with >.

Source

fn empty_element_end(&mut self, span: Span) -> Result<(), Self::Error>

Empty element closed with />.

Source

fn end_tag(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error>

End tag: </name>. name is the element name (may include a namespace prefix and :).

Source

fn characters(&mut self, text: &[u8], span: Span) -> Result<(), Self::Error>

Character data between markup.

May be called multiple times for a single run of text content - interleaved with entity_ref and char_ref calls at reference boundaries, and split at buffer boundaries. For example, a&amp;b produces characters("a"), entity_ref("amp"), characters("b").

Each text slice is guaranteed to not split a multi-byte UTF-8 character at its boundaries (except when is_final is true and the document ends mid-sequence). If the input is valid UTF-8, std::str::from_utf8(text) will always succeed.

Source

fn entity_ref(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error>

Entity reference in text content: &name;. name is the entity name without & and ;.

Source

fn char_ref(&mut self, value: &[u8], span: Span) -> Result<(), Self::Error>

Character reference in text content: &#NNN; or &#xHHH;. value is the raw text between &# and ; (e.g. "60" or "x3C").

Source

fn cdata_start(&mut self, span: Span) -> Result<(), Self::Error>

Start of a CDATA section: <![CDATA[.

Source

fn cdata_content(&mut self, text: &[u8], span: Span) -> Result<(), Self::Error>

Content within a CDATA section. Called zero or more times for a single CDATA section - zero for empty sections (<![CDATA[]]>), and possibly more than once when content spans buffer boundaries. Consecutive calls have contiguous spans.

Source

fn cdata_end(&mut self, span: Span) -> Result<(), Self::Error>

End of a CDATA section: ]]>.

Source

fn comment_start(&mut self, span: Span) -> Result<(), Self::Error>

Start of a comment: <!--.

Source

fn comment_content( &mut self, text: &[u8], span: Span, ) -> Result<(), Self::Error>

Content within a comment. Called zero or more times for a single comment - zero for empty comments (<!---->), and possibly more than once when content spans buffer boundaries. Consecutive calls have contiguous spans.

Source

fn comment_end(&mut self, span: Span) -> Result<(), Self::Error>

End of a comment: -->.

Source

fn xml_declaration( &mut self, version: &[u8], encoding: Option<&[u8]>, standalone: Option<bool>, span: Span, ) -> Result<(), Self::Error>

XML declaration: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>.

Fired instead of PI callbacks when <?xml ...?> appears at the document start. Per the XML specification, the XML declaration is NOT a processing instruction - it is a distinct construct.

version is always present (e.g. b"1.0"). encoding and standalone are optional.

Source

fn pi_start(&mut self, target: &[u8], span: Span) -> Result<(), Self::Error>

Start of a processing instruction: <?target. target is the PI target name.

Source

fn pi_content(&mut self, data: &[u8], span: Span) -> Result<(), Self::Error>

Content of a processing instruction (everything between target and ?>). Called zero or more times for a single PI - zero when the PI has no content (<?target?>), and possibly more than once when content spans buffer boundaries. Consecutive calls have contiguous spans.

Source

fn pi_end(&mut self, span: Span) -> Result<(), Self::Error>

End of a processing instruction: ?>.

Source

fn doctype_start(&mut self, name: &[u8], span: Span) -> Result<(), Self::Error>

Start of a DOCTYPE declaration: <!DOCTYPE name. name is the root element name.

Source

fn doctype_content( &mut self, content: &[u8], span: Span, ) -> Result<(), Self::Error>

Content within a DOCTYPE declaration (opaque). Called zero or more times for a single DOCTYPE - zero for simple declarations (<!DOCTYPE html>), and possibly more than once when content spans buffer boundaries. Consecutive calls have contiguous spans.

Source

fn doctype_end(&mut self, span: Span) -> Result<(), Self::Error>

End of a DOCTYPE declaration: >.

Implementors§