Parsing, filtering, selecting and serializing HTML/XML markup.
See the project ../README for a feature overview.
Modules§
- filter
- Mutating visitor support for Document.
- html
- Support for html5 parsing to Document.
- xml
- Support for XML parsing to Document (xml feature).
Macros§
- chain_filters
- Compose a new filter closure by chaining a list of one or more closures or function paths. Each is executed in order while the returned action remains Action::Continue; otherwise the chain terminates early.
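The chaining behaviour described for chain_filters can be illustrated with a small standard-library-only sketch. It does not use the crate's macro or types: the Action enum here is a hypothetical stand-in (the Detach variant name is chosen only for illustration), and run_chain, trim_text, drop_empty, upper and demo are made-up helper names.

```rust
/// Hypothetical stand-in for a filter result; not the crate's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Proceed to the next filter in the chain.
    Continue,
    /// Any non-Continue result ends the chain early (illustrative name).
    Detach,
}

/// Run each filter in order, returning the first result that is not
/// `Action::Continue`, or `Action::Continue` if every filter passes.
pub fn run_chain<T>(filters: &[fn(&mut T) -> Action], item: &mut T) -> Action {
    for f in filters {
        let action = f(&mut *item);
        if action != Action::Continue {
            return action; // early termination: later filters never run
        }
    }
    Action::Continue
}

fn trim_text(s: &mut String) -> Action {
    *s = s.trim().to_string();
    Action::Continue
}

fn drop_empty(s: &mut String) -> Action {
    if s.is_empty() { Action::Detach } else { Action::Continue }
}

fn upper(s: &mut String) -> Action {
    *s = s.to_uppercase();
    Action::Continue
}

/// Chain three function paths over one string and report the outcome.
pub fn demo(input: &str) -> (Action, String) {
    let filters: [fn(&mut String) -> Action; 3] = [trim_text, drop_empty, upper];
    let mut text = input.to_string();
    let action = run_chain::<String>(&filters, &mut text);
    (action, text)
}
```

In this sketch, a whitespace-only input is trimmed to an empty string, drop_empty returns a non-Continue action, and upper is never called.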
Structs§
- Attribute
- A tag attribute, e.g. class="test" in <div class="test" ...>.
- Decoder
- A TendrilSink adaptor that takes bytes, decodes them as the given character encoding, while replacing any ill-formed byte sequences with U+FFFD replacement characters, and emits Unicode (StrTendril).
- Descender
- A depth-first iterator returned by NodeRef::descendants.
- Document
- A DOM-like container for a tree of markup elements and text.
- DocumentType
- Document type definition details.
- Element
- A markup element with name and attributes.
- EncodingHint
- A set of confidence-weighted evidence that a text document is in a particular encoding.
- Node
- A typed node (e.g. text, element, etc.) within a Document, including identifiers to parent, siblings and children.
- NodeId
- A Node identifier as a u32 index into a Document's Node vector.
- NodeRef
- A reference to a Node within a Document, bound to the Document's lifetime.
- ProcessingInstruction
- Processing instruction details.
- QualName
- A fully qualified name (with a namespace), used to depict names of tags and attributes.
- Selector
- A selecting iterator returned by NodeRef::select.
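The NodeId and Descender entries describe an index-based tree: nodes live in a vector, identifiers are u32 indexes into it, and descendants are visited depth-first. The following is a minimal standard-library-only sketch of that general layout, not the crate's implementation. All type, field and method names here are hypothetical stand-ins that happen to echo the names above.

```rust
/// Hypothetical node identifier: a u32 index into the node vector.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(u32);

/// Hypothetical node: a label plus links to related nodes by identifier.
pub struct Node {
    pub label: String,
    first_child: Option<NodeId>,
    last_child: Option<NodeId>,
    next_sibling: Option<NodeId>,
}

/// Hypothetical document: owns every node in one vector.
pub struct Document {
    nodes: Vec<Node>,
}

impl Document {
    /// Create a document whose root node sits at index 0.
    pub fn new(root_label: &str) -> Self {
        let root = Node {
            label: root_label.to_string(),
            first_child: None,
            last_child: None,
            next_sibling: None,
        };
        Document { nodes: vec![root] }
    }

    pub fn root(&self) -> NodeId {
        NodeId(0)
    }

    /// Push a new node and link it as the last child of `parent`.
    pub fn append_child(&mut self, parent: NodeId, label: &str) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(Node {
            label: label.to_string(),
            first_child: None,
            last_child: None,
            next_sibling: None,
        });
        let previous_last = self.nodes[parent.0 as usize].last_child;
        if let Some(prev) = previous_last {
            self.nodes[prev.0 as usize].next_sibling = Some(id);
        } else {
            self.nodes[parent.0 as usize].first_child = Some(id);
        }
        self.nodes[parent.0 as usize].last_child = Some(id);
        id
    }

    /// Collect the direct children of `id` in document order.
    fn children(&self, id: NodeId) -> Vec<NodeId> {
        let mut out = Vec::new();
        let mut next = self.nodes[id.0 as usize].first_child;
        while let Some(child) = next {
            out.push(child);
            next = self.nodes[child.0 as usize].next_sibling;
        }
        out
    }

    /// Labels of all descendants of `id` (excluding `id`), depth-first.
    pub fn descendants(&self, id: NodeId) -> Vec<&str> {
        let mut out = Vec::new();
        // Children are pushed in reverse so the first child is popped first.
        let mut stack: Vec<NodeId> = self.children(id).into_iter().rev().collect();
        while let Some(n) = stack.pop() {
            out.push(self.nodes[n.0 as usize].label.as_str());
            stack.extend(self.children(n).into_iter().rev());
        }
        out
    }
}
```

Storing links as small integer identifiers rather than references keeps the whole tree in one allocation and sidesteps borrow-checker issues with parent and sibling pointers. The cost is that an identifier is only meaningful together with the document it came from.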
Enums§
- NodeData
- The node kind and payload data associated with that kind.
Constants§
- BOM_CONF
- Recommended confidence for hints based on a leading Byte-Order-Mark (BOM) at the start of a document stream.
- DEFAULT_CONF
- Recommended confidence for an initial default encoding.
- HTML_META_CONF
- Recommended confidence for the sum of all hints from within an HTML head, in meta elements.
- HTTP_CTYPE_CONF
- Recommended confidence for a hint from an HTTP Content-Type header with charset.
- INITIAL_BUFFER_SIZE
- Initial parse buffer size in which encoding hints are considered, possibly triggering reparse.
- READ_BUFFER_SIZE
- Subsequent parse buffer size used for reading and parsing, after INITIAL_BUFFER_SIZE.
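The confidence constants above suggest a scheme in which each source of evidence contributes a weight to an encoding, and the encoding with the highest total wins. The sketch below shows that idea with only the standard library. The Hints type, its methods, and the numeric weights used in the usage note are all hypothetical and are not taken from the crate.

```rust
use std::collections::HashMap;

/// Hypothetical hint store: summed confidence per encoding label.
#[derive(Default)]
pub struct Hints {
    scores: HashMap<String, f32>,
}

impl Hints {
    /// Add `confidence` to the running total for `encoding`.
    /// Labels are lowercased so "UTF-8" and "utf-8" share one total.
    pub fn add(&mut self, encoding: &str, confidence: f32) {
        *self
            .scores
            .entry(encoding.to_ascii_lowercase())
            .or_insert(0.0) += confidence;
    }

    /// The encoding with the highest summed confidence, if any.
    pub fn top(&self) -> Option<(&str, f32)> {
        self.scores
            .iter()
            .map(|(label, score)| (label.as_str(), *score))
            .max_by(|a, b| a.1.total_cmp(&b.1))
    }
}
```

For example, with illustrative weights of 0.01 for a default and 0.1 for an HTTP header, a default of windows-1252 is overridden by a header naming UTF-8. A later hint of 0.2 for windows-1252 would then bring its total above UTF-8 again.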
Type Aliases§
- LocalName
- Namespace
- SharedEncodingHint
- An EncodingHint that can be shared between Decoder and Sink, by reference on the same thread, and internally mutated. The type is neither Send nor Sync.
- StrTendril
- Tendril for storing native Rust strings.
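The properties listed for SharedEncodingHint (shared by reference on one thread, internally mutable, neither Send nor Sync) are the ones Rust's Rc<RefCell<T>> provides. The sketch below shows that pattern with the standard library only. Whether the crate's alias is defined this way is an assumption, and the Hint, SharedHint, Decoder and Sink types here are simplified hypothetical stand-ins.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical, simplified hint state.
#[derive(Debug, Default)]
pub struct Hint {
    pub top: Option<String>,
    pub changed: bool,
}

/// Single-threaded shared handle with interior mutability.
/// `Rc` is neither `Send` nor `Sync`, so this alias is neither as well.
pub type SharedHint = Rc<RefCell<Hint>>;

/// Stand-in for the component that decodes bytes.
pub struct Decoder {
    hint: SharedHint,
}

/// Stand-in for the component that receives parsed output.
pub struct Sink {
    hint: SharedHint,
}

impl Sink {
    /// Record an encoding discovered while parsing.
    pub fn saw_charset(&self, label: &str) {
        let mut hint = self.hint.borrow_mut();
        hint.top = Some(label.to_string());
        hint.changed = true;
    }
}

impl Decoder {
    /// Check whether the other holder has updated the hint.
    pub fn needs_reparse(&self) -> bool {
        self.hint.borrow().changed
    }
}

/// Both holders see the same state through their own `Rc` clone.
pub fn demo() -> (bool, bool, Option<String>) {
    let shared: SharedHint = Rc::new(RefCell::new(Hint::default()));
    let decoder = Decoder { hint: Rc::clone(&shared) };
    let sink = Sink { hint: Rc::clone(&shared) };

    let before = decoder.needs_reparse();
    sink.saw_charset("utf-8");
    let after = decoder.needs_reparse();
    let top = shared.borrow().top.clone();
    (before, after, top)
}
```

RefCell moves borrow checking to run time, so overlapping borrow and borrow_mut calls would panic. Keeping each borrow inside a short method body, as here, avoids that.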