This crate provides a push-based XML parser library that adheres to the XML5 specification. In other words, this library trades well-formedness for error recovery.
The idea behind this was to minimize the number of errors from tools that generate XML (e.g. a character reference that encodes S won't just be returned as text, but will be parsed into S).
You can check out the full specification here.
What this library provides is a solid XML parser that can:
- Parse somewhat erroneous XML input
- Provide support for numeric character references.
- Provide partial XML namespace support.
- Provide the full set of SVG/MathML entities.
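To make the idea of forgiving numeric character references concrete, here is a minimal sketch in plain Rust. It is not the crate's implementation: `decode_numeric_reference` is a hypothetical helper, and treating a missing terminating `;` as recoverable is an assumption of this sketch rather than a statement of the exact XML5 rules.

```rust
/// Decodes one numeric character reference at the start of `input`.
/// Returns the decoded character and the number of bytes consumed.
/// Hypothetical helper for illustration; not part of the crate's API.
fn decode_numeric_reference(input: &str) -> Option<(char, usize)> {
    let rest = input.strip_prefix("&#")?;
    // A leading `x` or `X` selects hexadecimal, otherwise decimal.
    let (radix, digits_start) = match rest.chars().next()? {
        'x' | 'X' => (16, 1),
        _ => (10, 0),
    };
    let body = &rest[digits_start..];
    // Take the longest run of digits valid in the chosen radix.
    let digits_len = body
        .find(|c: char| !c.is_digit(radix))
        .unwrap_or(body.len());
    if digits_len == 0 {
        return None;
    }
    let code = u32::from_str_radix(&body[..digits_len], radix).ok()?;
    let ch = char::from_u32(code)?;
    // Lenient step (an assumption of this sketch): the terminating `;` is
    // consumed when present, but its absence is not treated as an error.
    let mut consumed = 2 + digits_start + digits_len;
    if body[digits_len..].starts_with(';') {
        consumed += 1;
    }
    Some((ch, consumed))
}
```

With this helper, both a well-formed reference and one that lacks its semicolon decode to the same character; only the number of consumed bytes differs.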
What isn’t in scope for this library:
- Document Type Definition parsing - this is pretty hard to do right and nowadays, it's used
Modules
- buffer_queue - The BufferQueue struct and helper types.
- data
- driver - Driver
- interface - Types for tag and attribute names, and tree-builder functionality.
- serialize - Serializer for XML5.
- smallcharset - This module contains a single struct SmallCharSet. See its documentation for details.
- tendril
- tokenizer - XML5 tokenizer - converts input into tokens
- tree_builder - XML5 tree builder - converts tokens into a tree like structure
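The split between a tokenizer that produces tokens and a tree builder that consumes them can be sketched as a push-based pipeline. The code below is a toy under stated assumptions: `Token`, `TokenSink`, `Collector` and `PushTokenizer` are hypothetical names invented for this example, and the crate's real tokenizer and tree-builder interfaces are richer and may differ.

```rust
/// Toy token type; far simpler than a real XML tokenizer's output.
#[derive(Debug, PartialEq)]
enum Token {
    StartTag(String),
    EndTag(String),
    Text(String),
}

/// Receives tokens as they are produced (hypothetical trait for this sketch).
trait TokenSink {
    fn process_token(&mut self, token: Token);
}

/// Collects tokens into a Vec, standing in for a tree builder.
#[derive(Default)]
struct Collector {
    tokens: Vec<Token>,
}

impl TokenSink for Collector {
    fn process_token(&mut self, token: Token) {
        self.tokens.push(token);
    }
}

/// Push-based tokenizer: the caller feeds chunks, and the tokenizer pushes
/// each token into the sink as soon as it is complete.
struct PushTokenizer<S: TokenSink> {
    sink: S,
    pending: String, // input received but not yet turned into a token
}

impl<S: TokenSink> PushTokenizer<S> {
    fn new(sink: S) -> Self {
        PushTokenizer { sink, pending: String::new() }
    }

    fn feed(&mut self, chunk: &str) {
        self.pending.push_str(chunk);
        loop {
            if self.pending.starts_with('<') {
                // A tag is complete only once its closing `>` has arrived.
                let end = match self.pending.find('>') {
                    Some(i) => i,
                    None => break,
                };
                let tag: String = self.pending.drain(..=end).collect();
                let inner = &tag[1..tag.len() - 1];
                let token = match inner.strip_prefix('/') {
                    Some(name) => Token::EndTag(name.to_string()),
                    None => Token::StartTag(inner.to_string()),
                };
                self.sink.process_token(token);
            } else {
                // Text runs until the next `<`; wait if none has arrived yet.
                let end = match self.pending.find('<') {
                    Some(i) => i,
                    None => break,
                };
                let text: String = self.pending.drain(..end).collect();
                self.sink.process_token(Token::Text(text));
            }
        }
    }

    /// Signals end of input, flushes trailing text, and returns the sink.
    /// An unterminated tag left in the buffer is dropped in this sketch.
    fn end(mut self) -> S {
        if !self.pending.is_empty() && !self.pending.starts_with('<') {
            let text = std::mem::take(&mut self.pending);
            self.sink.process_token(Token::Text(text));
        }
        self.sink
    }
}
```

The point of the push model is that input may arrive in arbitrary chunks: feeding `<a>he`, then `llo</`, then `a>` yields the same three tokens as feeding the whole string at once, because incomplete input simply stays in the buffer until more arrives.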
Macros
- expanded_name - Helper to quickly create an expanded name.
- local_name - Takes a local name as a string and returns its key in the string cache.
- namespace_prefix - Takes a namespace prefix string and returns its key in a string cache.
- namespace_url - Takes a namespace url string and returns its key in a string cache.
- ns - Maps the input of namespace_prefix! to the output of namespace_url!.
- small_char_set - Create a SmallCharSet, with each space-separated number stored in the set.
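The prefix-to-URL mapping that ns is described as performing can be pictured with an ordinary function. This is only a sketch: `namespace_url_for` is a hypothetical name, it returns plain string slices instead of string-cache keys, and the set of prefixes shown is chosen for illustration and is not claimed to match what the macro accepts.

```rust
/// Returns the namespace URL for a few well-known prefixes.
/// Hypothetical function for illustration; it uses plain string slices,
/// whereas the macros above are documented as returning string-cache keys.
fn namespace_url_for(prefix: &str) -> Option<&'static str> {
    match prefix {
        "html" => Some("http://www.w3.org/1999/xhtml"),
        "xml" => Some("http://www.w3.org/XML/1998/namespace"),
        "xmlns" => Some("http://www.w3.org/2000/xmlns/"),
        "xlink" => Some("http://www.w3.org/1999/xlink"),
        "svg" => Some("http://www.w3.org/2000/svg"),
        "mathml" => Some("http://www.w3.org/1998/Math/MathML"),
        // Unknown prefixes have no fixed URL in this sketch.
        _ => None,
    }
}
```

A lookup such as `namespace_url_for("svg")` returns the SVG namespace URL, while an unrecognised prefix returns `None`.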
Structs
- Attribute - A tag attribute, e.g. class="test" in <div class="test" ...>.
- ExpandedName - An expanded name, containing the tag and the namespace.
- LocalNameStaticSet
- NamespaceStaticSet
- PrefixStaticSet
- QualName - A fully qualified name (with a namespace), used to depict names of tags and attributes.
- SmallCharSet - Represents a set of “small characters”, those with Unicode scalar values less than 64.
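Because every member of such a set has a scalar value below 64, the whole set fits in the bits of a single `u64`. The sketch below shows that representation with a hypothetical `TinyCharSet` type; its method names and signatures are assumptions made for this example and are not taken from SmallCharSet's documentation.

```rust
/// A set of characters whose Unicode scalar values are below 64,
/// stored as one bit per value in a `u64`. Hypothetical type for this sketch.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct TinyCharSet {
    bits: u64,
}

impl TinyCharSet {
    /// Builds a set from characters; returns `None` if any is out of range.
    fn from_chars(chars: &[char]) -> Option<Self> {
        let mut bits = 0u64;
        for &c in chars {
            let value = c as u32;
            if value >= 64 {
                return None;
            }
            bits |= 1u64 << value;
        }
        Some(TinyCharSet { bits })
    }

    /// Membership test: a range check plus a single bit test.
    fn contains(&self, c: char) -> bool {
        let value = c as u32;
        value < 64 && (self.bits & (1u64 << value)) != 0
    }

    /// Byte length of the longest prefix of `text` containing no set member.
    fn nonmember_prefix_len(&self, text: &str) -> usize {
        text.find(|c: char| self.contains(c)).unwrap_or(text.len())
    }
}
```

One plausible use is letting a tokenizer skip quickly over runs of ordinary text: with a set holding `<`, `&` and carriage return, `nonmember_prefix_len` reports how many bytes can be copied through before a character that needs special handling.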