Crate html5ever_ext

html5ever_ext

This is a set of unofficial extensions to the html5ever crate's RcDom and Node structs, including a minifying HTML5 serializer and support for CSS matching.

It re-exports the css and html5ever crates, and useful DOM types hidden inside the ::html5ever::markup5ever::rcdom module.

How Tos

To load and minify HTML5

extern crate html5ever_ext;
use ::html5ever_ext::RcDom;
use ::html5ever_ext::RcDomExt;
use ::html5ever_ext::Minify;

let rc_dom = RcDom::from_file_path_verified_and_stripped_of_comments_and_processing_instructions_and_with_a_sane_doc_type("/path/to/document.html").expect("invalid HTML");
rc_dom.minify_to_file_path();

There are additional methods available on Minify to minify to a byte array or a generic Write-implementing writer.

For more control, e.g. over serializing multiple node graphs, use the struct UltraMinifyingHtmlSerializer directly.
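The relationship between "minify to a byte array" and "minify to a generic writer" can be sketched with the standard library alone. The following is a minimal illustration, not the crate's implementation: the trait `SketchSerialize`, its methods `write_to` and `to_bytes`, the type `Fragment` and the function `write_all_fragments` are all hypothetical names invented for this example, and the real `Minify` signatures may differ.

```rust
use std::io::{self, Write};

// Hypothetical trait for illustration only; it is NOT the crate's `Minify` trait.
trait SketchSerialize {
    // Serializes `self` into any writer that implements `std::io::Write`.
    fn write_to<W: Write>(&self, writer: &mut W) -> io::Result<()>;

    // The byte-array form falls out of the generic form, because `Vec<u8>` implements `Write`.
    fn to_bytes(&self) -> io::Result<Vec<u8>> {
        let mut bytes = Vec::new();
        self.write_to(&mut bytes)?;
        Ok(bytes)
    }
}

// Hypothetical stand-in for an already-minified piece of markup.
struct Fragment(&'static str);

impl SketchSerialize for Fragment {
    fn write_to<W: Write>(&self, writer: &mut W) -> io::Result<()> {
        writer.write_all(self.0.as_bytes())
    }
}

// Writes several fragments to one writer and flushes exactly once at the end.
fn write_all_fragments<W: Write>(fragments: &[Fragment], writer: &mut W) -> io::Result<()> {
    for fragment in fragments {
        fragment.write_to(writer)?;
    }
    writer.flush()
}

fn main() -> io::Result<()> {
    let fragments = [Fragment("<p>a</p>"), Fragment("<p>b</p>")];
    let mut output = Vec::new();
    write_all_fragments(&fragments, &mut output)?;
    io::stdout().write_all(&output)?;
    Ok(())
}
```

Writing several graphs to one writer and choosing when to flush is the kind of control that, per the text above, calls for using the serializer struct directly rather than the convenience trait.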

To match CSS selectors

extern crate html5ever_ext;
use ::html5ever_ext::RcDom;
use ::html5ever_ext::RcDomExt;
use ::html5ever_ext::parse_css_selector;
use ::html5ever_ext::Selectable;
use ::html5ever_ext::Minify;

let rc_dom = RcDom::from_file_path_verified_and_stripped_of_comments_and_processing_instructions_and_with_a_sane_doc_type("/path/to/document.html").expect("invalid HTML");

let selector = parse_css_selector("p.myclass").unwrap();

assert!(!rc_dom.matches(&selector));

rc_dom.find_all_matching_child_nodes_depth_first_including_this_one(&selector, |node|
{
    //Minify is implemented on node.children as well as node and rc_dom.
    eprintln!("{}", node.children.debug_string());

    const SHORTCUT: bool = false;
    SHORTCUT
});
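The closure in the example returns a boolean named SHORTCUT. A plausible reading, which is an assumption here and not confirmed by this page, is that returning true stops the traversal early. The following standard-library-only sketch shows that pattern; `SketchNode`, `visit_depth_first`, `sample_tree` and `collect_matches` are hypothetical names, and a plain predicate stands in for a parsed CSS selector.

```rust
// Hypothetical stand-in for a DOM node; not a type from html5ever_ext.
struct SketchNode {
    id: &'static str,
    name: &'static str,
    class: Option<&'static str>,
    children: Vec<SketchNode>,
}

impl SketchNode {
    // Visits this node first, then its descendants depth first, calling `visitor`
    // on each node accepted by `is_match`.
    // Assumption: a `true` return from `visitor` means "shortcut", i.e. stop everything.
    fn visit_depth_first<M, V>(&self, is_match: &M, visitor: &mut V) -> bool
    where
        M: Fn(&SketchNode) -> bool,
        V: FnMut(&SketchNode) -> bool,
    {
        if is_match(self) && visitor(self) {
            return true;
        }
        for child in &self.children {
            if child.visit_depth_first(is_match, visitor) {
                return true;
            }
        }
        false
    }
}

// Builds: body > [ p.myclass#first, div > [ p.myclass#second ], p#third ]
fn sample_tree() -> SketchNode {
    let leaf = |id, name, class| SketchNode { id, name, class, children: Vec::new() };
    SketchNode {
        id: "root",
        name: "body",
        class: None,
        children: vec![
            leaf("first", "p", Some("myclass")),
            SketchNode {
                id: "wrapper",
                name: "div",
                class: None,
                children: vec![leaf("second", "p", Some("myclass"))],
            },
            leaf("third", "p", None),
        ],
    }
}

// Collects the ids of nodes matching the equivalent of `p.myclass`.
fn collect_matches(root: &SketchNode, stop_after_first: bool) -> Vec<&'static str> {
    let mut found = Vec::new();
    let is_match = |node: &SketchNode| node.name == "p" && node.class == Some("myclass");
    root.visit_depth_first(&is_match, &mut |node: &SketchNode| {
        found.push(node.id);
        stop_after_first
    });
    found
}

fn main() {
    let tree = sample_tree();
    println!("{:?}", collect_matches(&tree, false));
    println!("{:?}", collect_matches(&tree, true));
}
```

With the flag fixed at false, as in the example above, every matching node is visited; a closure that returns true would see only the first match under this reading.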

To work with Nodes

Use the NodeExt, Minify, Selectable and QualNameExt traits.

Re-exports

pub extern crate css;
pub extern crate either;
pub extern crate html5ever;
pub use html5ever::interface::AppendNode;
pub use html5ever::interface::AppendText;

Structs

Attribute

A tag attribute.

Node

A DOM node.

Parser

An HTML parser, ready to receive Unicode input through the tendril::TendrilSink trait’s methods.

QualName

A fully qualified name: a name with a namespace. Used for the names of tags and attributes.

RcDom

The DOM itself; the result of parsing.

UltraMinifyingHtmlSerializer

A serializer that, unlike that in the html5ever crate (which is private and used via ::html5ever::serialize::serialize()), tries hard to minify HTML and make it compression-friendly. Use this struct directly if you need to serialize multiple nodes or doms to one writer, or control when flushing of the output writer should occur. Otherwise, use the trait Minify.

UnattachedNode

Represents the structure of nodes unattached to a DOM. Designed to make it easy to create an entire graph of nodes before adding it.

Enums

AriaRole

Valid values of the ARIA role global attribute. See Aria Roles 101 for more. Navigation roles are probably the most useful.

Dir

Valid values of the dir global attribute.

Draggable

Valid values of the draggable global attribute.

HtmlError

Represents errors that can occur while loading or minifying HTML.

NodeData

The different kinds of nodes in the DOM.

Traits

AttributeExt

Additional methods for working with attributes

LocalNameExt

Additional helpers to make LocalName more pleasant to work with

Minify

Minifies and serializes an html5ever HTML DOM (RcDom) or node (Rc<Node>, aka Handle).

NodeExt

This trait adds additional methods to an HTML DOM node.

QualNameExt

Additional methods to work with QualName

QualNameOnlyExt

Additional methods solely for QualName

RcDomExt

This trait adds additional methods to an HTML DOM.

Selectable

This trait adds methods for finding DOM nodes matching a CSS selector

TreeSink

UnattachedNodeExt

Helper trait to make it easier to turn UnattachedNodes into DOMs and HTML fragments

Functions

parse_css_selector

Parses a CSS selector

Type Definitions

LocalName

StrTendril

Tendril for storing native Rust strings.