Crate html5ever_ext

html5ever_ext

This is a set of unofficial extensions to the html5ever crate's RcDom and Node structs, including a minifying HTML5 serializer and support for CSS matching.

It re-exports the css and html5ever crates, as well as the useful DOM types that are otherwise hidden inside the ::html5ever::markup5ever::rcdom module.

How Tos

To load and minify HTML5

extern crate html5ever_ext;
use ::html5ever_ext::RcDom;
use ::html5ever_ext::RcDomExt;
use ::html5ever_ext::Minify;

let rc_dom = RcDom::from_file_path_verified_and_stripped_of_comments_and_processing_instructions_and_with_a_sane_doc_type("/path/to/document.html").expect("invalid HTML");
rc_dom.minify_to_file_path();

There are additional methods available on Minify to minify to a byte array or a generic Write-implementing writer.

For more control, e.g. over serializing multiple node graphs, use the struct UltraMinifyingHtmlSerializer directly.
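To illustrate why a generic Write-implementing writer is convenient, the sketch below sends the same output either to an in-memory byte array or to any other writer (a file, a socket). It uses only the standard library. The functions write_collapsed and collapse_to_bytes are hypothetical stand-ins invented for this example, and the whitespace-collapsing logic is a placeholder; neither reflects the actual signatures or behaviour of the Minify trait.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a minifying serializer: collapses each run of
/// ASCII whitespace to a single space and writes the result to any `Write`.
/// This is NOT how html5ever_ext minifies; it only shows the writer pattern.
fn write_collapsed<W: Write>(writer: &mut W, html: &str) -> io::Result<()> {
    let mut previous_was_space = false;
    for ch in html.chars() {
        if ch.is_ascii_whitespace() {
            if !previous_was_space {
                writer.write_all(b" ")?;
            }
            previous_was_space = true;
        } else {
            // Encode the char as UTF-8 without allocating a String.
            let mut buffer = [0u8; 4];
            writer.write_all(ch.encode_utf8(&mut buffer).as_bytes())?;
            previous_was_space = false;
        }
    }
    writer.flush()
}

/// Hypothetical convenience wrapper: the "byte array" variant is simply the
/// generic writer variant applied to a `Vec<u8>`.
fn collapse_to_bytes(html: &str) -> io::Result<Vec<u8>> {
    let mut bytes = Vec::new();
    write_collapsed(&mut bytes, html)?;
    Ok(bytes)
}
```

Because Vec<u8> and std::fs::File both implement Write, only the generic function needs to contain serialization logic; the other variants can be thin wrappers around it.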

To match CSS selectors

extern crate html5ever_ext;
use ::html5ever_ext::RcDom;
use ::html5ever_ext::RcDomExt;
use ::html5ever_ext::parse_css_selector;
use ::html5ever_ext::Selectable;
use ::html5ever_ext::NodeExt;

let rc_dom = RcDom::from_file_path_verified_and_stripped_of_comments_and_processing_instructions_and_with_a_sane_doc_type("/path/to/document.html").expect("invalid HTML");

let selector = parse_css_selector("p.myclass").unwrap();

assert!(!rc_dom.matches(&selector));

rc_dom.find_all_matching_child_nodes_depth_first_including_this_one(&selector, |node|
{
    // Done this way because Rc<Node> does not implement Debug, but NodeExt implements debug_fmt(), whose output is as close to Debug's as possible.
    let mut debug = String::new();
    node.debug_fmt(&mut debug).unwrap();
    println!("Found node {}", debug);
});
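The traversal order implied by the method name above (depth first, including the starting node) can be sketched with the standard library alone. Everything in this block is an assumption made for illustration: SimpleNode is a hypothetical, simplified node type rather than html5ever's Node, matches_tag_and_class handles only selectors of the form tag.class, and the callback signature is not taken from the Selectable trait.

```rust
use std::rc::Rc;

/// Hypothetical, simplified element node; the real crate works on
/// html5ever's Rc<Node>.
struct SimpleNode {
    tag: &'static str,
    classes: Vec<&'static str>,
    children: Vec<Rc<SimpleNode>>,
}

/// Hypothetical matcher covering only selectors shaped like `tag.class`.
fn matches_tag_and_class(node: &SimpleNode, tag: &str, class: &str) -> bool {
    node.tag == tag && node.classes.iter().any(|c| *c == class)
}

/// Tests `node` itself first, then recurses into each child subtree in
/// document order, calling `visitor` for every node `predicate` accepts.
fn find_all_depth_first_including_this_one<P, V>(
    node: &Rc<SimpleNode>,
    predicate: &P,
    visitor: &mut V,
) where
    P: Fn(&SimpleNode) -> bool,
    V: FnMut(&Rc<SimpleNode>),
{
    if predicate(&**node) {
        visitor(node);
    }
    for child in &node.children {
        find_all_depth_first_including_this_one(child, predicate, visitor);
    }
}
```

Passing the visitor as a closure, as the crate's method also does, lets the caller collect, print or count matches without the traversal allocating a result list.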

To work with Nodes

Use the NodeExt, Minify, Selectable and QualNameExt traits.

Reexports

pub extern crate css;
pub extern crate html5ever;

Structs

Attribute

A tag attribute.

Node

A DOM node.

QualName

A name with a namespace. Fully qualified name. Used to depict names of tags and attributes.

RcDom

The DOM itself; the result of parsing.

UltraMinifyingHtmlSerializer

A serializer that, unlike the one in the html5ever crate (which is private and used via ::html5ever::serialize::serialize()), tries hard to minify HTML and make it compression-friendly. Use this struct directly if you need to serialize multiple nodes or DOMs to one writer, or to control when the output writer is flushed. Otherwise, use the trait Minify. To minimise attribute length, this serializer writes each attribute in whichever form is shortest: value-omitted, unquoted, single-quoted or double-quoted. This serializer does not know about namespaces; they are simply ignored (although prefixes are written). It converts element names, attribute names and DTD names to ASCII lower-case.
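As a rough sketch of what choosing between value-omitted, quote-less and quoted attribute forms can look like, the standalone functions below pick a short serialization for a single attribute. This is an assumption-laden illustration, not the crate's implementation: the function names are hypothetical, ampersands are always escaped for simplicity, and the unquoted-value check follows the HTML syntax rule that such values must be non-empty and contain no ASCII whitespace, quotes, =, <, > or backtick.

```rust
/// Returns true if `value` may be written without quotes under the HTML
/// syntax rules for unquoted attribute values.
fn can_be_unquoted(value: &str) -> bool {
    !value.is_empty()
        && !value
            .chars()
            .any(|c| c.is_ascii_whitespace() || matches!(c, '"' | '\'' | '=' | '<' | '>' | '`'))
}

/// Hypothetical helper: serializes one attribute in a short form.
/// The name is lower-cased, mirroring the ASCII lower-casing described above.
fn shortest_attribute(name: &str, value: &str) -> String {
    let name = name.to_ascii_lowercase();

    // Value-omitted form, e.g. `disabled`.
    if value.is_empty() {
        return name;
    }

    // Simplification: always escape ampersands.
    let value = value.replace('&', "&amp;");

    // Quote-less form, e.g. `class=myclass`.
    if can_be_unquoted(&value) {
        return format!("{}={}", name, value);
    }

    // Quoted form: use whichever quote character needs fewer escapes.
    let double_quotes = value.matches('"').count();
    let single_quotes = value.matches('\'').count();
    if double_quotes <= single_quotes {
        format!("{}=\"{}\"", name, value.replace('"', "&quot;"))
    } else {
        format!("{}='{}'", name, value.replace('\'', "&#39;"))
    }
}
```

Under these assumptions, a value containing double quotes but no single quotes is wrapped in single quotes, which avoids spending six bytes on each &quot; escape.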

Enums

HtmlError

Represents errors that can occur while loading or minifying HTML.

NodeData

The different kinds of nodes in the DOM.

Traits

Minify

Minifies and serializes an html5ever HTML DOM (RcDom) or node (Rc<Node>, aka Handle).

NodeExt

This trait adds additional methods to an HTML DOM node.

QualNameExt

Additional methods for working with QualName.

RcDomExt

This trait adds additional methods to an HTML DOM.

Selectable

This trait adds methods for finding DOM nodes that match a CSS selector.

Functions

parse_css_selector

Parses a CSS selector.

Type Definitions

LocalName