Crate html5ever_ext

html5ever_ext

This is a set of unofficial extensions to the html5ever crate's RcDom and Node structs, including a minifying HTML5 serializer and support for CSS matching.

It re-exports the css and html5ever crates, and useful DOM types hidden inside the ::html5ever::markup5ever::rcdom module.

How Tos

To load and minify HTML5

extern crate html5ever_ext;
use ::html5ever_ext::RcDom;
use ::html5ever_ext::RcDomExt;
use ::html5ever_ext::Minify;

let rc_dom = RcDom::from_file_path_verified_and_stripped_of_comments_and_processing_instructions_and_with_a_sane_doc_type("/path/to/document.html").expect("invalid HTML");
rc_dom.minify_to_file_path();

There are additional methods available on Minify to minify to a byte array or a generic Write-implementing writer.
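Accepting any Write-implementing writer means one serialization routine can target an in-memory byte array, a file, or a socket. The sketch below illustrates that pattern with the standard library only. The names `write_markup` and `markup_to_bytes` are hypothetical stand-ins and are not part of html5ever_ext; the actual Minify method names and signatures are not shown on this page.

```rust
use std::io::{self, Write};

// Hypothetical stand-in for a serializer: writes each fragment to any `Write` sink.
// This is NOT the html5ever_ext API; it only illustrates the generic-writer pattern.
fn write_markup<W: Write>(writer: &mut W, fragments: &[&str]) -> io::Result<()> {
    for fragment in fragments {
        writer.write_all(fragment.as_bytes())?;
    }
    // Flush once at the end so that buffered sinks emit everything.
    writer.flush()
}

// Convenience wrapper that targets an in-memory byte array.
fn markup_to_bytes(fragments: &[&str]) -> io::Result<Vec<u8>> {
    let mut buffer = Vec::new();
    write_markup(&mut buffer, fragments)?;
    Ok(buffer)
}
```

Because the byte-array variant is only a thin wrapper over the generic one, both paths share the same serialization logic.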

For more control, e.g. over serializing multiple node graphs, use the UltraMinifyingHtmlSerializer struct directly.

To match CSS selectors

extern crate html5ever_ext;
use ::html5ever_ext::RcDom;
use ::html5ever_ext::RcDomExt;
use ::html5ever_ext::parse_css_selector;
use ::html5ever_ext::Selectable;
use ::html5ever_ext::Minify;

let rc_dom = RcDom::from_file_path_verified_and_stripped_of_comments_and_processing_instructions_and_with_a_sane_doc_type("/path/to/document.html").expect("invalid HTML");

let selector = parse_css_selector("p.myclass").unwrap();

assert!(!rc_dom.matches(&selector));

rc_dom.find_all_matching_child_nodes_depth_first_including_this_one(&selector, |node|
{
    //Minify is implemented on node.children as well as node and rc_dom.
    eprintln!("{}", node.children.debug_string());

    const SHORTCUT: bool = false;
    SHORTCUT
})
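The callback above returns a bool named SHORTCUT, which suggests that returning true ends the traversal early. That reading is an assumption, not something this page confirms. A minimal sketch of a depth-first visit with that kind of early exit, using a toy node type (`ToyNode`, `visit_depth_first`, `sample_tree` and `visited_names` are all hypothetical and unrelated to the crate's real types), might look like this:

```rust
use std::rc::Rc;

// Toy node type for illustration only; html5ever's real Node carries much more.
struct ToyNode {
    name: &'static str,
    children: Vec<Rc<ToyNode>>,
}

// Visits `node` first, then its children depth-first.
// Returns true as soon as the callback asks to stop (assumed shortcut semantics).
fn visit_depth_first<F>(node: &Rc<ToyNode>, callback: &mut F) -> bool
where
    F: FnMut(&Rc<ToyNode>) -> bool,
{
    if callback(node) {
        return true;
    }
    for child in &node.children {
        if visit_depth_first(child, callback) {
            return true;
        }
    }
    false
}

// Builds a node from a name and its children.
fn node(name: &'static str, children: Vec<Rc<ToyNode>>) -> Rc<ToyNode> {
    Rc::new(ToyNode { name, children })
}

// Builds the tree: html -> [head, body -> [p, div]].
fn sample_tree() -> Rc<ToyNode> {
    let body = node("body", vec![node("p", vec![]), node("div", vec![])]);
    node("html", vec![node("head", vec![]), body])
}

// Collects the names visited, stopping once `stop_at` has been seen (if given).
fn visited_names(root: &Rc<ToyNode>, stop_at: Option<&str>) -> Vec<&'static str> {
    let mut names = Vec::new();
    visit_depth_first(root, &mut |current: &Rc<ToyNode>| {
        names.push(current.name);
        stop_at == Some(current.name)
    });
    names
}
```

With a callback that always returns false, as in the example above, every node is visited; returning true lets the caller stop at the first node of interest.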

To work with Nodes

Use the NodeExt, Minify, Selectable and QualNameExt traits.

Reexports

pub extern crate css;
pub extern crate html5ever;
pub use html5ever::interface::AppendNode;
pub use html5ever::interface::AppendText;

Structs

Attribute

A tag attribute.

Node

A DOM node.

QualName

A name with a namespace. Fully qualified name. Used to depict names of tags and attributes.

RcDom

The DOM itself; the result of parsing.

UltraMinifyingHtmlSerializer

A serializer that, unlike the one in the html5ever crate (which is private and used via ::html5ever::serialize::serialize()), tries hard to minify HTML and make it compression-friendly. Use this struct directly if you need to serialize multiple nodes or DOMs to one writer, or to control when the output writer is flushed. Otherwise, use the Minify trait.
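To illustrate why a serializer struct helps when several nodes share one writer, here is a rough sketch of the general shape, using only the standard library. `ToySerializer` and its methods are hypothetical and do not reflect the real UltraMinifyingHtmlSerializer API, which is not documented on this page.

```rust
use std::io::{self, BufWriter, Write};

// Hypothetical serializer that owns its writer; not the UltraMinifyingHtmlSerializer API.
struct ToySerializer<W: Write> {
    writer: BufWriter<W>,
}

impl<W: Write> ToySerializer<W> {
    // Wraps the caller's writer in a buffer so that small writes are batched.
    fn new(writer: W) -> Self {
        ToySerializer { writer: BufWriter::new(writer) }
    }

    // Appends one fragment without flushing, so several documents can share the writer.
    fn serialize(&mut self, fragment: &str) -> io::Result<()> {
        self.writer.write_all(fragment.as_bytes())
    }

    // Writes out any buffered bytes and hands the underlying writer back to the caller.
    fn finish(self) -> io::Result<W> {
        self.writer.into_inner().map_err(io::Error::from)
    }
}
```

In this sketch the caller decides when buffered output reaches the underlying writer by choosing when to call `finish`.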

Enums

HtmlError

Represents errors that can occur while loading or minifying HTML.

NodeData

The different kinds of nodes in the DOM.

Traits

Minify

Minifies and serializes an html5ever HTML DOM (RcDom) or node (Rc<Node>, aka Handle).

NodeExt

This trait adds additional methods to an HTML DOM node.

QualNameExt

Additional methods to work with QualName.

RcDomExt

This trait adds additional methods to an HTML DOM.

Selectable

This trait adds methods for finding DOM nodes matching a CSS selector.

TreeSink

Functions

parse_css_selector

Parses a CSS selector.

Type Definitions

LocalName
StrTendril

Tendril for storing native Rust strings.