html5tokenizer
This crate provides the tokenizer from html5ever, repackaged with all of its dependencies removed. The following dependencies were removed:
- markup5ever, buffer_queue and smallcharset were merged into the source code.
- tendril: According to its README it contains "a substantial amount of unsafe code". This fork replaces the tendril strings with plain old std::string::String values.
- mac: The only macros actually needed (format_if and test_eq) were merged into the source code.
- log: Was only used for debug output.
If you want to parse HTML into a tree (DOM), you should by all means use html5ever. This crate is merely for those who only want an HTML5 tokenizer and seek to minimize their compile dependencies (html5ever pulls in 56).
Credits
Thanks to the developers of html5ever for their awesome parser!