Convert HTML to text formats.
This crate renders HTML into a text format, wrapped to a specified width. The output can be either plain text or text with extra annotations, for example to display in a terminal that supports colours.
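To make "wrapped to a specified width" concrete, the following standalone sketch wraps bullet items greedily to a column limit. It does not use this crate and is not its implementation; `wrap_item` is a hypothetical helper, and the two-space continuation indent is an assumption made for illustration.

```rust
/// Hypothetical helper (not part of html2text): greedily wrap one list item
/// to `width` columns, prefixing the first line with "* " and indenting
/// continuation lines by two spaces.
fn wrap_item(text: &str, width: usize) -> Vec<String> {
    let prefix_len = 2;
    // Assumption: keep at least one column for text, even for tiny widths.
    let avail = width.saturating_sub(prefix_len).max(1);
    let mut lines: Vec<String> = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        let cur_len = current.chars().count();
        // Start a new line when the next word would not fit.
        // A single word longer than `avail` is left to overflow its line.
        if !current.is_empty() && cur_len + 1 + word_len > avail {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
        .into_iter()
        .enumerate()
        .map(|(i, l)| if i == 0 { format!("* {}", l) } else { format!("  {}", l) })
        .collect()
}

fn main() {
    for item in ["Item one", "Item two", "A longer item that needs wrapping"] {
        for line in wrap_item(item, 20) {
            println!("{}", line);
        }
    }
}
```

Short items fit on one line at width 20, while the longer item is split at a word boundary so that no line exceeds the limit.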
§Examples
use html2text::from_read;

// Render a small HTML list as plain text, wrapped to 20 columns.
let html = b"
<ul>
<li>Item one</li>
<li>Item two</li>
<li>Item three</li>
</ul>";
assert_eq!(from_read(&html[..], 20),
"\
* Item one
* Item two
* Item three
");
A couple of simple demonstration programs are included as examples:
§html2text
The simplest example uses from_read to convert HTML on stdin into plain text:
$ cargo run --example html2text < foo.html
[...]
§html2term
A very simple example of using the rich interface (from_read_rich) for a slightly interactive console HTML viewer is provided as html2term.
$ cargo run --example html2term foo.html
[...]
Note that this example takes the HTML file as a parameter so that it can read keys from stdin.
Modules§
- Configure the HTML to text translation using the Config type, which can be constructed using one of the functions in this module.
- Module containing the Renderer interface for constructing a particular text output.
Structs§
- An RGB colour value
- The DOM itself; the result of parsing.
- Common fields from a node.
- A representation of a table render tree with metadata.
- Render tree table cell
- Render tree table row
- The structure of an HTML document that can be rendered using a TextDecorator.
- A rendered HTML document.
- Size information/estimate
Enums§
- Errors from reading or rendering HTML
- The node-specific information distilled from the DOM.
Functions§
- Convert a DOM tree or subtree into a render tree.
- Reads HTML from input, and returns a String with text wrapped to width columns.
- Reads HTML from input, and returns text wrapped to width columns. The text is returned as a Vec<TaggedLine<_>>; the annotations are vectors of RichAnnotation. The “outer” annotation comes first in the Vec.
- Reads HTML from input, and returns text wrapped to width columns. The text is returned as a Vec<TaggedLine<_>>; the annotations are vectors of RichAnnotation. The “outer” annotation comes first in the Vec.
- Reads HTML from input, decorates it using decorator, and returns a String with text wrapped to width columns.
- Reads and parses HTML from input and prepares a render tree.
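
The "outer annotation comes first" ordering used by the rich output can be illustrated with a toy model. This is only a sketch: `Annotation`, `TaggedSpan`, `render_span` and `render_line` are hypothetical stand-ins written for this example, not the crate's own TaggedLine or RichAnnotation types, and the marker syntax is an arbitrary choice.

```rust
// Toy model only: these types are hypothetical stand-ins, not html2text's
// own TaggedLine or RichAnnotation definitions.
#[derive(Debug, Clone, PartialEq)]
enum Annotation {
    Emphasis,
    Strong,
    Link(String),
}

/// A span of text plus its annotations, outermost first.
struct TaggedSpan {
    text: String,
    tags: Vec<Annotation>,
}

/// Wrap the text in simple markers, applying the innermost annotation first
/// so that the outermost annotation (index 0) ends up on the outside.
fn render_span(span: &TaggedSpan) -> String {
    let mut out = span.text.clone();
    for tag in span.tags.iter().rev() {
        out = match tag {
            Annotation::Emphasis => format!("*{}*", out),
            Annotation::Strong => format!("**{}**", out),
            Annotation::Link(url) => format!("[{}]({})", out, url),
        };
    }
    out
}

/// Concatenate the rendered spans of one line.
fn render_line(line: &[TaggedSpan]) -> String {
    line.iter().map(render_span).collect()
}

fn main() {
    let line = vec![
        TaggedSpan { text: "See ".to_string(), tags: vec![] },
        TaggedSpan {
            text: "the docs".to_string(),
            tags: vec![
                Annotation::Link("https://example.com".to_string()),
                Annotation::Strong,
            ],
        },
    ];
    println!("{}", render_line(&line));
}
```

Because the link is the outer annotation and the strong marker the inner one, the strong markers end up nested inside the link text. A consumer of the real rich output would walk its annotation vectors in the same outer-to-inner order.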