August is a library for converting HTML to plain text.
§Design
The main goal of this library is to provide readable and efficient results when converting HTML emails into text, and it is designed with that in mind. For example:
- There’s no way to reliably convert the output of this program back into HTML. Adding the extra markup for that impedes readability and isn’t useful in an email anyway.
- A fair bit of work is done to make sure that tables are rendered nicely. Emails often use tables for layout because CSS support is patchy.
- We try hard to get whitespace correct so you don’t end up withtextlikethis or like this around element boundaries.
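The whitespace point can be illustrated with a minimal sketch. This is not the library's implementation; `join_fragments` is a hypothetical helper, and it assumes the text of adjacent elements has already been extracted into a list of string fragments. It collapses any run of whitespace, including runs that span a fragment boundary, into a single space, and never invents a space where the source had none.

```rust
/// Hypothetical helper (not part of August): joins text fragments taken from
/// adjacent elements, collapsing whitespace runs that may span a boundary.
fn join_fragments(fragments: &[&str]) -> String {
    let mut out = String::new();
    // True when whitespace was seen since the last emitted character.
    let mut pending_space = false;
    for fragment in fragments {
        for ch in fragment.chars() {
            if ch.is_whitespace() {
                // Remember the gap without emitting it yet; leading
                // whitespace (nothing emitted so far) is dropped.
                pending_space = !out.is_empty();
            } else {
                if pending_space {
                    // Emit exactly one space for the whole run.
                    out.push(' ');
                    pending_space = false;
                }
                out.push(ch);
            }
        }
    }
    // Trailing whitespace is never flushed, so the result ends on text.
    out
}
```

Calling `join_fragments(&["like ", " this"])` yields a single space between the two words, while `join_fragments(&["no", "gap"])` leaves the words joined because the source contained no whitespace between them.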
§Limitations
- Currently we don’t support CSS at all.
- There are a few elements (<bdo>, <sup>, and <sub>) that we should support but don’t.
- We don’t support <ruby> and related elements. Ruby was intentionally designed to fall back gracefully, so that’s probably fine.
§Usage
Just call the convert or convert_io functions.
Functions§
- convert: Converts HTML text into plain text
- convert_dom: Converts a loaded markup5ever DOM into a text string
- convert_dom_io: Take a loaded markup5ever DOM, and send the converted text to an I/O writer
- convert_dom_io_unstyled: Take a loaded markup5ever DOM, and send the converted unstyled text to an I/O writer
- convert_dom_unstyled: Converts a loaded markup5ever DOM into an unstyled text string
- convert_io: Converts HTML text into plain text, using an I/O reader & writer
- convert_unstyled: Converts HTML text into unstyled plain text
Type Aliases§
- Width: Grapheme width of text
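To give a feel for what a text width measured in graphemes is typically used for, here is a rough word-wrapping sketch. It is an illustration only, not August's code: the local `Width` alias is a stand-in whose underlying type (`usize`) is an assumption, `wrap` is a hypothetical function, and the number of `char`s in a word is used as an approximation of its grapheme count (a real implementation would need proper grapheme segmentation).

```rust
/// Stand-in for the crate's `Width` alias; `usize` is an assumption.
type Width = usize;

/// Hypothetical greedy word wrap: packs words into lines no wider than
/// `width`, counting `char`s as an approximation of graphemes.
fn wrap(text: &str, width: Width) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    let mut current_width = 0;
    for word in text.split_whitespace() {
        let word_width = word.chars().count();
        // Start a new line if this word (plus a separating space) would
        // overflow the current one.
        if !current.is_empty() && current_width + 1 + word_width > width {
            lines.push(std::mem::take(&mut current));
            current_width = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_width += 1;
        }
        // A word longer than `width` is kept whole on its own line.
        current.push_str(word);
        current_width += word_width;
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

With a width of 10, `wrap("the quick brown fox", 10)` produces the two lines `the quick` and `brown fox`.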