Crate html2pango
Library for sanitizing and converting HTML strings to something that Pango can render.
This library contains several functions to (pre)process text to Pango Markup. What to use and when depends on the type of input and the desired result. This can range from just escaping to converting and sanitizing. See the examples below for what is available based on the input type.
The functions below convert strings to strings. If your input can contain several block elements such as headings, lists, code or quote blocks, see the block module to convert an input string into a list of these blocks.
Markdown/body HTML
To handle more HTML, use the markup_html function. This function supports HTML body markup, such as the HTML resulting from a Markdown-to-HTML conversion. It tries to convert the input to Pango Markup so that Pango renders it much as a browser would. This involves adding newlines for paragraphs and lists, converting font styles, etc.
let m = markup_html("<body>this is some <font color=\"#ff0000\">red text</font>!</body>").unwrap();
assert_eq!(m, "this is some <span foreground=\"#ff0000\">red text</span>!");
let m = markup_html("<body>a nice <a href=\"https://gnome.org\">link</a>").unwrap();
assert_eq!(m, "a nice <a href=\"https://gnome.org\">link</a>");
let m = markup_html("<body>some items: <ul><li>first</li><li>second</li></ul><body").unwrap();
assert_eq!(m, "some items: \n • first\n • second\n");
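To illustrate the "adding newlines for lists" step, here is a minimal standalone sketch that renders list items in the same shape as the output above. The function name bullet_list is hypothetical and is not part of html2pango; the crate's real conversion works on parsed HTML and may differ in detail.

```rust
/// Hypothetical helper, not part of html2pango: renders list items as
/// newline-separated bullet lines, mirroring the example output above.
fn bullet_list(items: &[&str]) -> String {
    // A leading newline separates the list from the preceding text.
    let mut out = String::from("\n");
    for item in items {
        out.push_str(" • ");
        out.push_str(item);
        // Each item ends with its own newline.
        out.push('\n');
    }
    out
}
```

Appending bullet_list(&["first", "second"]) to "some items: " gives the same string as the assertion above.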
Escaping
To just escape any HTML reserved characters, use html_escape:
let s = html_escape("this is a <tag> & this is \"quoted text\"");
assert_eq!(s, "this is a &lt;tag&gt; &amp; this is &quot;quoted text&quot;");
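For readers who want to see what this kind of escaping amounts to, here is a minimal standalone sketch. The name escape_reserved is hypothetical, and the set of escaped characters (&, <, >, ") is an assumption; this is not the crate's implementation.

```rust
/// Hypothetical helper, not part of html2pango: replaces the characters
/// that are reserved in HTML (and in Pango Markup) with entities.
fn escape_reserved(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // A single pass over the characters avoids escaping the '&' of an
    // entity that was just emitted.
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(c),
        }
    }
    out
}
```

Note that input that is already escaped gets escaped again (for example, "&amp;" becomes "&amp;amp;"), so a function like this should only be applied to raw text.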
Matrix custom HTML
The Matrix specification defines a custom HTML format that specifies which tags and attributes can be used. Use matrix_html_to_markup to handle this custom HTML input so that it is sanitized before it is converted.
This function is still work-in-progress!
Simple HTML
By simple HTML, we mean plain text that only contains some formatting tags such as <strong>, <i>, <code>, etc.
For the full list of supported tags and how they are replaced, see markup_from_raw.
With sanitization
If you use markup, supported tags are replaced (if necessary), malformed tags are removed and HTML reserved characters are escaped.
let m = markup("<p><strong>this <i>is &sanitized<f;><unsupported/></i></strong></p>");
assert_eq!(m, "<b>this <i>is &amp;sanitized</i></b>");
Other unsupported, but valid tags are escaped.
let m = markup("this is <span>a tag</span>");
assert_eq!(m, "this is &lt;span&gt;a tag&lt;/span&gt;");
URIs are replaced by links.
let m = markup("go to: https://gnome.org");
assert_eq!(m, "go to: <a href=\"https://gnome.org\">https://gnome.org</a>");
Without sanitization
Use markup_from_raw if you have already sanitized input:
let m = markup_from_raw("<p>this is <unsupported>already sanitized</unsupported></p>");
assert_eq!(m, "this is <unsupported>already sanitized</unsupported>");
Links
To just replace URIs by links, use markup_links:
let m = markup_links("go to: https://gnome.org");
assert_eq!(m, "go to: <a href=\"https://gnome.org\">https://gnome.org</a>");
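The idea behind replacing URIs by links can be sketched in plain Rust as follows. The function linkify is hypothetical and deliberately simplistic: it only recognizes whitespace-separated tokens that start with http:// or https://, and it does not escape the URI. It is not the crate's implementation, which may detect URIs differently.

```rust
/// Hypothetical helper, not part of html2pango: wraps whitespace-separated
/// tokens that start with http:// or https:// in an <a> tag.
fn linkify(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // split_inclusive keeps the trailing whitespace character attached to
    // each chunk, so the original spacing is preserved in the output.
    for chunk in input.split_inclusive(char::is_whitespace) {
        let token = chunk.trim_end();
        let spacing = &chunk[token.len()..];
        if token.starts_with("http://") || token.starts_with("https://") {
            out.push_str(&format!("<a href=\"{0}\">{0}</a>", token));
        } else {
            out.push_str(token);
        }
        out.push_str(spacing);
    }
    out
}
```

Under these assumptions, linkify("go to: https://gnome.org") yields the same string as the markup_links example above, while text without URIs passes through unchanged.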
Modules
- block: Module for converting HTML markup to blocks with Pango Markup formatted content.
Functions
- html_escape: Escapes the HTML reserved characters.
- markup: Sanitizes and converts simple HTML markup to Pango Markup.
- markup_from_raw: Converts simple HTML markup to Pango Markup.
- markup_html: Converts HTML body markup to Pango Markup.
- markup_links: Replaces URIs by HTML link tags.
- matrix_html_to_markup: Sanitizes and converts Matrix custom HTML markup (formatted body) to Pango Markup.