pub struct Settings<'h, 's> {
    pub element_content_handlers: Vec<(Cow<'s, Selector>, ElementContentHandlers<'h>)>,
    pub document_content_handlers: Vec<DocumentContentHandlers<'h>>,
    pub encoding: AsciiCompatibleEncoding,
    pub memory_settings: MemorySettings,
    pub strict: bool,
    pub enable_esi_tags: bool,
    pub adjust_charset_on_meta_tag: bool,
}
Specifies settings for HtmlRewriter.
Fields
element_content_handlers: Vec<(Cow<'s, Selector>, ElementContentHandlers<'h>)>
Specifies CSS selectors and rewriting handlers for elements and their inner content.
Hint
The element, comments and text convenience macros can be used to construct a (Selector, ElementContentHandlers) tuple.
Example
use std::borrow::Cow;
use lol_html::{ElementContentHandlers, Settings};

let settings = Settings {
    element_content_handlers: vec![
        (
            Cow::Owned("div[foo]".parse().unwrap()),
            ElementContentHandlers::default().element(|el| {
                // ...
                Ok(())
            })
        ),
        (
            Cow::Owned("body".parse().unwrap()),
            ElementContentHandlers::default().comments(|c| {
                // ...
                Ok(())
            })
        )
    ],
    ..Settings::default()
};
document_content_handlers: Vec<DocumentContentHandlers<'h>>
Specifies rewriting handlers for the content without associating it to a particular CSS selector.
Refer to the DocumentContentHandlers documentation for more information.
Hint
The doctype, doc_comments and doc_text convenience macros can be used to construct items of this vector.
encoding: AsciiCompatibleEncoding
Specifies the character encoding for the input and the output of the rewriter.
Can be a label for any of the web-compatible encodings except UTF-16LE, UTF-16BE, ISO-2022-JP and replacement (these non-ASCII-compatible encodings are not supported).
Default
"utf-8" when constructed with Settings::default().
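The restriction above can be checked before a value ever reaches the rewriter. The following is a minimal, standard-library-only sketch: the constant UNSUPPORTED_ENCODINGS and the helper is_unsupported_encoding_name are hypothetical names introduced here, not part of lol_html, and the check compares only against the four encoding names listed above, not against every alias label that may map to them.

```rust
/// The four encoding names this section lists as unsupported
/// because they are not ASCII-compatible.
/// (Hypothetical constant for illustration, not part of lol_html.)
const UNSUPPORTED_ENCODINGS: [&str; 4] = ["UTF-16LE", "UTF-16BE", "ISO-2022-JP", "replacement"];

/// Returns true if `name` is one of the four names above.
/// The comparison ignores ASCII case and surrounding whitespace.
/// Assumption: alias labels that resolve to these encodings are NOT detected.
fn is_unsupported_encoding_name(name: &str) -> bool {
    let name = name.trim();
    UNSUPPORTED_ENCODINGS
        .iter()
        .any(|unsupported| unsupported.eq_ignore_ascii_case(name))
}
```

A caller could run such a check on a charset name taken from configuration or a response header and fall back to the default "utf-8" when it returns true.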
memory_settings: MemorySettings
Specifies the memory settings.
strict: bool
If set to true, the rewriter bails out if it encounters markup that drives the HTML parser into an ambiguous state.
Since the rewriter operates on a token stream and doesn’t have access to a full DOM tree, there are certain rare cases of non-conforming HTML markup which can’t be guaranteed to be parsed correctly without the ability to backtrack through the tree.
Therefore, due to security considerations, sometimes it’s preferable to abort the rewriting process in case of such uncertainty.
One of the simplest examples of such markup is the following:
...
<select><xmp><script>"use strict";</script></select>
...
The <xmp> element is not allowed inside the <select> element, so in a browser the start tag for <xmp> will be ignored and the following <script> element will be parsed and executed.
On the other hand, the <select> element itself can also be ignored depending on the context in which it was parsed. In this case, the <xmp> element will not be ignored and the <script> element, along with its content, will be parsed as plain text inside it.
So, in this case the parser needs the ability to backtrack through the DOM tree to figure out the correct parsing context.
Default
true when constructed with Settings::default().
adjust_charset_on_meta_tag: bool
If enabled, the rewriter will dynamically change the charset when it encounters a meta tag that specifies the charset.
The charset can be modified by the meta tag with
<meta charset="windows-1251">
or
<meta http-equiv="content-type" content="text/html; charset=windows-1251">
Note that an explicit charset in the Content-Type header should take precedence over the meta tag, so only enable this if the content type does not explicitly specify a charset. For details check this.
Default
false when constructed with Settings::default().
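The precedence rule for adjust_charset_on_meta_tag can be sketched as a small decision helper. This is a standard-library-only illustration under stated assumptions: charset_from_content_type and should_adjust_charset_on_meta_tag are hypothetical helpers, not lol_html APIs, and the header parsing is deliberately simplified (it splits on ';' and '=' and does not handle every quoting rule of real Content-Type values).

```rust
/// Extracts the `charset` parameter from a Content-Type header value,
/// e.g. "text/html; charset=windows-1251" yields Some("windows-1251").
/// (Hypothetical helper; simplified parsing, not a full header parser.)
fn charset_from_content_type(value: &str) -> Option<String> {
    value
        .split(';')
        .skip(1) // the first segment is the media type itself
        .filter_map(|param| {
            let (name, val) = param.split_once('=')?;
            if !name.trim().eq_ignore_ascii_case("charset") {
                return None;
            }
            // Strip whitespace and optional surrounding double quotes.
            let val = val.trim().trim_matches('"');
            if val.is_empty() {
                None
            } else {
                Some(val.to_ascii_lowercase())
            }
        })
        .next()
}

/// Decides whether it is reasonable to enable `adjust_charset_on_meta_tag`:
/// only when the Content-Type header is absent or carries no explicit charset.
fn should_adjust_charset_on_meta_tag(content_type: Option<&str>) -> bool {
    content_type.and_then(charset_from_content_type).is_none()
}
```

The result of should_adjust_charset_on_meta_tag could then be assigned to the adjust_charset_on_meta_tag field when building Settings, so that a charset declared by the header is never overridden by a meta tag.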