Crate contextual_encoder

contextual output encoding for XSS defense and safe literal embedding.

this crate provides context-aware encoding functions inspired by the OWASP Java Encoder. each function encodes input for safe embedding in a specific output context: web contexts (HTML, XML, JavaScript, CSS, URI) and literal contexts (JSON, Java, Go, Rust, Ruby, Python, SQL).

disclaimer: contextual-encoder is an independent Rust crate. its API and security model are inspired by the OWASP Java Encoder, but this project is not affiliated with, endorsed by, or maintained by the OWASP Foundation.

§quick start

use contextual_encoder::{for_html, for_javascript, for_css_string, for_uri_component};

let user_input = "<script>alert('xss')</script>";

// safe for HTML text content and quoted attributes
let html_safe = for_html(user_input);
assert!(html_safe.contains("&lt;script&gt;"));

// safe for javascript string literals (universal)
let js_safe = for_javascript(user_input);
assert!(js_safe.contains(r"<\/script>"));

// safe for quoted CSS string values
let css_safe = for_css_string(user_input);
assert!(css_safe.contains(r"\3c"));

// safe as a URI query parameter value
let uri_safe = for_uri_component(user_input);
assert!(uri_safe.contains("%3C"));

§available contexts

§HTML

| function | safe for |
|---|---|
| for_html | text content + quoted attributes |
| for_html_content | text content only |
| for_html_attribute | quoted attributes only |
| for_html_unquoted_attribute | unquoted attribute values |

§XML

| function | safe for |
|---|---|
| for_xml | XML text content + quoted attributes (alias for for_html) |
| for_xml_content | XML text content only (alias for for_html_content) |
| for_xml_attribute | quoted XML attributes only (alias for for_html_attribute) |
| for_xml_comment | XML comment content |
| for_cdata | CDATA section content |

§XML 1.1

| function | safe for |
|---|---|
| for_xml11 | XML 1.1 content + quoted attributes |
| for_xml11_content | XML 1.1 content only |
| for_xml11_attribute | XML 1.1 quoted attributes only |

§JavaScript

| function | safe for |
|---|---|
| for_javascript | general JS string contexts |
| for_javascript_attribute | HTML event attributes |
| for_javascript_block | <script> blocks |
| for_javascript_source | standalone .js files |
| for_js_template | ES6 template literal content (`...`) |

§CSS

| function | safe for |
|---|---|
| for_css_string | quoted CSS string values |
| for_css_url | CSS url() values |

§URI

| function | safe for |
|---|---|
| for_uri_component | URI components (query params, path segments) |
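
to illustrate what encoding a single component means, here is a minimal sketch of RFC 3986 percent-encoding using only the standard library. `percent_encode_component` is a hypothetical helper written for this example. it is not the crate's implementation, and the exact set of characters that for_uri_component leaves unencoded may differ.

```rust
/// Hypothetical helper (not part of contextual_encoder): percent-encodes
/// every byte outside the RFC 3986 "unreserved" set
/// (ASCII letters, digits, '-', '.', '_', '~').
fn percent_encode_component(input: &str) -> String {
    const HEX: &[u8; 16] = b"0123456789ABCDEF";
    let mut out = String::with_capacity(input.len());
    // Work on UTF-8 bytes so multi-byte characters become several %XX triples.
    for &byte in input.as_bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            _ => {
                out.push('%');
                out.push(HEX[(byte >> 4) as usize] as char);
                out.push(HEX[(byte & 0x0F) as usize] as char);
            }
        }
    }
    out
}
```

because delimiters such as `&`, `=`, `/`, and `?` are encoded too, the result can only ever act as one component's data, which is why this style of encoding does not suit a full URL.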

§additional literal contexts

these encoders are not part of the OWASP Java Encoder’s scope. they encode untrusted strings for safe embedding in source code literals.

| function | safe for |
|---|---|
| for_json | JSON string values |
| for_java | Java string / char literals |
| for_go_string | Go interpreted string literals ("...") |
| for_go_char | Go rune literals ('...') |
| for_go_byte_string | Go byte-explicit string literals ([]byte("...")) |
| for_rust_string | Rust string literals ("...") |
| for_rust_char | Rust char literals ('...') |
| for_rust_byte_string | Rust byte string literals (b"...") |
| for_ruby_string | Ruby double-quoted string literals ("...") |
| for_python_string | Python string literals ("..." or '...') |
| for_python_bytes | Python bytes literals (b"..." or b'...') |
| for_python_raw_string | Python raw string literals (r"..." or r'...') |
| for_sql | Standard SQL string literals ('...') |
| for_sql_backslash | MySQL/MariaDB string literals with backslash escaping ('...') |

§security model

this is a contextual output encoder, not a sanitizer. it prevents cross-site scripting by encoding output for specific contexts, but it does not validate or sanitize input.

important caveats:

  • encoding is not sanitization. encoding <script> as &lt;script&gt; makes it display safely in HTML, but does not remove it. if you need to allow a subset of HTML, use a dedicated sanitizer.
  • context matters. using the wrong encoder for a context can leave you vulnerable. for_html_content output is not safe in attributes.
  • tag and attribute names cannot be encoded. never pass untrusted data as a tag name, attribute name, or event handler name. validate these against a whitelist.
  • full URLs must be validated separately. for_uri_component encodes a component, not a full URL. to embed an untrusted URL, validate its scheme and structure first, then encode for the final sink.
  • template literals. the string literal JavaScript encoders do not encode backticks. use for_js_template to embed data directly in ES2015+ template literals.
  • grave accent. unpatched Internet Explorer treats ` as an attribute delimiter. for_html_unquoted_attribute encodes it, but numeric entities decode back to the original character, so this is not a complete fix. avoid unquoted attributes.
  • HTML comments. no HTML comment encoder is provided because HTML comments have vendor-specific extensions (e.g., conditional comments) that make safe encoding impractical. for_xml_comment is for XML comments only.
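
two of the caveats above call for validation rather than encoding: attribute names and URL schemes. the following is a minimal sketch of both checks using only the standard library. `is_allowed_attribute_name` and `has_allowed_scheme` are hypothetical helpers, not part of this crate, and the allowed values are illustrative assumptions. the scheme check covers only the scheme and the leading `//`, not the rest of the URL's structure.

```rust
/// Hypothetical allow-list check for attribute names (not part of
/// contextual_encoder). The list of names is an illustrative assumption.
fn is_allowed_attribute_name(name: &str) -> bool {
    const ALLOWED: [&str; 4] = ["class", "id", "title", "lang"];
    let lower = name.to_ascii_lowercase();
    ALLOWED.contains(&lower.as_str())
}

/// Hypothetical scheme check: accepts only absolute http/https URLs.
/// Anything that does not match exactly is rejected, so inputs such as
/// "javascript:..." or a scheme with stray whitespace fail closed.
fn has_allowed_scheme(url: &str) -> bool {
    match url.split_once(':') {
        Some((scheme, rest)) => {
            let scheme = scheme.to_ascii_lowercase();
            (scheme == "http" || scheme == "https") && rest.starts_with("//")
        }
        // No colon at all: relative reference, not an absolute URL.
        None => false,
    }
}
```

a value that passes these checks still needs encoding for its final sink, for example for_html_attribute when the URL goes into a quoted href attribute.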

§writer-based API

every for_* function has a corresponding write_* function that writes to any std::fmt::Write implementor, avoiding allocation when writing to an existing buffer:

use contextual_encoder::write_html;

let mut buf = String::new();
write_html(&mut buf, "safe & sound").unwrap();
assert_eq!(buf, "safe &amp; sound");
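
as a rough illustration of how a writer-based encoder and its allocating counterpart can relate, here is a standalone sketch built on std::fmt::Write. `write_html_sketch` and `for_html_sketch` are hypothetical names, and the escape table is an assumption chosen for the example. the crate's actual encoders may encode a different set of characters.

```rust
use std::fmt::{self, Write};

/// Hypothetical sketch (not the crate's implementation): streams an
/// HTML-escaped copy of `input` into any `fmt::Write` sink.
fn write_html_sketch<W: Write>(out: &mut W, input: &str) -> fmt::Result {
    for ch in input.chars() {
        match ch {
            '&' => out.write_str("&amp;")?,
            '<' => out.write_str("&lt;")?,
            '>' => out.write_str("&gt;")?,
            '"' => out.write_str("&#34;")?,
            '\'' => out.write_str("&#39;")?,
            other => out.write_char(other)?,
        }
    }
    Ok(())
}

/// Allocating convenience wrapper in the style of a for_* function:
/// it creates the buffer and delegates to the writer-based version.
fn for_html_sketch(input: &str) -> String {
    let mut buf = String::with_capacity(input.len());
    write_html_sketch(&mut buf, input).expect("writing to a String cannot fail");
    buf
}
```

with this layering, a caller that already owns a buffer appends to it directly, and only callers that want a fresh String pay for a new allocation.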

§display wrappers

every for_* function also has a corresponding display_* function that returns a zero-allocation Display wrapper. use these when embedding encoded output inline in format! or write!:

use contextual_encoder::display_html;

let user_input = "<script>alert('xss')</script>";
// one allocation (the final String), zero intermediate allocations
let safe = format!("<p>{}</p>", display_html(user_input));
assert!(safe.contains("&lt;script&gt;"));
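
to show why a Display wrapper needs no intermediate allocation, here is a standalone sketch of the pattern. `HtmlEscaped` is a hypothetical type invented for this example, with an assumed escape table. it is not the type returned by display_html.

```rust
use std::fmt::{self, Display, Formatter, Write};

/// Hypothetical wrapper (not the crate's type): borrows the input and
/// encodes it lazily, only when it is formatted.
struct HtmlEscaped<'a>(&'a str);

impl<'a> Display for HtmlEscaped<'a> {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        // Encoded output goes straight into the formatter's sink,
        // so no temporary String is built for the escaped text.
        for ch in self.0.chars() {
            match ch {
                '&' => f.write_str("&amp;")?,
                '<' => f.write_str("&lt;")?,
                '>' => f.write_str("&gt;")?,
                '"' => f.write_str("&#34;")?,
                '\'' => f.write_str("&#39;")?,
                other => f.write_char(other)?,
            }
        }
        Ok(())
    }
}
```

the wrapper holds only a borrowed &str, so constructing it is free. the work happens inside format! or write!, which own the destination buffer.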

§Re-exports

pub use css::for_css_string;
pub use css::for_css_url;
pub use css::write_css_string;
pub use css::write_css_url;
pub use display::display_cdata;
pub use display::display_css_string;
pub use display::display_css_url;
pub use display::display_go_byte_string;
pub use display::display_go_char;
pub use display::display_go_string;
pub use display::display_html;
pub use display::display_html_attribute;
pub use display::display_html_content;
pub use display::display_html_unquoted_attribute;
pub use display::display_java;
pub use display::display_javascript;
pub use display::display_javascript_attribute;
pub use display::display_javascript_block;
pub use display::display_javascript_source;
pub use display::display_js_template;
pub use display::display_json;
pub use display::display_python_bytes;
pub use display::display_python_raw_string;
pub use display::display_python_string;
pub use display::display_ruby_string;
pub use display::display_rust_byte_string;
pub use display::display_rust_char;
pub use display::display_rust_string;
pub use display::display_sql;
pub use display::display_sql_backslash;
pub use display::display_uri_component;
pub use display::display_xml;
pub use display::display_xml11;
pub use display::display_xml11_attribute;
pub use display::display_xml11_content;
pub use display::display_xml_attribute;
pub use display::display_xml_comment;
pub use display::display_xml_content;
pub use go::for_go_byte_string;
pub use go::for_go_char;
pub use go::for_go_string;
pub use go::write_go_byte_string;
pub use go::write_go_char;
pub use go::write_go_string;
pub use html::for_html;
pub use html::for_html_attribute;
pub use html::for_html_content;
pub use html::for_html_unquoted_attribute;
pub use html::write_html;
pub use html::write_html_attribute;
pub use html::write_html_content;
pub use html::write_html_unquoted_attribute;
pub use java::for_java;
pub use java::write_java;
pub use javascript::for_javascript;
pub use javascript::for_javascript_attribute;
pub use javascript::for_javascript_block;
pub use javascript::for_javascript_source;
pub use javascript::for_js_template;
pub use javascript::write_javascript;
pub use javascript::write_javascript_attribute;
pub use javascript::write_javascript_block;
pub use javascript::write_javascript_source;
pub use javascript::write_js_template;
pub use json::for_json;
pub use json::write_json;
pub use python::for_python_bytes;
pub use python::for_python_raw_string;
pub use python::for_python_string;
pub use python::write_python_bytes;
pub use python::write_python_raw_string;
pub use python::write_python_string;
pub use ruby::for_ruby_string;
pub use ruby::write_ruby_string;
pub use rust::for_rust_byte_string;
pub use rust::for_rust_char;
pub use rust::for_rust_string;
pub use rust::write_rust_byte_string;
pub use rust::write_rust_char;
pub use rust::write_rust_string;
pub use sql::for_sql;
pub use sql::for_sql_backslash;
pub use sql::write_sql;
pub use sql::write_sql_backslash;
pub use uri::for_uri_component;
pub use uri::write_uri_component;
pub use xml::for_cdata;
pub use xml::for_xml;
pub use xml::for_xml11;
pub use xml::for_xml11_attribute;
pub use xml::for_xml11_content;
pub use xml::for_xml_attribute;
pub use xml::for_xml_comment;
pub use xml::for_xml_content;
pub use xml::write_cdata;
pub use xml::write_xml;
pub use xml::write_xml11;
pub use xml::write_xml11_attribute;
pub use xml::write_xml11_content;
pub use xml::write_xml_attribute;
pub use xml::write_xml_comment;
pub use xml::write_xml_content;

§Modules

| module | description |
|---|---|
| css | CSS contextual output encoders. |
| display | zero-allocation Display wrappers for all encoding contexts. |
| go | go literal encoders. |
| html | HTML / XML contextual output encoders. |
| java | java string literal encoder. |
| javascript | javascript contextual output encoders. |
| json | JSON string encoder. |
| python | python literal encoders. |
| ruby | ruby literal encoder. |
| rust | rust literal encoders. |
| sql | SQL string literal encoders. |
| uri | URI component encoder. |
| xml | XML-specific contextual output encoders. |