pub fn html_entity_variants(payload: &str) -> String
HTML entity encoding with per-character variant rotation.
Cycles each character through four browser-tolerant forms that strict
WAF regexes (which typically anchor on &#x[0-9a-f]+; with a lowercase
x and required ;) miss:
- &#xHH; : canonical lowercase-x hex
- &#XHH; : uppercase-X hex (browsers accept; case-sensitive regex misses)
- &#DD; : decimal
- &#0DD; : decimal with leading zeros (HTML5 spec allows arbitrary leading zeros)
Rotation is by character index (deterministic; same input always produces the same output — important for proptest idempotency).
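The rotation described above can be sketched roughly as follows. This is an illustrative re-implementation, not the crate's actual source: the name `html_entity_variants_sketch`, the case of the hex digits, and the choice of exactly two leading zeros in the fourth form are all assumptions.

```rust
/// Hypothetical sketch of per-character variant rotation.
/// Not the real implementation; digit case and zero padding are assumed.
pub fn html_entity_variants_sketch(payload: &str) -> String {
    let mut out = String::with_capacity(payload.len() * 8);
    for (i, ch) in payload.chars().enumerate() {
        let cp = ch as u32;
        // The variant is chosen purely from the character index, so the
        // same input always yields the same output.
        let encoded = match i % 4 {
            0 => format!("&#x{:x};", cp), // canonical lowercase-x hex
            1 => format!("&#X{:X};", cp), // uppercase-X hex
            2 => format!("&#{};", cp),    // plain decimal
            _ => format!("&#00{};", cp),  // decimal, two leading zeros (assumed)
        };
        out.push_str(&encoded);
    }
    out
}
```

Under these assumptions, the four-character input `<<s>` passes through each variant once, and `<s` uses only the first two.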
Bypass mechanism: a ModSecurity regex like
@rx &#x([0-9a-f]+);.*&#x([0-9a-f]+); won’t match an encoded payload that
decodes to <<s> (the same <s payload routed through all four variants),
because only one of its four entities uses the canonical lowercase-x
form and the rule needs two. The browser decodes all four; the regex
anchored on the canonical form sees a different shape.
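To make the mismatch concrete, here is a small hand-rolled stand-in for the rule above. It is only an approximation for single-line input (it is not ModSecurity and does not use a regex engine), and both function names are hypothetical. The rotated payload literal in the usage note assumes the encoding produced by a rotation with uppercase hex digits in the second form and two leading zeros in the fourth.

```rust
/// Count entities of the exact shape `&#x[0-9a-f]+;` (lowercase x,
/// lowercase hex digits, mandatory semicolon).
fn count_canonical_hex_entities(input: &str) -> usize {
    let bytes = input.as_bytes();
    let mut count = 0;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i..].starts_with(b"&#x") {
            let mut j = i + 3;
            // Consume lowercase hex digits only, as the strict regex does.
            while j < bytes.len() && matches!(bytes[j], b'0'..=b'9' | b'a'..=b'f') {
                j += 1;
            }
            // Require at least one digit and a terminating semicolon.
            if j > i + 3 && j < bytes.len() && bytes[j] == b';' {
                count += 1;
                i = j + 1;
                continue;
            }
        }
        i += 1;
    }
    count
}

/// Approximates `&#x([0-9a-f]+);.*&#x([0-9a-f]+);` on single-line input:
/// the rule needs two canonical entities somewhere in the string.
fn strict_rule_would_match(input: &str) -> bool {
    count_canonical_hex_entities(input) >= 2
}
```

With these helpers, `strict_rule_would_match("&#x3c;&#x3c;&#x73;&#x3e;")` is true for a fully canonical encoding, while `strict_rule_would_match("&#x3c;&#X3C;&#115;&#0062;")` is false because only the first entity has the canonical shape.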
Context: HTML body / attribute. Equivalent to html_entity /
html_entity_decimal for browser decoding; safer against
canonicalising WAFs that strip the trailing ; only on the lowercase
form.
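The decoding equivalence can be checked with a simplified numeric character reference decoder. This is a sketch, not a browser-grade parser: the function name is hypothetical, it only handles references that end in `;`, and it ignores the HTML5 special cases (such as NUL and surrogate handling).

```rust
/// Decode `&#DD;`, `&#xHH;` and `&#XHH;` references; anything that does
/// not parse as a numeric reference is copied through unchanged.
fn decode_numeric_entities(input: &str) -> String {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("&#") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let decoded = after.find(';').and_then(|end| {
            let body = &after[..end];
            // Either case of the hex marker is accepted, as browsers do.
            let code = if let Some(hex) =
                body.strip_prefix('x').or_else(|| body.strip_prefix('X'))
            {
                u32::from_str_radix(hex, 16).ok()?
            } else {
                // Leading zeros are harmless in decimal parsing.
                body.parse::<u32>().ok()?
            };
            char::from_u32(code).map(|c| (c, end))
        });
        match decoded {
            Some((c, end)) => {
                out.push(c);
                rest = &after[end + 1..];
            }
            None => {
                // Not a valid reference: keep the literal text and move on.
                out.push_str("&#");
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}
```

All four variant forms decode to the same characters here, so `decode_numeric_entities("&#x3c;&#X3C;&#115;&#0062;")` yields the same string as decoding the fully canonical hex form.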