compression — request-body compression as a WAF-evasion surface.
§The attack
Many WAFs in production today inspect the raw request bytes, NOT
the decompressed payload, for at least some encodings. The
reasoning is operational: a WAF that decompresses inbound bodies
pays the CPU cost of decompression on every request, and many
vendors choose to skip that, either entirely or selectively per
Content-Encoding algorithm.
That choice is the seam this module exploits:
- Content-Encoding: gzip is the universal case; nearly all WAFs decompress it. Useful as the baseline and as a chain ingredient.
- Content-Encoding: deflate is RFC-permitted but irregularly supported: many WAFs that handle gzip return 400 on a deflate-coded body. The origin (nginx, IIS, Apache, Node, PHP-FPM, anything using zlib) accepts both.
- Content-Encoding: br (Brotli) is where the seam is widest. Brotli requires a separate decompressor (not zlib). Many WAFs ship no brotli support at all: they either return 415 (and the operator avoids br), or worse, they pass the request through uninspected because their rule engine has nothing to match against. Origins ARE brotli-capable (Chrome 49+, Firefox 44+, nginx 1.11+ with the brotli module). Wrap a payload in brotli and the rule corpus that fires on the plain payload bytes never gets a chance to match.
§Chained encoding
Encoding-chain attacks add layers (e.g. gzip → base64 → urlenc).
The WAF, which normalises only a fixed number of decode passes
(usually 1, sometimes 2), stops short of the original payload —
while the origin’s parser stack (which decodes more layers as
Content-Type / Content-Encoding direct) reaches it. chain is
the primitive for this attack.
§Pristine code
- Every public function returns Result<_, CompressionError>; no unwrap() is reachable on bad input.
- The chain function caps at 16 layers so a misconfiguration (gzip,gzip,gzip,...) can’t run away.
- Empty body is permitted and returns the compressor’s empty-stream output (gzip has a 10-byte header even for empty input, brotli is similar).
- No allocation beyond what each encoder requires; the public API takes a borrowed slice, not an owned Vec.
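The error-returning style and the layer cap listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the module's actual code: the ChainTooLong variant, its fields, and the check_chain_depth helper are hypothetical names chosen for illustration. Only CompressionError, MAX_CHAIN_LAYERS, and the limit of 16 come from this page.

```rust
/// Layer cap as documented on this page.
const MAX_CHAIN_LAYERS: usize = 16;

/// Hypothetical shape of the error type; the real enum also wraps
/// encoder failures, which are omitted here.
#[derive(Debug, PartialEq, Eq)]
enum CompressionError {
    /// Assumed variant name for the chain-depth cap.
    ChainTooLong { requested: usize, max: usize },
}

/// Hypothetical helper: rejects a chain longer than the cap without
/// panicking. Takes a borrowed slice, so the caller keeps ownership.
fn check_chain_depth(layers: &[&str]) -> Result<(), CompressionError> {
    if layers.len() > MAX_CHAIN_LAYERS {
        return Err(CompressionError::ChainTooLong {
            requested: layers.len(),
            max: MAX_CHAIN_LAYERS,
        });
    }
    Ok(())
}
```

Returning an error value rather than panicking lets a caller surface a misconfigured chain to the operator instead of aborting the run.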
Structs§
- CompressedBody
- A compressed body with its Content-Encoding header value. The caller writes the body bytes onto the wire verbatim and sets the header; both are required, and a mismatched pairing is a debugging nightmare for the operator if we let it happen.
Enums§
- Algorithm
- One compression algorithm. The naming matches the HTTP Content-Encoding registry value (lowercase, no padding).
- CompressionError
- Errors raised by the compression-confusion API. Wraps the underlying encoder failures (rare for in-memory operations) plus the chain-depth cap.
Constants§
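- DECOMPRESSED_BODY_MAX_BYTES
- Hard cap on decoded body size; defends against decompression bombs. A single DEFLATE layer can expand by at most roughly 1000:1, so a 10 MB malicious gzip can decompress to about 10 GB if read without bounds, and chained layers can compound that.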
- MAX_CHAIN_LAYERS
- Hard cap on chain layers. Anything longer is almost certainly a misconfiguration, and the compressed-output size would balloon from header overhead per layer. 16 is generous: real attacks use 2–3 layers.
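One common way to enforce a decoded-size cap is to wrap the decoder in a reader that stops one byte past the limit. The sketch below uses only the standard library and is an illustration, not this module's implementation: read_bounded and its max_bytes parameter are hypothetical, the actual value of DECOMPRESSED_BODY_MAX_BYTES is not stated on this page, and any Read implementation stands in for a real gzip or brotli decoder.

```rust
use std::io::{self, Read};

/// Hypothetical helper: drain `decoder` into memory, but fail once the
/// decoded size exceeds `max_bytes`. Memory use is bounded by the cap
/// plus one byte regardless of how much the input would expand.
fn read_bounded<R: Read>(decoder: R, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // Allow one byte beyond the cap so "exactly at the cap" can be
    // told apart from "over the cap".
    let limit = max_bytes.saturating_add(1);
    decoder.take(limit).read_to_end(&mut out)?;
    if out.len() as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "decoded body exceeds size cap",
        ));
    }
    Ok(out)
}
```

Passing an endless source such as std::io::repeat(0) returns an error after reading only max_bytes + 1 bytes, which is the behaviour a decompression bomb should trigger.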
Functions§
- chain
- Apply a sequence of compression algorithms in order, producing one set of body bytes + the joint Content-Encoding header.
- compress
- Compress body with a single algorithm. Returns the raw compressed bytes + the matching Content-Encoding header value.
- decompress
- Recover the original bytes from a CompressedBody, the inverse of compress / chain. Test-only and audit helper; production attack flow only needs the compress direction.
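HTTP lists content codings in the order they were applied, so the joint header for a chain is the per-layer values joined with commas. The sketch below shows that mapping only. The variant names, header_value, and joint_content_encoding are assumptions for illustration, and the real Algorithm enum and chain function may differ.

```rust
/// Hypothetical mirror of the Algorithm enum; variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Algorithm {
    Gzip,
    Deflate,
    Brotli,
}

impl Algorithm {
    /// The Content-Encoding registry token for this algorithm.
    fn header_value(self) -> &'static str {
        match self {
            Algorithm::Gzip => "gzip",
            Algorithm::Deflate => "deflate",
            Algorithm::Brotli => "br",
        }
    }
}

/// Hypothetical helper: build the joint header value for a chain,
/// listing codings in the order they were applied.
fn joint_content_encoding(layers: &[Algorithm]) -> String {
    layers
        .iter()
        .map(|a| a.header_value())
        .collect::<Vec<_>>()
        .join(", ")
}
```

A receiver decodes in reverse order of the list, so the last token names the outermost layer of the body bytes.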