Expand description
Pre-processing deobfuscation normalizer.
Runs before Stage 1 (propose) to catch encoding-evasion attacks that the LLM would not flag because the surface text looks innocuous.
Seven passes in sequence: 0. BiDi control strip — invisible directional override chars
- Fullwidth normalize — A..Z, a..z, 0..9 → ASCII
- Backslash unescape — \M\y\ \k\e\y → My key
- Base64 decode — b64.decode(“…”) and bare base64 chunks
- Morse code decode — …. .- -.-. -.- / -.-. .- - → HACK CAT
- Homoglyph replace — Cyrillic/Greek confusables → ASCII
- Script interference — per-char script-ID forward-vs-reversed diff
- Leetspeak normalize — 0→o 1→i 3→e 4→a 5→s @→a !→i within heavy-leet tokens
The normalized text is fed to Stage 1. Detections are merged into the harness trace and consistency flags.