Skip to main content

Module normalizer

Module normalizer 

Source
Expand description

Pre-processing deobfuscation normalizer.

Runs before Stage 1 (propose) to catch encoding-evasion attacks that the LLM would not flag because the surface text looks innocuous.

Seven passes in sequence: 0. BiDi control strip — invisible directional override chars

  1. Fullwidth normalize — A..Z, a..z, 0..9 → ASCII
  2. Backslash unescape — \M\y\ \k\e\y → My key
  3. Base64 decode — b64.decode(“…”) and bare base64 chunks
  4. Morse code decode — …. .- -.-. -.- / -.-. .- - → HACK CAT
  5. Homoglyph replace — Cyrillic/Greek confusables → ASCII
  6. Script interference — per-char script-ID forward-vs-reversed diff
  7. Leetspeak normalize — 0→o 1→i 3→e 4→a 5→s @→a !→i within heavy-leet tokens

The normalized text is fed to Stage 1. Detections are merged into the harness trace and consistency flags.

Structs§

Detection
NormalizationResult

Enums§

DetectionKind

Functions§

run
Run all normalizer passes over input and return the cleaned text plus a list of every detected obfuscation event.
summary