Crate wafrift_wafmodel

wafrift-wafmodel: the WAF decompiler

Stop searching a black box. Reconstruct the WAF’s decision boundary as an executable symbolic automaton, then turn evasion from search into deduction:

  • P1 — Decompile. Active-learn the WAF (the learn module) over a WafOracle into an Sfa, spending the minimum membership-query budget. Emit it as a provenance-stamped artifact.
  • P1 — Mine. Intersect the learned pass-language with an attack grammar offline to harvest minimal-edit bypasses with no further live queries.
  • P2 — Solve. Compose the learned WAF view with the pipeline’s normalization transducers and solve for inputs that survive every stage (the double-decode trick, rediscovered — not hard-coded).
  • P3 — Dominate. The same model drives constrained adversarial evasion of ML-WAFs and provable hole-closure for defenders.
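
The double-decode trick named in the Solve step can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not the crate's solver: `percent_decode_once`, `toy_waf_passes`, and `toy_origin_view` are hypothetical helpers written for this example (the crate's own `url_decode_once` may behave differently), the toy WAF is assumed to decode its input once and block on a literal `<`, and the toy origin is assumed to decode twice.

```rust
/// Map one ASCII hex digit to its value (hypothetical helper for this sketch).
fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Remove exactly one layer of percent-encoding.
/// Malformed escapes (e.g. `%zz` or a trailing `%4`) are copied through unchanged.
fn percent_decode_once(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    let mut i = 0;
    while i < input.len() {
        if input[i] == b'%' && i + 2 < input.len() {
            if let (Some(hi), Some(lo)) = (hex_val(input[i + 1]), hex_val(input[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(input[i]);
        i += 1;
    }
    out
}

/// Toy WAF (assumption): inspects the once-decoded view and blocks any `<`.
fn toy_waf_passes(raw: &[u8]) -> bool {
    !percent_decode_once(raw).contains(&b'<')
}

/// Toy origin (assumption): decodes twice before using the value.
fn toy_origin_view(raw: &[u8]) -> Vec<u8> {
    percent_decode_once(&percent_decode_once(raw))
}
```

With these toy stages, `%3Cscript%3E` is blocked because the WAF's single decode exposes `<`, while `%253Cscript%253E` passes the WAF as `%3Cscript%3E` and only becomes `<script>` at the origin. A solver that composes the WAF view with the origin's transducers can derive such an input from the composition rather than from a hand-written encoding rule.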

Everything here is zero-config and pure-Rust: no GPU, no external Coraza, no network required for the core. Acceleration (vyre/GPU, live HTTP oracles) is strictly additive.

The crate is built bottom-up; each module is landed complete (no stubs) before the next depends on it. This file only declares modules that are fully implemented.

Re-exports

pub use artifact::LearnedModel;
pub use artifact::Provenance;
pub use booster::WafBoosterScorer;
pub use canon::CanonView;
pub use canon::Channel;
pub use canon::Segment;
pub use canon::canonicalize;
pub use ensemble_dilution::RuleGroup;
pub use equiv_bridge::norm_mismatch_members;
pub use equiv_bridge::sink_for_tag;
pub use equiv_bridge::solution_member;
pub use equiv_query::ChainedEq;
pub use equiv_query::PacBound;
pub use equiv_query::SampledEq;
pub use equiv_query::UcbBanditEq;
pub use equiv_query::WMethodEq;
pub use error::Result;
pub use error::WafModelError;
pub use filter_profile::DecodeGap;
pub use filter_profile::FilterProfile;
pub use filter_profile::TokenFinding;
pub use filter_profile::TokenProbe;
pub use filter_profile::Verdict;
pub use filter_profile::battery_from_toml;
pub use filter_profile::characterize;
pub use filter_profile::default_battery as default_filter_battery;
pub use filter_profile::probe_decode_gaps;
pub use harden::ClosureReport;
pub use harden::synthesize_closure;
pub use learn::Alphabet;
pub use learn::BoundedExhaustiveEq;
pub use learn::EquivalenceOracle;
pub use learn::LearnReport;
pub use learn::kv_learn;
pub use learn::l_star;
pub use learn::l_star_budgeted;
pub use learn::passive_learn;
pub use mine::attack_grammar;
pub use mine::mine_bypasses;
pub use mine::minimal_bypass;
pub use mine::waf_diff;
pub use mlwaf::MlEvasion;
pub use mlwaf::MlWaf;
pub use mlwaf::evade_ml;
pub use normalize::Transform;
pub use normalize::apply_chain;
pub use oracle::ChannelSet;
pub use oracle::FnOracle;
pub use oracle::Rule;
pub use oracle::SimRegexWaf;
pub use oracle::WafOracle;
pub use origin_probe::FnReflector;
pub use origin_probe::OriginScan;
pub use origin_probe::ReflectionOracle;
pub use origin_probe::detect_origin_normalization;
pub use origin_probe::scan_origin;
pub use outcome::Outcome;
pub use sfa::BytePred;
pub use sfa::Sfa;
pub use sfa::StateId;
pub use solve::Solution;
pub use solve::solve_bypass;
pub use transduce::Pipeline;
pub use transduce::Stage;
pub use transduce::json_unescape;
pub use transduce::url_decode_once;

Modules

artifact
The learned-model artifact: a decompiled WAF, serialized.
booster
WAFBooster importance scoring (paper: “WAFBooster: Automatic Boosting of
canon
Canonicalize a wafrift_types::Request into the ordered set of byte-segments a CRS-class WAF actually inspects.
ensemble_dilution
#101 Multi-sub-score ensemble dilution.
equiv_bridge
Zero-downstream-change bridge: solver/preimage output expressed as the canonical EquivPayload the rest of the ecosystem already consumes.
equiv_query
Query-economical equivalence oracles — the strategies that make decompiling a live WAF affordable.
error
Error type for the WAF-decompilation engine.
filter_profile
Filter characterization: learn WHICH attack tokens a live WAF actually polices, by differential probing.
harden
The defensive dual: from a decompiled WAF’s holes, synthesize the minimal rules that close them, prove zero new false positives against a benign corpus, and prove the class is closed (no attack-grammar member survives the hardened config).
learn
Active automaton learning over the WAF oracle.
mine
Offline bypass mining over the decompiled model.
mlwaf
Constrained black-box evasion of ML-WAFs.
normalize
CRS input transformations — the decoding a ModSecurity/Coraza-class WAF applies to a variable before matching a rule against it (t:urlDecodeUni, t:htmlEntityDecode, t:lowercase, …).
oracle
The oracle the learner queries: does this request reach the app?
origin_probe
Origin-normalization fingerprinting — measure which decode/normalize stages a target’s origin applies, so the P2 solver TARGETS its preimage to the real pipeline instead of speculatively trying every canonical sink.
outcome
The two-valued projection of a WAF verdict that the learner reasons over.
sfa
Symbolic finite automata over the byte domain.
solve
Composition + preimage solver — the part that turns “encoding tricks” from hand-written rules into emergent solutions.
transduce
Pipeline-stage transducers.
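
To make the sfa entry above concrete: a symbolic finite automaton labels each transition with a predicate over bytes instead of a single byte, so one edge can cover a whole range of the 256-value alphabet. The sketch below is an illustration only. `ByteRange`, `ToySfa`, and `no_angle_bracket` are hypothetical names invented for this example and are not the crate's `BytePred` or `Sfa` types, whose actual definitions are not shown on this page.

```rust
/// Hypothetical predicate: an inclusive byte range.
/// One symbolic edge stands for every byte in the range.
#[derive(Clone, Copy)]
struct ByteRange {
    lo: u8,
    hi: u8,
}

impl ByteRange {
    fn matches(self, b: u8) -> bool {
        self.lo <= b && b <= self.hi
    }
}

/// Hypothetical toy automaton; state 0 is the start state.
struct ToySfa {
    /// transitions[state] lists (predicate, target state) pairs.
    transitions: Vec<Vec<(ByteRange, usize)>>,
    /// accepting[state] is true when ending in that state means "passes".
    accepting: Vec<bool>,
}

impl ToySfa {
    /// Run the input from state 0; a byte with no matching edge rejects.
    fn accepts(&self, input: &[u8]) -> bool {
        let mut state = 0;
        for &b in input {
            match self.transitions[state].iter().find(|(p, _)| p.matches(b)) {
                Some(&(_, next)) => state = next,
                None => return false,
            }
        }
        self.accepting[state]
    }
}

/// Pass-language of a toy filter that blocks any input containing `<` (0x3C):
/// state 0 means "no `<` seen yet" (accepting), state 1 is a blocking sink.
fn no_angle_bracket() -> ToySfa {
    ToySfa {
        transitions: vec![
            vec![
                (ByteRange { lo: 0x00, hi: 0x3B }, 0),
                (ByteRange { lo: 0x3C, hi: 0x3C }, 1),
                (ByteRange { lo: 0x3D, hi: 0xFF }, 0),
            ],
            vec![(ByteRange { lo: 0x00, hi: 0xFF }, 1)],
        ],
        accepting: vec![true, false],
    }
}
```

Three range-labeled edges out of state 0 cover all 256 byte values here, where an explicit byte-by-byte automaton would need 256. That compactness is the usual motivation for a symbolic representation when the alphabet is the full byte domain.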

Functions

default_crs_ruleset
The shipped Tier-B OWASP-CRS-derived ruleset (XSS 941 + SQLi 942), embedded so wafrift audit/harden work zero-config with no files to fetch. Parse with SimRegexWaf::from_toml.