Expand description
Shell AST front-end for the classifier (pure-Rust, via brush-parser).
The Tier-1 classifier historically tokenized the command line with a small
hand-rolled splitter. That is fast and dependency-light, but it can’t see the
true shell structure — commands hidden inside command substitution $(…) /
backticks, here-documents fed to a shell, or unusual quoting. This module
parses the line into a real bash AST and flattens it to the list of simple
commands it would run, descending into:
- pipelines,
&&/||/;lists, and compound commands (subshells, groups,if/for/while/case/functions), - command substitutions
$(…)and backticks found in any word, - here-document bodies fed to a shell (
bash <<EOF … EOF), - the
-cscript of a shell, andfind -exec/xargspayloads.
The classifier composes this with the existing tokenizer pass worst-wins,
so the AST can only ever add detections (deeper, obfuscated payloads) and
never downgrade a tokenizer verdict. A parse failure yields None — the
caller treats that as “the AST found nothing”, and the tokenizer pass (and
the fail-toward-caution default) still stands.
Structs§
- Analysis
- What the AST pass found: every simple command the line would run (flattened,
including those nested in substitutions / compounds), plus the raw text of
every command substitution
$(…)/ backtick — so the classifier can also whole-line-scan substitution bodies (e.g. acurl … | shhidden in$(…)). - Simple
Cmd - One simple command extracted from the AST: program plus its argument words. Word text is raw (quotes/expansions preserved), exactly as the agent wrote it — the classifier trims quotes where it matters.
Functions§
- analyze
- Parse
rawinto anAnalysis. ReturnsNoneif the line can’t be parsed (caller falls back to the tokenizer pass + the cautious default). - ast_
commands - Just the flattened simple commands (used in tests).