Module parse

Shell AST front-end for the classifier (pure-Rust, via brush-parser).

The Tier-1 classifier historically tokenized the command line with a small hand-rolled splitter. That is fast and dependency-light, but it can’t see the true shell structure — commands hidden inside command substitution $(…) / backticks, here-documents fed to a shell, or unusual quoting. This module parses the line into a real bash AST and flattens it to the list of simple commands it would run, descending into:

pipelines, &&/||/; lists, and compound commands (subshells, groups, if/for/while/case/functions),
command substitutions $(…) and backticks found in any word,
here-document bodies fed to a shell (bash <<EOF … EOF),
the -c script of a shell, and find -exec / xargs payloads.
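The descent described above can be sketched as a recursive walk. This is a minimal illustration only: `Node`, `Word`, `word`, and `flatten` are hypothetical names invented for this sketch and are not brush-parser's AST types or this module's actual code. The toy grammar folds subshells, groups, and control-flow bodies into a single `Compound` variant, and assumes substitution bodies have already been parsed.

```rust
/// Toy shell AST (hypothetical; far smaller than a real bash grammar).
#[derive(Debug, Clone, PartialEq)]
enum Node {
    /// A simple command: program followed by its argument words.
    Simple(Vec<Word>),
    /// `a | b | c`
    Pipeline(Vec<Node>),
    /// `a && b`, `a || b`, `a ; b`
    List(Vec<Node>),
    /// Subshells, groups, and if/for/while/case/function bodies.
    Compound(Vec<Node>),
}

/// One word of a simple command (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Word {
    /// Raw text, quotes and expansions preserved.
    text: String,
    /// Already-parsed bodies of any `$(...)` or backticks in this word.
    substitutions: Vec<Node>,
}

/// Convenience constructor for a word with no substitutions.
fn word(text: &str) -> Word {
    Word { text: text.to_string(), substitutions: Vec::new() }
}

/// Append every simple command reachable from `node` to `out`,
/// descending into containers and into substitutions inside words.
fn flatten(node: &Node, out: &mut Vec<Vec<String>>) {
    match node {
        Node::Simple(words) => {
            // Record the command itself, raw word text included.
            out.push(words.iter().map(|w| w.text.clone()).collect());
            // Then descend into any substitution hidden in its words.
            for w in words {
                for sub in &w.substitutions {
                    flatten(sub, out);
                }
            }
        }
        Node::Pipeline(children) | Node::List(children) | Node::Compound(children) => {
            for child in children {
                flatten(child, out);
            }
        }
    }
}
```

In this sketch, flattening a tree for `echo $(curl x | sh)` records the outer `echo` command first, then the `curl` and `sh` commands found inside the substitution.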

The classifier combines this pass with the existing tokenizer pass on a worst-wins basis: the more severe of the two verdicts stands, so the AST can only ever add detections (deeper, obfuscated payloads) and never downgrade a tokenizer verdict. A parse failure yields None; the caller treats that as “the AST found nothing”, and the tokenizer pass (and the fail-toward-caution default) still stands.
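One way to picture the worst-wins rule is as a maximum over an ordered severity scale. The sketch below is an assumption for illustration: the `Verdict` enum, its three levels, and the `combine` function are hypothetical and do not appear in this module's API.

```rust
/// Hypothetical severity scale; the derived `Ord` follows declaration
/// order, so `Safe < Caution < Dangerous`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Verdict {
    Safe,
    Caution,
    Dangerous,
}

/// Worst-wins composition (hypothetical helper).
/// `ast` is `None` when the line could not be parsed, in which case
/// the tokenizer verdict stands unchanged.
fn combine(tokenizer: Verdict, ast: Option<Verdict>) -> Verdict {
    match ast {
        // Taking the maximum means the AST pass can raise severity
        // but can never lower what the tokenizer already reported.
        Some(v) => tokenizer.max(v),
        None => tokenizer,
    }
}
```

Because the result is never lower than the tokenizer verdict, a missing or overly lenient AST result cannot weaken the outcome.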

Structs§

Analysis: What the AST pass found: every simple command the line would run (flattened, including those nested in substitutions / compounds), plus the raw text of every command substitution $(…) / backtick — so the classifier can also whole-line-scan substitution bodies (e.g. a curl … | sh hidden in $(…)).
SimpleCmd: One simple command extracted from the AST: program plus its argument words. Word text is raw (quotes/expansions preserved), exactly as the agent wrote it — the classifier trims quotes where it matters.
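Since word text is kept raw, a consumer has to strip quoting itself before comparing a word against a program name. The helper below is a rough sketch of that step under a simplifying assumption: it removes only one matching pair of surrounding quotes and does not handle escapes or mixed quoting. The name `trim_outer_quotes` is hypothetical and is not taken from the classifier.

```rust
/// Strip one matching pair of surrounding single or double quotes.
/// Hypothetical helper; real shell quoting rules are more involved.
fn trim_outer_quotes(word: &str) -> &str {
    for q in ['"', '\''] {
        // Require at least two bytes so a lone quote is left untouched.
        if word.len() >= 2 && word.starts_with(q) && word.ends_with(q) {
            // Both quote characters are one byte wide, so byte slicing is safe.
            return &word[1..word.len() - 1];
        }
    }
    word
}
```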

Functions§

analyze: Parse raw into an Analysis. Returns None if the line can’t be parsed (caller falls back to the tokenizer pass + the cautious default).
ast_commands: Just the flattened simple commands (used in tests).
