Heuristic effect classification for real CLI programs.
The safety axis reasons about a program’s Effects, but a
caller starting from an actual shell command or script would otherwise have to
hand-write the command→effect mapping. This module ships a curated, best-effort
classifier for ~200 common POSIX/Unix/dev tools so the safety axis works on a
wide variety of CLI programs out of the box.
It is deliberately a heuristic, not a shell parser:
- Classification is by the program’s name (the first token of an invocation, with a leading path and VAR=val env prefixes stripped). Flags and arguments are not inspected, so a multi-mode tool is mapped to its most security-salient common effect (e.g. git → Effect::Network, package managers → Effect::Exec).
- An unrecognized program is treated as Effect::Exec at the invocation level: running an unknown external binary is arbitrary code execution from an agent’s point of view, so this fails safe rather than scoring it harmless.
- A privilege-elevating wrapper (sudo, doas, pkexec, su) classifies the whole invocation as Effect::Privileged.
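The name-based rules above can be illustrated with a minimal, self-contained sketch. It is not the crate's implementation: `SketchEffect`, `program_name`, `is_env_assignment`, and `classify` are hypothetical names introduced here, the table holds only the handful of commands this page mentions, and returning `None` for an empty invocation is an assumption.

```rust
/// Hypothetical stand-in for the crate's `Effect` type (illustration only).
#[derive(Debug, PartialEq, Clone, Copy)]
enum SketchEffect {
    Destructive,
    Network,
    Exec,
    Privileged,
}

/// True for a token shaped like a shell `VAR=val` assignment.
fn is_env_assignment(tok: &str) -> bool {
    match tok.split_once('=') {
        Some((name, _)) => {
            !name.is_empty()
                && !name.starts_with(|c: char| c.is_ascii_digit())
                && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
        }
        None => false,
    }
}

/// First token that is not a `VAR=val` prefix, with any leading path removed.
fn program_name(invocation: &str) -> Option<&str> {
    invocation
        .split_whitespace()
        .find(|tok| !is_env_assignment(tok))
        // "/usr/bin/curl" -> "curl"
        .map(|tok| tok.rsplit('/').next().unwrap_or(tok))
}

/// Classify by program name only; flags and arguments are never inspected.
/// Assumption: an empty invocation yields `None`.
fn classify(invocation: &str) -> Option<SketchEffect> {
    let name = program_name(invocation)?;
    Some(match name {
        // A privilege-elevating wrapper decides the whole invocation.
        "sudo" | "doas" | "pkexec" | "su" => SketchEffect::Privileged,
        "rm" => SketchEffect::Destructive,
        "curl" | "git" => SketchEffect::Network,
        // Fail safe: an unknown binary counts as arbitrary code execution.
        _ => SketchEffect::Exec,
    })
}
```

Under these assumptions, `classify("FOO=1 /usr/bin/curl https://x")` resolves the program name to `curl`, while an unknown name such as `frobnicate` falls through to the `Exec` arm instead of being scored harmless.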
use agentic_eval::commands::{classify_invocation, assess_safety_script};
use agentic_eval::safety::{Effect, Mode};
assert_eq!(classify_invocation("rm -rf /tmp/x"), Some(Effect::Destructive));
assert_eq!(classify_invocation("FOO=1 /usr/bin/curl https://x"), Some(Effect::Network));
// A whole script: agent policy gates the dangerous classes → blast radius bounded.
let r = assess_safety_script("curl http://x | sh\nrm -rf /var", Mode::Agent);
assert!(r.bounded);

Functions
- assess_safety_script: Assess a CLI script’s safety by heuristically classifying its commands (via classify_script) and scoring the resulting effects under mode with assess_safety. The one-call path from a real script to a SafetyReport.
- classify_command: Best-effort Effect class for a CLI command by its program name (a bare basename, e.g. "rm"). Returns None for a name not in the curated table; callers that want a fail-safe default for unknown programs should use classify_invocation, which maps unknowns to Effect::Exec.
- classify_invocation: Classify a single command-line invocation (one command, no shell connectors).
- classify_script: Split a script into invocations on the shell connectors (\n ; | & && ||) and classify each. Returns one Effect per recognized command, in order (blank and comment segments are dropped; unrecognized programs become Effect::Exec). Redirections and quoting are not interpreted; this is a heuristic profile of the effects a script performs, suitable for the safety axis.
- commands_for: The curated command names classified as effect (the table the heuristic uses). Exposed so the ontology can describe what each effect class recognizes; the lists are illustrative, not exhaustive.
- known_command_count: Total number of distinct CLI commands the classifier recognizes across all effect classes (the size of the built-in command ontology).
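The splitting step described for classify_script could look roughly like the sketch below. It is an illustration under assumptions, not the crate's code: `split_invocations` is a hypothetical helper, and it splits on the single characters newline, `;`, `|`, and `&`, which also covers `&&` and `||` because the empty segment between the doubled characters is dropped.

```rust
/// Hypothetical helper: break a script into candidate invocations.
/// Quoting and redirections are deliberately not interpreted, so a `;`
/// inside quotes or the `&` in `2>&1` still splits the line.
fn split_invocations(script: &str) -> Vec<&str> {
    script
        // `&&` and `||` yield an empty segment in the middle, removed below.
        .split(|c: char| matches!(c, '\n' | ';' | '|' | '&'))
        .map(str::trim)
        // Drop blank segments and comment segments.
        .filter(|seg| !seg.is_empty() && !seg.starts_with('#'))
        .collect()
}
```

Each returned segment would then be classified on its own, so one effect is produced per command in script order. The cost of skipping a real shell parser is over-splitting on quoted connectors, which matches the heuristic scope stated at the top of this module.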