pub struct Rule {
    pub id: String,
    pub category: ThreatCategory,
    pub severity: Severity,
    pub confidence: f32,
    pub condition: RuleCondition,
    pub action: RecommendedAction,
    pub reason: String,
    pub shield: Option<ShieldHint>,
    pub enabled: bool,
    pub tags: Vec<String>,
    pub promptintel_threats: Vec<String>,
    pub requires_code_artifact: bool,
    pub downgrade_when_confirmation_gate: bool,
    pub downgrade_when_documentation_context: bool,
}
A security detection rule
Rules define security patterns to detect in skill documents. Each rule specifies a condition to match, the threat category, severity level, and recommended action when matched.
Rules are typically defined in YAML format and loaded by the super::RuleEngine.
Fields
id: String
Unique rule identifier
category: ThreatCategory
Threat category
severity: Severity
Severity level
confidence: f32
Confidence score (0.0 - 1.0)
condition: RuleCondition
Condition that triggers the rule
action: RecommendedAction
Recommended action
reason: String
Human-readable reason
shield: Option<ShieldHint>
Shield policy hint
enabled: bool
Whether the rule is enabled
tags: Vec<String>
Tags for filtering
promptintel_threats: Vec<String>
Optional list of upstream PromptIntel threat names this rule covers (e.g. ["Jailbreak", "Hidden instruction in code or comments"]). Used by the promptintel coverage command to build a per-threat audit table; left empty for rules that do not target prompt-layer attacks. Validation against the canonical taxonomy happens in the CLI, not at parse time, so an upstream rename does not brick rule loading.
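As an illustration of the per-threat audit table described above, here is a minimal sketch. It is not the actual promptintel coverage implementation: `RuleInfo`, `coverage_table`, and the way unknown names are reported are all hypothetical, chosen only to show how threat names can be checked after loading rather than at parse time.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down view of a rule: only the two fields needed here.
struct RuleInfo {
    id: String,
    promptintel_threats: Vec<String>,
}

/// Builds (threat name -> ids of covering rules, names missing from the taxonomy).
/// `taxonomy` stands in for the canonical upstream list, which is assumed to be
/// supplied by the caller.
fn coverage_table(
    taxonomy: &[&str],
    rules: &[RuleInfo],
) -> (BTreeMap<String, Vec<String>>, Vec<String>) {
    // Start every canonical threat with an empty row so coverage gaps stay visible.
    let mut table: BTreeMap<String, Vec<String>> = taxonomy
        .iter()
        .map(|t| (t.to_string(), Vec::new()))
        .collect();
    let mut unknown = Vec::new();

    for rule in rules {
        for threat in &rule.promptintel_threats {
            match table.get_mut(threat) {
                Some(ids) => ids.push(rule.id.clone()),
                // A renamed or misspelled threat is collected, not treated as fatal.
                None => unknown.push(threat.clone()),
            }
        }
    }
    (table, unknown)
}
```

Collecting unknown names instead of returning an error mirrors the stated design goal: a rule that references a stale threat name still loads, and the mismatch surfaces only in the audit output.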
requires_code_artifact: bool
When true, a regex match in the SKILL.md prose body that is NOT corroborated by an occurrence inside any markdown code block is downgraded from the rule’s natural action / signal-class to RequireApproval / ReviewSignal. Used for vocabulary-only rules (SKILL_PAYMENT_ACCESS, SKILL_TOKEN_SCAM, …) that legitimately fire on documentation or coaching skills which only DESCRIBE the pattern they detect. Cross-LLM triage on a 4000-skill VT-clean corpus confirmed that prose-only matches drive ~30-50 false positives (FPs) per affected rule.
Defaults to false: the flag is opt-in per rule, never global. The downgrade is applied AFTER the regex has matched; matches inside code blocks (or in any artifact whose MatchTarget is CodeBlock / ReferencedFile) keep full strength.
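The downgrade decision can be sketched as follows. This is a simplified illustration, not the crate's code: the real MatchTarget and RecommendedAction types are not shown on this page, so the enums below are local stand-ins, the `Prose` and `Block` variants are assumed names, and the signal-class half of the downgrade (ReviewSignal) is omitted.

```rust
/// Local stand-in for the crate's MatchTarget. `CodeBlock` and `ReferencedFile`
/// are named in the docs above; `Prose` is an assumed name for the prose body.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MatchTarget {
    Prose,
    CodeBlock,
    ReferencedFile,
}

/// Local stand-in for RecommendedAction. `Block` is a hypothetical "natural"
/// action; `RequireApproval` is the downgraded action named in the docs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RecommendedAction {
    Block,
    RequireApproval,
}

/// Decides the effective action once the regex has already matched.
/// `targets` lists where each match occurred and is assumed to be non-empty.
fn effective_action(
    natural: RecommendedAction,
    requires_code_artifact: bool,
    targets: &[MatchTarget],
) -> RecommendedAction {
    // Rules that did not opt in always keep their natural action.
    if !requires_code_artifact {
        return natural;
    }
    // Any match in a code block or referenced file corroborates the finding.
    let corroborated = targets
        .iter()
        .any(|t| matches!(t, MatchTarget::CodeBlock | MatchTarget::ReferencedFile));
    if corroborated {
        natural
    } else {
        RecommendedAction::RequireApproval
    }
}
```

Under these assumptions, a prose-only match on an opted-in rule yields RequireApproval, while the same rule matching in both prose and a code block keeps its natural action.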
downgrade_when_confirmation_gate: bool
When true, a finding is downgraded if the surrounding document contains explicit human-in-the-loop confirmation gate markers (e.g. confirmation_token, “user types YES”, “two-step gate”, “propose → user”). Used for autonomy / payment / deferred-execution rules whose risk model assumes no human gate. Cross-LLM triage on a 4000-skill VT-clean corpus showed that okx-trading-style skills with strict propose→confirm workflows trip these rules even though the gate is exactly the safety control the rule was designed to require.
Defaults to false. The marker list lives in compiled::CONFIRMATION_GATE_MARKERS and is intentionally matched case-insensitively, so authors don’t have to predict the exact phrasing.
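A minimal sketch of case-insensitive marker detection is shown below. The constant here is a hypothetical stand-in for compiled::CONFIRMATION_GATE_MARKERS, whose real contents are not listed on this page; its entries are simply the examples quoted above. Plain substring search is also an assumption, since the actual matching strategy is not documented here.

```rust
/// Hypothetical stand-in for `compiled::CONFIRMATION_GATE_MARKERS`.
/// Entries are stored in lowercase so that only the document needs folding.
const CONFIRMATION_GATE_MARKERS: &[&str] = &[
    "confirmation_token",
    "user types yes",
    "two-step gate",
    "propose → user",
];

/// Returns true if the document contains any gate marker, ignoring case.
fn has_confirmation_gate(document: &str) -> bool {
    // Fold the document once, then do plain substring checks per marker.
    let haystack = document.to_lowercase();
    CONFIRMATION_GATE_MARKERS
        .iter()
        .any(|marker| haystack.contains(marker))
}
```

With this sketch, "the User Types YES to confirm" and "requires a CONFIRMATION_TOKEN" both count as gate markers, whereas a document that never mentions a confirmation step does not.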
downgrade_when_documentation_context: bool
When true, a finding is downgraded if the document declares itself as an educational / detection / anti-pattern catalogue (e.g. ## What it checks, ## Anti-patterns, “this skill detects”, “examples of bad code”). Used for vocabulary rules whose patterns appear in security scanners that document the very behaviours they detect.
Defaults to false. The marker list lives in compiled::DOCUMENTATION_CONTEXT_MARKERS.