Two-tier heredoc and inline script detection.
This module implements a tiered detection architecture for heredoc and inline script analysis, balancing performance with detection accuracy.
§Architecture
   Command Input
         │
         ▼
┌─────────────────┐
│ Tier 1: Trigger │ ─── No match ──► ALLOW (fast path)
│    (<100μs)     │
└────────┬────────┘
         │ Match
         ▼
┌─────────────────┐
│ Tier 2: Extract │ ─── Error/Timeout ──► ALLOW + warn
│     (<1ms)      │
└────────┬────────┘
         │ Success
         ▼
┌─────────────────┐
│ Tier 3: AST     │ ─── No match ──► ALLOW
│     (<5ms)      │ ─── Match ──► BLOCK
└─────────────────┘
§Tier 1: Trigger Detection
Ultra-fast detection using RegexSet for parallel matching.
Zero allocations on non-match path. MUST have zero false negatives.
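As a rough illustration of the Tier 1 contract, the sketch below scans a command for trigger substrings without allocating. It is a std-only stand-in: the module itself is described as using RegexSet, and the `TRIGGERS` list, the `Trigger` enum, and `scan_triggers` are hypothetical names invented for this example, not the module's API.

```rust
/// Hypothetical trigger substrings; the real pattern set is not shown in this page.
const TRIGGERS: [&str; 4] = ["<<", "python -c", "bash -c", "node -e"];

/// Outcome of the sketch's scan (illustrative only, not the module's `TriggerResult`).
#[derive(Debug, PartialEq, Eq)]
pub enum Trigger {
    NoMatch,
    Matched,
}

/// Scans `command` for any trigger substring.
/// `str::contains` works on borrowed data, so neither path allocates.
pub fn scan_triggers(command: &str) -> Trigger {
    if TRIGGERS.iter().any(|t| command.contains(t)) {
        Trigger::Matched
    } else {
        Trigger::NoMatch
    }
}
```

Note that a deliberately broad trigger such as `<<` also fires on `echo $((1 << 2))`. In this design that is acceptable: a false positive only costs a Tier 2 pass, whereas a false negative would let a command skip analysis entirely.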
§Tier 2: Content Extraction
Extracts heredoc/inline script content with bounded memory and time. Graceful degradation on malformed input.
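The following is a minimal sketch of bounded extraction with graceful degradation, under simplifying assumptions: it only handles a heredoc opened on the first line of the command, and `Limits`, `Extraction`, and `extract_heredoc` are hypothetical names rather than the module's `ExtractionLimits`, `ExtractionResult`, or `extract_content`.

```rust
/// Hypothetical limits; field names are assumptions made for this sketch.
pub struct Limits {
    pub max_body_bytes: usize,
    pub max_lines: usize,
}

/// Illustrative result type; malformed input yields `Skipped`, never a panic.
#[derive(Debug, PartialEq, Eq)]
pub enum Extraction {
    NoHeredoc,
    Body(String),
    Skipped(&'static str),
}

/// Extracts the body of a heredoc opened on the first line of `command`.
pub fn extract_heredoc(command: &str, limits: &Limits) -> Extraction {
    let mut lines = command.lines();
    let first = match lines.next() {
        Some(line) => line,
        None => return Extraction::NoHeredoc,
    };
    let after = match first.find("<<") {
        Some(i) => &first[i + 2..],
        None => return Extraction::NoHeredoc,
    };
    // Accept `<<-`, then take the first word and drop surrounding quotes.
    let after = after.strip_prefix('-').unwrap_or(after).trim_start();
    let word = after.split_whitespace().next().unwrap_or("");
    let delim = word.trim_matches(|c: char| c == '\'' || c == '"');
    if delim.is_empty() {
        return Extraction::Skipped("missing delimiter");
    }
    let mut body = String::new();
    for (n, line) in lines.enumerate() {
        // Lenient terminator match: surrounding whitespace is ignored.
        if line.trim() == delim {
            return Extraction::Body(body);
        }
        // Stop before the body can grow past either limit.
        if n >= limits.max_lines || body.len() + line.len() + 1 > limits.max_body_bytes {
            return Extraction::Skipped("limit exceeded");
        }
        body.push_str(line);
        body.push('\n');
    }
    Extraction::Skipped("unterminated heredoc")
}
```

The sketch bounds memory only. A time bound, as the architecture's Error/Timeout path implies, would need a deadline check inside the loop, which is omitted here.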
§Tier 3: AST Pattern Matching (future)
Uses ast-grep-core for structural pattern matching. Language-specific patterns for destructive operations.
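Taken together, the decision flow in the Architecture diagram could be sketched as a single function that fails open at each tier. This is an illustration only: `Decision` and `evaluate` are hypothetical names, and the three tiers are passed in as closures so the example makes no claim about the real tier implementations.

```rust
/// Illustrative final verdict for a command.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    AllowWithWarning,
    Block,
}

/// Runs the three tiers in order, allowing the command whenever a tier
/// does not match or cannot complete.
pub fn evaluate<T, E, M>(command: &str, trigger: T, extract: E, matcher: M) -> Decision
where
    T: Fn(&str) -> bool,
    E: Fn(&str) -> Result<String, String>,
    M: Fn(&str) -> bool,
{
    // Tier 1: no trigger means the fast path, with no further work.
    if !trigger(command) {
        return Decision::Allow;
    }
    // Tier 2: an extraction error or timeout allows the command but warns.
    let content = match extract(command) {
        Ok(content) => content,
        Err(_) => return Decision::AllowWithWarning,
    };
    // Tier 3: only a positive pattern match blocks.
    if matcher(&content) {
        Decision::Block
    } else {
        Decision::Allow
    }
}
```

The only path to a block runs through a successful match in every tier, which is why Tier 1 must not miss: anything it fails to flag is allowed without further inspection.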
Structs§
- ExtractedContent - Extracted content from a heredoc or inline script.
- ExtractedShellCommand - Extracted shell command with position info for evaluator integration.
- ExtractionLimits - Limits for content extraction to prevent resource exhaustion.
Enums§
- DetectionConfidence - Confidence level of language detection.
- ExtractionResult - Result of Tier 2 content extraction.
- HeredocType - Type of heredoc extraction.
- ScriptLanguage - Detected language for embedded script content.
- SkipReason - Reason why extraction was skipped (for observability/logging).
- TriggerResult - Result of Tier 1 trigger detection.
Functions§
- check_binary_content - Check if content appears to be binary (contains null bytes or high non-printable ratio).
- check_triggers - Check if a command contains heredoc or inline script indicators.
- extract_content - Extract heredoc and inline script content from a command.
- extract_shell_commands - Extract executable shell commands from heredoc/script content.
- is_non_executing_heredoc_command - Check if a command executes its heredoc/stdin content as code.
- mask_non_executing_heredocs - Mask heredoc content when the target command doesn’t execute it.
- matched_triggers - Returns the list of trigger pattern indices that matched.
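The binary-content heuristic described for check_binary_content (null bytes or a high non-printable ratio) might look roughly like the sketch below. The function name `looks_binary`, its signature, the caller-supplied threshold, and the choice to treat bytes at or above 0x80 as printable (so UTF-8 text is not flagged) are all assumptions made for this example; the actual function's signature and threshold are not shown on this page.

```rust
/// Returns true if `content` contains a NUL byte or if the share of
/// non-printable bytes exceeds `max_non_printable_ratio`.
pub fn looks_binary(content: &[u8], max_non_printable_ratio: f64) -> bool {
    if content.is_empty() {
        return false;
    }
    // A single NUL byte is treated as conclusive.
    if content.contains(&0) {
        return true;
    }
    let non_printable = content
        .iter()
        .filter(|&&b| {
            let printable = b == b'\n'
                || b == b'\r'
                || b == b'\t'
                || (0x20..0x7f).contains(&b)
                || b >= 0x80; // assumption: possible UTF-8, counted as printable
            !printable
        })
        .count();
    (non_printable as f64) / (content.len() as f64) > max_non_printable_ratio
}
```

For example, with a hypothetical threshold of 0.3, `looks_binary(b"echo hello\n", 0.3)` is false, while input consisting mostly of control bytes is flagged.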