Module heredoc

Two-tier heredoc and inline script detection.

This module implements a tiered detection architecture for heredoc and inline script analysis, balancing performance with detection accuracy.

§Architecture

Command Input
     │
     ▼
┌─────────────────┐
│ Tier 1: Trigger │ ─── No match ──► ALLOW (fast path)
│   (<100μs)      │
└────────┬────────┘
         │ Match
         ▼
┌─────────────────┐
│ Tier 2: Extract │ ─── Error/Timeout ──► ALLOW + warn
│   (<1ms)        │
└────────┬────────┘
         │ Success
         ▼
┌─────────────────┐
│ Tier 3: AST     │ ─── No match ──► ALLOW
│   (<5ms)        │ ─── Match ──► BLOCK
└─────────────────┘
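
The control flow in the diagram can be sketched roughly as follows. This is only an illustration under stated assumptions: `Decision`, `evaluate`, and the three `tier*` helpers are hypothetical names, not this module's API, and each tier is replaced by a trivial stand-in (substring checks instead of `RegexSet` and AST matching) so that the sketch compiles with the standard library alone.

```rust
/// Final decision for a command. Hypothetical type, not part of this module.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    AllowWithWarning(String),
    Block(String),
}

/// Tier 1 stand-in: cheap check for a heredoc operator.
fn tier1_triggers(command: &str) -> bool {
    command.contains("<<")
}

/// Tier 2 stand-in: treat everything after the first line as the body.
/// A missing body is reported as an extraction error.
fn tier2_extract(command: &str) -> Result<&str, String> {
    match command.split_once('\n') {
        Some((_, body)) if !body.is_empty() => Ok(body),
        _ => Err("heredoc has no body".to_string()),
    }
}

/// Tier 3 stand-in: a substring check in place of AST pattern matching.
fn tier3_matches(body: &str) -> bool {
    body.contains("rm -rf")
}

/// Runs the tiers in order, mirroring the arrows in the diagram.
pub fn evaluate(command: &str) -> Decision {
    // Tier 1: no trigger means the fast path allows the command.
    if !tier1_triggers(command) {
        return Decision::Allow;
    }
    // Tier 2: extraction failures degrade to "allow, but warn".
    let body = match tier2_extract(command) {
        Ok(body) => body,
        Err(reason) => return Decision::AllowWithWarning(reason),
    };
    // Tier 3: only a pattern match on the extracted content blocks.
    if tier3_matches(body) {
        Decision::Block("destructive pattern in embedded script".to_string())
    } else {
        Decision::Allow
    }
}
```

Note that only the last tier can produce a block; every earlier exit allows the command, which matches the fail-open behaviour shown in the diagram.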

§Tier 1: Trigger Detection

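Ultra-fast detection using RegexSet to test all trigger patterns in parallel. Performs zero allocations on the non-match path. MUST have zero false negatives: a missed trigger skips every later tier, whereas a false positive only costs a Tier 2 pass.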
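
A minimal sketch of the trigger idea, with one deliberate substitution: plain substring checks stand in for `RegexSet` so the example needs only the standard library. The `TRIGGERS` list and both function names are assumptions made for illustration; the module's real patterns are not shown here and are certainly more precise.

```rust
/// Hypothetical trigger substrings. Deliberately broad: over-matching only
/// costs a Tier 2 pass, while under-matching would be a false negative.
const TRIGGERS: &[&str] = &["<<", " -c", " -e"];

/// Returns true if any trigger occurs in the command.
/// Scans borrowed data only, so it performs no heap allocation.
pub fn has_trigger(command: &str) -> bool {
    TRIGGERS.iter().any(|t| command.contains(*t))
}

/// Returns the indices of all triggers that matched.
/// This allocates a Vec, so it is meant for diagnostics, not the fast path.
pub fn matched_trigger_indices(command: &str) -> Vec<usize> {
    TRIGGERS
        .iter()
        .enumerate()
        .filter(|&(_, t)| command.contains(*t))
        .map(|(i, _)| i)
        .collect()
}
```

Splitting the boolean check from the index lookup keeps the common non-match path allocation-free while still allowing detailed reporting when something does match.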

§Tier 2: Content Extraction

Extracts heredoc/inline script content with bounded memory and time. Graceful degradation on malformed input.
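
Bounded extraction with graceful degradation might look like the sketch below. `SketchLimits`, `SketchResult`, and `extract_heredoc_body` are hypothetical names chosen to avoid implying anything about the real `ExtractionLimits` or `ExtractionResult` types, whose fields and variants are not documented on this page. The parsing is simplified: it handles only the first `<<` operator and does not distinguish here-strings (`<<<`).

```rust
/// Hypothetical limits; the real ExtractionLimits fields may differ.
pub struct SketchLimits {
    pub max_body_bytes: usize,
    pub max_lines: usize,
}

/// Hypothetical outcome type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchResult {
    Extracted(String),
    NoContent,
    Skipped(&'static str),
}

/// Extracts the body of the first heredoc, giving up (instead of failing)
/// on malformed or oversized input.
pub fn extract_heredoc_body(command: &str, limits: &SketchLimits) -> SketchResult {
    let op = match command.find("<<") {
        Some(op) => op,
        None => return SketchResult::NoContent,
    };
    // Split the rest into the operator line and the candidate body.
    let (head, rest) = match command[op + 2..].split_once('\n') {
        Some(parts) => parts,
        None => return SketchResult::Skipped("heredoc has no body"),
    };
    // Delimiter: first token after `<<`, minus a leading `-` and any quotes.
    let delim = head
        .split_whitespace()
        .next()
        .unwrap_or("")
        .trim_start_matches('-')
        .trim_matches(|c: char| c == '\'' || c == '"');
    if delim.is_empty() {
        return SketchResult::Skipped("missing delimiter");
    }
    let mut body = String::new();
    for (n, line) in rest.lines().enumerate() {
        // Trimming tolerates the leading tabs allowed by `<<-` (a simplification).
        if line.trim() == delim {
            return SketchResult::Extracted(body);
        }
        // Enforce both bounds before growing the buffer.
        if n >= limits.max_lines || body.len() + line.len() + 1 > limits.max_body_bytes {
            return SketchResult::Skipped("limit exceeded");
        }
        body.push_str(line);
        body.push('\n');
    }
    SketchResult::Skipped("unterminated heredoc")
}
```

Every failure mode returns a `Skipped` value with a reason rather than panicking, which is one way to let a caller fall back to "allow and warn".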

§Tier 3: AST Pattern Matching (future)

Uses ast-grep-core for structural pattern matching. Language-specific patterns for destructive operations.

Structs§

ExtractedContent: Extracted content from a heredoc or inline script.
ExtractedShellCommand: Extracted shell command with position info for evaluator integration.
ExtractionLimits: Limits for content extraction to prevent resource exhaustion.

Enums§

DetectionConfidence: Confidence level of language detection.
ExtractionResult: Result of Tier 2 content extraction.
HeredocType: Type of heredoc extraction.
ScriptLanguage: Detected language for embedded script content.
SkipReason: Reason why extraction was skipped (for observability/logging).
TriggerResult: Result of Tier 1 trigger detection.

Functions§

check_binary_content: Check if content appears to be binary (contains null bytes or a high ratio of non-printable characters).
check_triggers: Check if a command contains heredoc or inline script indicators.
extract_content: Extract heredoc and inline script content from a command.
extract_shell_commands: Extract executable shell commands from heredoc/script content.
is_non_executing_heredoc_command: Check if a command treats its heredoc/stdin content as data rather than executing it as code.
mask_non_executing_heredocs: Mask heredoc content when the target command doesn’t execute it.
matched_triggers: Returns the list of trigger pattern indices that matched.
