Module scan

Expand description

Repository scanning (dcg scan) for destructive commands.

This module is intentionally extractor-based (not naive substring grep). The core idea is to extract only executable contexts from files, then evaluate extracted commands using the shared evaluator pipeline.

§Extraction contract

Each extractor returns ExtractedCommand entries:

file, line, optional col
extractor_id identifying the execution context (e.g. shell.script)
command (the extracted executable command text)
optional metadata (structured context for debugging / future UX)

Extractors MUST be conservative: if unsure whether something is executed, prefer returning no extraction rather than producing false positives.

§Output schema (v1)

dcg scan --format json emits a ScanReport containing:

stable ordering of findings (deterministic output for CI / PR comments)
decision in {allow,warn,deny}
severity in {info,warning,error}
stable rule_id (pack_id:pattern_name) when available

Note: the shared evaluator currently only blocks deny-by-default pack rules. Scan output uses this evaluator behavior for parity.

Structs§

ExtractedCommand: Extracted executable command from a file.
HooksToml: Project-level scan config for repo integrations (pre-commit/CI).
HooksTomlScan
HooksTomlScanPaths
ScanDecisionCounts: Counts of findings by decision.
ScanEvalContext: Precomputed evaluator context for scanning.
ScanFinding: A scan finding produced by evaluating an extracted command.
ScanOptions: In-memory scan configuration (CLI + defaults).
ScanReport: Complete scan output (stable JSON schema).
ScanSeverityCounts: Counts of findings by severity.
ScanSummary: Summary statistics for a scan run.
SkippedEntry: One file or top-level path that was skipped during scanning, with the reason it was skipped. The reason field is a stable lower-snake-case label so JSON consumers can branch on it.

Enums§

ScanDecision: Scan decision for an extracted command.
ScanFailOn: Controls scan failure behavior (CI integration).
ScanFormat: Scan output format.
ScanRedactMode: Redaction mode for scan output.
ScanSeverity: Scan severity (used for --fail-on policy).
SkipReason: Why a file or path was skipped during scanning. Serialized as lower-snake-case strings for JSON stability.

Constants§

SCAN_HEREDOC_MIN_TIMEOUT_MS: Minimum heredoc-extraction timeout the scan path enforces (milliseconds).
SCAN_SCHEMA_VERSION

Functions§

build_report
evaluate_extracted_command
extract_azure_pipelines_from_str: Extract commands from Azure Pipelines YAML files.
extract_circleci_from_str: Extract commands from CircleCI config files.
extract_docker_compose_from_str: Extract executable commands from docker-compose files.
extract_dockerfile_from_str: Extract commands from Dockerfile RUN instructions
extract_github_actions_workflow_from_str: Extract commands from GitHub Actions workflow run steps
extract_gitlab_ci_from_str
extract_makefile_from_str: Extract commands from Makefile recipe lines
extract_package_json_from_str: Extract executable scripts from package.json.
extract_shell_script_from_str: Extract commands from shell scripts (.sh, .bash files)
extract_terraform_from_str: Extract commands from Terraform provisioner blocks.
parse_hooks_toml: Parse .dcg/hooks.toml and return (typed config, warnings).
redact_aggressively
redact_quoted_strings
scan_paths: Scan file paths (directories are expanded recursively).
scan_paths_with_progress: Scan file paths with optional progress reporting.
should_fail
sort_findings

Type Aliases§

ScanProgressCallback: Progress callback for scan operations.

Module scan

Module scan Copy item path

§Extraction contract

§Output schema (v1)

Structs§

Enums§

Constants§

Functions§

Type Aliases§

Module scan