# openclaw-scan (ocls) — Implementation Plan
## Audit of Existing Repo (`victor.artificial/openclaw-security-scanner`)
**Language:** Go 1.21
**State:** ~6,690 lines, largely stub/mock implementations. Every actual scanner function returns `[]string{}`, `return false`, or a hard-coded dummy. The code went through 3 "epics" of refactoring but was never wired to real Claude Code paths.
### What to salvage (concepts, not code)
| Element | Source file | How we use it |
|---|---|---|
| `SecurityIssue` struct shape (ID, Severity, Title, Description, Recommendation, Fixable, Path, Category) | `backup_old_code/security/config.go` | Translate directly to Rust `Finding` |
| `CategoryResult` (status/score/issues) | `internal/scanner/scanner.go` | Translate to Rust `CategoryReport` |
| Scoring formula: critical=30pts, high=20pts, medium=10pts, low=5pts | `scanner.go` | Reference only; this plan keeps its own 25/12/5/2 weights (see Scoring) |
| Exit codes: 0=ok, 1=warnings, 2=critical | `backup_old_code/scanner/main.go` | Keep the 0/1/2 convention, but `2` now means scan error rather than critical findings (see CLI Interface) |
| CLI flags: `--format`, `--verbose`, `--check`, `--fix` | `main.go` | Adapt as `--json`, `--verbose`, `--category`, drop `--fix` |
| `isSecureFilePermissions()` using `mode & 0o077 == 0` | `config.go` | Same logic in Rust with `std::os::unix::fs::PermissionsExt` |
| `expandPath()` for `~/` expansion | `config.go` | Use `dirs::home_dir()` in Rust |
| Makefile cross-platform build targets | `Makefile` | Port to Makefile wrapping `cargo` |
| CI pipeline structure: lint→security→test→build | `.gitlab-ci.yml` | Strip Node/Docker stages; adapt for Rust/cargo |
| Behavioral test pattern (dangerous inputs → should fail) | `security_test.go` | Same pattern in Rust `#[test]` |
### What to discard entirely
- `backup_old_code/security/gateway.go` — zero-trust web gateway, wrong product
- All mock functions (`getOutdatedPackages()`, `getVulnerablePackages()`, `findWorldWritableFiles()` etc.)
- System-level checks: SSH config, firewall, `/etc/passwd`, `/etc/shadow` — irrelevant to Claude Code
- Incorrect config detection (looks for `openclaw.conf`, `/etc/openclaw/` — these don't exist)
- Node.js lint/audit CI stages (no JS in this project)
- Docker build/container scan CI stages
- `crypto/` and `audit/` packages (logging/crypto dependency analysis — not useful here)
- `--fix` auto-remediation (modifying user's Claude config files is dangerous for a security tool)
- Duplicate `SecurityIssue` struct definitions across 4 packages
### Critical gap in existing code
The repo **never correctly identified the actual scan target**. It checks generic OS security (SSH, firewall, system files) instead of `~/.claude/`. Our tool fixes this from the ground up.
---
## Context
OpenClaw is an agentic AI framework that supports multiple LLM backends — Anthropic Claude, OpenAI GPT-4/o-series, Mistral, xAI/Grok, and any model reachable via OpenRouter. Users configure it once and then run agents for weeks or months, gradually accumulating security debt: API keys pasted into conversations, overly broad tool permissions granted out of convenience, MCP/plugin integrations added without review, and sensitive data left behind in history, logs, and plan files — regardless of which model provider is in use.
The threat is **framework-level**, not model-level. A user who switches from Claude to GPT-4 via OpenRouter carries the same risk surface: their credentials, permissions, and stored data all live in the OpenClaw config directory.
**Goal:** A lightweight, blazing-fast Rust CLI (`ocls`) that scans a user's OpenClaw installation and delivers an actionable security report — overall score, category breakdowns, per-finding remediation — without sending any data anywhere. Works for all supported model backends.
---
## Language & Stack: Rust
**Why Rust over Go (the existing repo's language):**
- The existing Go code is ~100% mocks — there is no real logic to port, so language switching costs nothing
- Memory-safe credential handling: no UAF/buffer overflow when parsing tokens and secrets from binary files
- Single static binary with no runtime: `curl | sh` install or Homebrew, no `go` toolchain required on user machines
- Proven CLI quality bar: ripgrep, fd, bat all use Rust — these are the tools we're benchmarking against
- `cargo test --all`, `cargo clippy -- -D warnings`, `cargo fmt --check` enforce quality gates automatically for OSS contributors
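- Stronger compile-time guarantees (ownership, exhaustive `match` on severity and category) in the code that handles credential data; the regex patterns themselves are still compiled at runtime, so each pattern needs its own unit test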
- The `regex` crate uses a linear-time finite automaton — critical for scanning large history files safely without catastrophic backtracking
**Crates (declared under `[dependencies]` in `Cargo.toml`; exact versions are pinned by the committed `Cargo.lock`):**
```toml
clap = { version = "4", features = ["derive"] }
anyhow = "1"
thiserror = "1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
regex = "1"
walkdir = "2"
owo-colors = "3"
indicatif = "0.17"
rayon = "1"
once_cell = "1"
dirs = "5"
chrono = { version = "0.4", features = ["serde"] }
toml = "0.8"
```
Dev deps: `tempfile = "3"`, `assert_cmd = "2"`, `predicates = "3"`
---
## Repository Structure
```
Openclaw_Security_Tooling_CLI/
├── Cargo.toml
├── Cargo.lock
├── README.md
├── CONTRIBUTING.md
├── .github/
│   └── workflows/
│       ├── ci.yml                  # test + clippy + fmt check on every PR
│       └── release.yml             # publish binary on tag push
├── src/
│   ├── main.rs                     # CLI entry, error formatting, exit codes
│   ├── cli.rs                      # clap Arg structs (derive API)
│   ├── paths.rs                    # Discover ~/.claude + ./.claude paths
│   ├── finding.rs                  # Finding, Severity, Category types
│   ├── report.rs                   # Aggregate findings → score + grade
│   ├── scanner/
│   │   ├── mod.rs                  # Scanner trait; run_all() orchestrator
│   │   ├── config.rs               # Configuration security
│   │   ├── secrets.rs              # Secret/credential detection
│   │   ├── permissions.rs          # File permission auditing
│   │   ├── network.rs              # Network/MCP endpoint security
│   │   ├── dependencies.rs         # Plugin/MCP server supply chain
│   │   ├── hooks.rs                # Hook injection analysis
│   │   └── history.rs              # History & data exposure
│   └── output/
│       ├── mod.rs                  # Dispatch to terminal vs json
│       ├── terminal.rs             # Rich colored table output
│       └── json.rs                 # Serializable report structs
└── tests/
    ├── integration/
    │   ├── config_test.rs
    │   ├── secrets_test.rs
    │   ├── permissions_test.rs
    │   ├── network_test.rs
    │   ├── hooks_test.rs
    │   └── history_test.rs
    └── fixtures/
        ├── settings_overly_permissive.json
        ├── settings_secure.json
        ├── credentials_sample.json # fake tokens only
        ├── history_with_secrets.jsonl
        ├── history_clean.jsonl
        └── CLAUDE_with_secrets.md
```
---
## Core Types (`src/finding.rs`)
```rust
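use std::path::PathBuf;

use serde::{Deserialize, Serialize};

// Variants are declared most-severe first, so the derived `Ord` yields
// `Critical < High < Medium < Low < Info`. "At least High" is therefore
// written `severity <= Severity::High`.
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity { Critical, High, Medium, Low, Info }

// `Eq` + `Hash` let findings be grouped per category for the sub-scores.
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Hash)]
pub enum Category {
    ConfigSecurity,
    SecretDetection,
    FilePermissions,
    NetworkSecurity,
    DependencySecurity,
    HookSecurity,
    DataExposure,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Finding {
    pub severity: Severity,
    pub category: Category,
    pub title: String,
    pub description: String,
    pub path: PathBuf,
    pub line: Option<usize>,      // 1-based line number, when known
    pub evidence: Option<String>, // redacted snippet (e.g. "ghp_ab****")
    pub remediation: String,
}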
```
---
## Path Discovery (`src/paths.rs`)
The tool is **framework-agnostic**: no single install path is hardcoded. The root is resolved in this priority order:
1. Explicit `[PATH]` argument on the command line (highest priority)
2. `OPENCLAW_HOME` environment variable
3. Auto-detection, probing common locations in order:
   - `~/.claude/` (Claude Code)
   - `~/.openclaw/`
   - `~/.config/openclaw/`
   - `./` (current directory, if it contains a known marker file)
4. If nothing is found, exit with a clear error: `"No agentic framework installation found. Pass the install directory as PATH."`
```rust
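use std::path::PathBuf;

pub struct InstallRoot {
    pub path: PathBuf,
    pub framework: FrameworkHint, // which layout was detected, if any
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameworkHint { ClaudeCode, Openclaw, Unknown }

/// Resolves the install root using the priority order listed above.
pub fn resolve(explicit: Option<PathBuf>) -> anyhow::Result<InstallRoot> {
    todo!("explicit path, then OPENCLAW_HOME, then auto-detection")
}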
```
Scanners operate on `InstallRoot.path`: they look for well-known sub-paths (`settings.json`, `history.jsonl`, `.credentials.json`, etc.) relative to the root. If a sub-path doesn't exist, the scanner skips it and reports no finding and no error for it.
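The priority order can be sketched as a pure function. This is only an illustration: `pick_root` and its parameters are hypothetical names, the home directory and the existence check are passed in so the logic can be tested without touching the filesystem, and the current-directory marker-file probe is left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the first install root found, following the
/// explicit-path, environment, auto-detect order.
pub fn pick_root(
    explicit: Option<PathBuf>,
    env_home: Option<PathBuf>, // value of OPENCLAW_HOME, if set
    home_dir: &Path,           // the user's home directory
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // 1. An explicit path always wins.
    if let Some(path) = explicit {
        return Some(path);
    }
    // 2. Then the environment variable.
    if let Some(path) = env_home {
        return Some(path);
    }
    // 3. Finally probe the well-known locations in order.
    [".claude", ".openclaw", ".config/openclaw"]
        .iter()
        .map(|rel| home_dir.join(rel))
        .find(|candidate| exists(candidate.as_path()))
}
```

Returning `None` leaves it to the caller to produce the "no installation found" error, which keeps this function free of I/O and error formatting.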
---
## Scanner Trait (`src/scanner/mod.rs`)
```rust
use std::path::PathBuf;

use rayon::prelude::*;

use crate::finding::Finding;
use crate::paths::FrameworkHint;

pub trait Scanner: Send + Sync {
    fn name(&self) -> &'static str;
    fn scan(&self, ctx: &ScanContext) -> anyhow::Result<Vec<Finding>>;
}

pub struct ScanContext {
    pub root: PathBuf,             // resolved install root (any framework)
    pub framework: FrameworkHint,  // informational only
    pub extra_paths: Vec<PathBuf>, // additional dirs to scan (repeatable on the CLI)
}

pub fn run_all(ctx: &ScanContext) -> Vec<Finding> {
    let scanners: Vec<Box<dyn Scanner>> = vec![
        Box::new(ConfigScanner),
        Box::new(SecretsScanner),
        Box::new(PermissionsScanner),
        Box::new(NetworkScanner),
        Box::new(DependencyScanner),
        Box::new(HookScanner),
        Box::new(HistoryScanner),
    ];
    scanners
        .par_iter() // rayon: one task per scanner
        .flat_map_iter(|s| match s.scan(ctx) {
            Ok(findings) => findings,
            Err(err) => {
                // Report the failure instead of silently dropping it.
                eprintln!("warning: {} scanner failed: {err:#}", s.name());
                Vec::new()
            }
        })
        .collect()
}
```
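Since exit code `2` is reserved for scan errors, the orchestrator has to keep scanner failures separate from findings. The sketch below shows one way to do that under simplifying assumptions: `Finding` is reduced to a `String`, scanners are plain function pointers, and `std::thread::scope` stands in for rayon so the example has no dependencies. `run_collecting` is a hypothetical name.

```rust
use std::thread;

/// Simplified stand-ins for the plan's types (assumptions for this sketch).
pub type Finding = String;
pub type ScanResult = Result<Vec<Finding>, String>;

/// Runs every scanner on its own scoped thread and returns findings and
/// failures separately, so the caller can map any failure to exit code 2.
pub fn run_collecting(
    scanners: &[(&'static str, fn() -> ScanResult)],
) -> (Vec<Finding>, Vec<String>) {
    let results: Vec<(&'static str, ScanResult)> = thread::scope(|scope| {
        // Spawn all scanners first, then join them in the original order.
        let handles: Vec<_> = scanners
            .iter()
            .map(|&(name, scan)| (name, scope.spawn(scan)))
            .collect();
        handles
            .into_iter()
            .map(|(name, handle)| {
                // A panicking scanner is treated like a failed one.
                let res = handle
                    .join()
                    .unwrap_or_else(|_| Err("scanner panicked".to_string()));
                (name, res)
            })
            .collect()
    });

    let mut findings = Vec::new();
    let mut errors = Vec::new();
    for (name, res) in results {
        match res {
            Ok(mut found) => findings.append(&mut found),
            Err(msg) => errors.push(format!("{name}: {msg}")),
        }
    }
    (findings, errors)
}
```

A non-empty error list does not discard the findings that were collected, so a report can still be printed before exiting with `2`.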
---
## Scanner Implementations
### 1. `config.rs` — Configuration Security
Rules:
- Parse `settings.json` and `settings.local.json`
- CRITICAL if `allow: ["Bash(*)"]` or `allow: ["Bash(rm -rf*)"]` found
- HIGH if `allow` contains any `Bash(...)` with shell metacharacters (`*`, `|`, `;`, `$`)
- HIGH if no `deny` rules present at all
- HIGH if `dangerouslySkipPermissions: true` in any config
- MEDIUM if MCP server has `alwaysAllow: true`
- LOW if `model` field uses deprecated model ID
- Scan CLAUDE.md files for hardcoded secrets (delegates to SecretsScanner patterns)
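The two `Bash(...)` rules can be sketched as a single classifier over one `allow` entry. This is a minimal illustration, not the final rule engine; `classify_allow_rule` is a hypothetical name and the local `Severity` enum only carries the two levels these rules use.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Severity {
    Critical,
    High,
}

/// Classifies a single entry of the `allow` list. Returns `None` for
/// entries that neither Bash rule flags.
pub fn classify_allow_rule(rule: &str) -> Option<Severity> {
    // Only `Bash(...)` entries are covered by these rules.
    let inner = rule.strip_prefix("Bash(")?.strip_suffix(')')?.trim();

    // CRITICAL: unrestricted shell access or a recursive-delete prefix.
    if inner == "*" || inner.starts_with("rm -rf") {
        return Some(Severity::Critical);
    }
    // HIGH: any of the shell metacharacters listed in the rule.
    if inner.contains(|c: char| matches!(c, '*' | '|' | ';' | '$')) {
        return Some(Severity::High);
    }
    None
}
```

The CRITICAL check runs first because `Bash(*)` also contains a metacharacter and would otherwise be downgraded to HIGH.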
### 2. `secrets.rs` — Secret Detection
Compile all regexes with `once_cell::sync::Lazy`. Scan: `history.jsonl`, `debug/`, `plans/`, `todos/`, `file-history/`, `shell-snapshots/`, `CLAUDE.md` files, `.env*` files.
Key patterns (model-agnostic — covers all major AI providers + infrastructure):
```rust
// ── AI Provider Keys ────────────────────────────────────────────
// Anthropic / Claude
r"sk-ant-(?:api|oat|ort)\d+-[A-Za-z0-9_-]+"
// OpenAI
r"sk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}"
r"sk-proj-[A-Za-z0-9_-]{50,}"
// Mistral
r"[A-Za-z0-9]{32,40}" // Mistral keys (32-40 char random, no fixed prefix)
// xAI / Grok
r"xai-[A-Za-z0-9_-]{32,}"
// OpenRouter
r"sk-or-v1-[A-Za-z0-9_-]{64}"
// Google Gemini / Vertex
r"AIza[0-9A-Za-z\-_]{35}"
// Cohere
r"[A-Za-z0-9]{40}" // combined with entropy check
// Hugging Face
r"hf_[A-Za-z0-9]{34}"
// ── Cloud / Infrastructure ───────────────────────────────────────
// AWS
r"(?:A3T[A-Z0-9]|AKIA|ASIA|ABIA|ACCA)[A-Z0-9]{16}"
// GitHub
r"ghp_[0-9a-zA-Z]{36}"
r"gho_[0-9a-zA-Z]{36}"
r"github_pat_[0-9a-zA-Z_]{82}"
// GitLab
r"glpat-[0-9a-zA-Z\-_]{20}"
// ── Credentials & Keys ──────────────────────────────────────────
// Private keys
r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY"
// JWT tokens
r"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"
// DB connection strings
r"(?i)(?:postgres|mysql|mongodb|redis)://[^:]+:[^@\s]+@"
// Generic high-entropy secret (≥32 alphanum chars after key=, token=, etc.)
r#"(?i)(?:api[_-]?key|secret|token|password|passwd|pwd)\s*[=:]\s*['"]?([A-Za-z0-9/+]{32,})['"]?"#
```
Evidence is always **redacted** before display: show first 6 chars + `****`.
Severity: CRITICAL for private keys and Anthropic tokens; HIGH for all others.
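A minimal sketch of the redaction rule follows. The function name `redact` is hypothetical, and fully masking matches of six characters or fewer is an assumption added here so that a short value is never echoed whole.

```rust
/// Redacts a matched secret for display: the first six characters followed
/// by `****`. Matches of six characters or fewer are fully masked
/// (an assumption of this sketch).
pub fn redact(secret: &str) -> String {
    const VISIBLE: usize = 6;
    if secret.chars().count() <= VISIBLE {
        return "****".to_string();
    }
    // Iterate by `char` so multi-byte input is never split mid-character.
    let prefix: String = secret.chars().take(VISIBLE).collect();
    format!("{prefix}****")
}
```

Redacting at the point where a `Finding` is built, rather than at print time, keeps the raw value out of both the terminal and the JSON output.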
### 3. `permissions.rs` — File Permissions
Use `std::os::unix::fs::PermissionsExt`:
- CRITICAL if `.credentials.json` has world-readable bits (mode & 0o004 != 0)
- HIGH if `.credentials.json` is group-readable (mode & 0o040 != 0)
- HIGH if `backups/` directory is world-readable
- HIGH if `history.jsonl` is world-readable
- MEDIUM if `settings.json` is world-writable
- LOW if the install root directory itself (e.g. `~/.claude/`) is group-executable, since group members can then traverse it
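The two `.credentials.json` rules can be sketched as follows (Unix only). The function names are hypothetical, and the local `Severity` enum carries just the two levels involved. Splitting the bit test from the filesystem read keeps the rule testable with plain integers.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
pub enum Severity {
    Critical,
    High,
}

/// Applies the two `.credentials.json` rules to a raw Unix mode.
/// World-readable is checked first because it is the more severe case.
pub fn classify_credentials_mode(mode: u32) -> Option<Severity> {
    if mode & 0o004 != 0 {
        Some(Severity::Critical)
    } else if mode & 0o040 != 0 {
        Some(Severity::High)
    } else {
        None
    }
}

/// Reads the mode from disk. A missing file yields `Ok(None)` so the
/// scanner can skip it instead of failing.
pub fn check_credentials_file(path: &Path) -> io::Result<Option<Severity>> {
    match fs::metadata(path) {
        Ok(meta) => Ok(classify_credentials_mode(meta.permissions().mode())),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```

`mode()` also includes the file-type bits (for example `0o100644` for a regular file); the masks above ignore them.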
### 4. `network.rs` — Network Security
Parse MCP server entries from `settings.json` and from any project-level `.claude/` configs:
- HIGH if any MCP server URL uses `http://` (not localhost)
- MEDIUM if MCP server URL uses `http://localhost` with token in env var
- MEDIUM if OAuth discovery endpoint doesn't match expected provider (Figma, Linear, Sentry)
- LOW if any MCP server URL points to IP address instead of hostname
- INFO: list all external MCP endpoints for manual review
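The `http://` and IP-address rules both depend on extracting the host from an endpoint URL. The sketch below uses a deliberately small hand-written parser to stay dependency-free; a real scanner would more likely use a URL-parsing crate. All function names are hypothetical.

```rust
use std::net::IpAddr;

/// Extracts the host from a URL such as `http://example.com:8080/mcp`.
fn host_of(url: &str) -> Option<&str> {
    let (_, rest) = url.split_once("://")?;
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Drop any `user:pass@` prefix.
    let host_port = authority.rsplit('@').next()?;
    if let Some(bracketed) = host_port.strip_prefix('[') {
        // IPv6 literal, e.g. `[::1]:8080`.
        return bracketed.split(']').next();
    }
    host_port.split(':').next()
}

fn is_loopback(host: &str) -> bool {
    host.eq_ignore_ascii_case("localhost")
        || host.parse::<IpAddr>().map(|ip| ip.is_loopback()).unwrap_or(false)
}

/// HIGH rule: plain `http://` to anything other than the local machine.
pub fn is_insecure_remote(url: &str) -> bool {
    let is_http = url.to_ascii_lowercase().starts_with("http://");
    is_http && host_of(url).map(|h| !is_loopback(h)).unwrap_or(false)
}

/// LOW rule: the endpoint is addressed by IP literal rather than hostname.
pub fn uses_ip_literal(url: &str) -> bool {
    host_of(url).map(|h| h.parse::<IpAddr>().is_ok()).unwrap_or(false)
}
```

Comparing the parsed host for equality, rather than checking whether the URL starts with `http://localhost`, avoids treating `http://localhost.example.com` as local.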
### 5. `dependencies.rs` — Dependency Security
Parse `installed_plugins.json` and `blocklist.json`:
- HIGH if a currently installed plugin appears in blocklist
- HIGH if plugin hash doesn't match expected (tamper detection via stored SHA)
- MEDIUM if plugin hasn't been updated in >90 days
- LOW if plugin from unofficial marketplace source
- INFO: List all installed plugins with install dates
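The blocklist and staleness rules reduce to a set lookup and a duration comparison. In the sketch below the `Plugin` struct and its fields are assumptions, since the schema of `installed_plugins.json` is not specified here; `now` is passed in so the 90-day rule can be tested deterministically.

```rust
use std::collections::HashSet;
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory view of one installed plugin.
pub struct Plugin {
    pub name: String,
    pub updated_at: SystemTime,
}

const STALE_AFTER: Duration = Duration::from_secs(90 * 24 * 60 * 60);

/// HIGH rule: the plugin name appears in the blocklist.
pub fn is_blocklisted(plugin: &Plugin, blocklist: &HashSet<String>) -> bool {
    blocklist.contains(&plugin.name)
}

/// MEDIUM rule: last update is more than 90 days before `now`.
/// A timestamp in the future is treated as "not stale".
pub fn is_stale(plugin: &Plugin, now: SystemTime) -> bool {
    now.duration_since(plugin.updated_at)
        .map(|age| age > STALE_AFTER)
        .unwrap_or(false)
}
```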
### 6. `hooks.rs` — Hook Security
Scan hook configurations in `settings.json` and project settings:
- CRITICAL if hook uses `--dangerously-skip-permissions`
- HIGH if hook script executes arbitrary shell expansion (`$VAR`, backtick, `$(...)`)
- HIGH if hook sends data to external URL (curl/wget to non-localhost)
- MEDIUM if hook lacks input sanitization (accepts raw tool output without quoting)
- LOW if hook script is world-writable
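The shell-expansion and external-URL rules are heuristics over the hook command string. The sketch below shows one possible shape for them; the function names are hypothetical, and whitespace tokenisation is a simplification that a real implementation would likely replace with proper shell-word parsing.

```rust
/// HIGH rule: `$VAR`, `$(...)`, or backticks appear in the hook command.
pub fn has_shell_expansion(cmd: &str) -> bool {
    cmd.contains('$') || cmd.contains('`')
}

/// HIGH rule: `curl` or `wget` is invoked with a non-local URL.
pub fn sends_data_externally(cmd: &str) -> bool {
    let words: Vec<&str> = cmd.split_whitespace().collect();
    let uses_fetch_tool = words.iter().any(|w| *w == "curl" || *w == "wget");
    let has_remote_url = words.iter().any(|w| {
        // Strip surrounding quotes before looking for a URL scheme.
        let w = w.trim_matches(|c: char| c == '"' || c == '\'');
        let rest = w
            .strip_prefix("http://")
            .or_else(|| w.strip_prefix("https://"));
        match rest {
            Some(r) => {
                let host = r.split(|c: char| c == '/' || c == ':').next().unwrap_or("");
                host != "localhost" && host != "127.0.0.1"
            }
            None => false,
        }
    });
    uses_fetch_tool && has_remote_url
}
```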
### 7. `history.rs` — Data Exposure
- Run SecretsScanner patterns against `history.jsonl` specifically
- HIGH if `history.jsonl` > 10MB (excessive data retention)
- MEDIUM if `debug/` directory > 50MB
- MEDIUM if more than 5 backup files older than 30 days are still present
- LOW if `telemetry/` contains device UUIDs that haven't been rotated
- INFO: report total data retention volume
---
## Scoring (`src/report.rs`)
```rust
fn score(findings: &[Finding]) -> u32 {
let penalty: u32 = findings.iter().map(|f| match f.severity {
Severity::Critical => 25,
Severity::High => 12,
Severity::Medium => 5,
Severity::Low => 2,
Severity::Info => 0,
}).sum();
100u32.saturating_sub(penalty)
}
fn grade(score: u32) -> char {
match score {
90..=100 => 'A',
75..=89 => 'B',
60..=74 => 'C',
40..=59 => 'D',
_ => 'F',
}
}
```
Category sub-scores: same formula applied per-category independently.
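To make "applied per-category independently" concrete, the sketch below computes both the overall score and the per-category scores from the same findings. It is self-contained, so `Category` is cut down to four variants and a finding is reduced to a `(Category, Severity)` pair; both simplifications are assumptions of the example.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity { Critical, High, Medium, Low, Info }

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Category { Config, Secrets, Permissions, Network }

fn penalty(sev: Severity) -> u32 {
    match sev {
        Severity::Critical => 25,
        Severity::High => 12,
        Severity::Medium => 5,
        Severity::Low => 2,
        Severity::Info => 0,
    }
}

/// Overall score: all findings draw from one shared 100-point budget.
pub fn score(findings: &[(Category, Severity)]) -> u32 {
    let total: u32 = findings.iter().map(|&(_, sev)| penalty(sev)).sum();
    100u32.saturating_sub(total)
}

/// Per-category scores: each category gets its own 100-point budget.
/// Categories without findings are absent from the map (implicitly 100).
pub fn category_scores(findings: &[(Category, Severity)]) -> BTreeMap<Category, u32> {
    let mut penalties: BTreeMap<Category, u32> = BTreeMap::new();
    for &(cat, sev) in findings {
        *penalties.entry(cat).or_insert(0) += penalty(sev);
    }
    penalties
        .into_iter()
        .map(|(cat, p)| (cat, 100u32.saturating_sub(p)))
        .collect()
}
```

For one critical, two high, and one medium finding the overall penalty is 25 + 12 + 12 + 5 = 54, giving a score of 46 (grade D), while a category holding only the critical finding scores 75 on its own.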
---
## CLI Interface (`src/cli.rs`)
```
ocls [OPTIONS] [PATH]...

Arguments:
  [PATH]...  Path to agentic framework installation to scan.
             Auto-detects ~/.claude/, ~/.openclaw/, or $OPENCLAW_HOME if omitted.
             Can be repeated: ocls ~/.openclaw ~/projects/my-agent/.claude

Options:
  -j, --json                 Output machine-readable JSON
  -q, --quiet                Suppress banner and summary; findings only
  -v, --verbose              Show remediation and evidence per finding
      --no-color             Disable ANSI colors
      --category <CAT>       Scan only: config|secrets|permissions|network|deps|hooks|history
      --min-severity <SEV>   Minimum severity to show [default: low]
      --ignore-path <GLOB>   Exclude path from scan (repeatable)
      --config <FILE>        Suppression/config file (.ocls.toml)
  -h, --help
  -V, --version
```
Exit codes: `0` = no findings at or above `--min-severity`, `1` = at least one such finding, `2` = scan error.
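The exit-code mapping can be sketched as below. Two things are assumptions of the example: the threshold is the `--min-severity` value, and a scan error takes precedence over findings. `exit_code` is a hypothetical name.

```rust
// Declared most-severe first, so the derived `Ord` gives
// `Critical < High < Medium < Low < Info`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity { Critical, High, Medium, Low, Info }

/// Maps a finished scan to the process exit code.
/// `scan_failed` stands for "at least one scanner returned an error".
pub fn exit_code(findings: &[Severity], min_severity: Severity, scan_failed: bool) -> i32 {
    if scan_failed {
        return 2;
    }
    // With this declaration order, "at least as severe as the threshold"
    // is `<=`, not `>=`.
    let any_reportable = findings.iter().any(|&sev| sev <= min_severity);
    if any_reportable { 1 } else { 0 }
}
```

With the default threshold of `low`, an INFO-only scan exits `0`, which keeps informational listings from failing a CI job.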
---
## Terminal Output Design
```
╔════════════════════════════════════════════════════╗
║ openclaw-scan v0.1.0 • Claude Code Security ║
╚════════════════════════════════════════════════════╝
Scanning ~/.openclaw [auto-detected · 847 files]
FINDINGS ──────────────────────────────────────────────
● CRITICAL [Secrets] Private key in history.jsonl:234
● HIGH [Config] No deny rules in settings.json
● HIGH [Permissions] .credentials.json group-readable (640)
● MEDIUM [Network] Token sent to http://localhost MCP endpoint
● LOW [Dependencies] Plugin from unofficial marketplace source
SUMMARY ───────────────────────────────────────────────
Score 44 / 100 Grade: D
Configuration █████████░ 88 1 high
Secrets ████████░░ 75 1 critical
Permissions █████████░ 88 1 high
Network ██████████ 95 1 medium
Dependencies ██████████ 98 1 low
Hooks ██████████ 100 -
Data Exposure ██████████ 100 -
5 findings (1 critical · 2 high · 1 medium · 1 low)
Run `ocls -v` for remediation steps.
Run `ocls --json` for machine-readable output.
```
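The ten-cell score bars in the summary can be produced by a small helper. This sketch assumes rounding to the nearest cell; `bar` is a hypothetical name.

```rust
/// Renders a score in 0..=100 as a ten-cell bar, rounding to the nearest cell.
/// Scores above 100 are clamped.
pub fn bar(score: u32) -> String {
    const CELLS: usize = 10;
    let filled = (((score.min(100) + 5) / 10) as usize).min(CELLS);
    // Both glyphs are three bytes long in UTF-8.
    let mut out = String::with_capacity(CELLS * 3);
    out.push_str(&"█".repeat(filled));
    out.push_str(&"░".repeat(CELLS - filled));
    out
}
```

With nearest-cell rounding, any score from 95 upward renders as a full bar, so the numeric score printed next to the bar remains the authoritative value.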
---
## Testing Strategy
- **Unit tests**: In each `scanner/*.rs` module — test each rule in isolation with fixture strings
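- **Integration tests**: `tests/integration/`, running the full scanner against fixture directories copied into a `tempfile` temp dir. Cargo only auto-discovers `tests/*.rs` and `tests/<name>/main.rs`, so this directory needs a `main.rs` that declares the test modules (or explicit `[[test]]` entries in `Cargo.toml`)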
- **Fixture files**: `tests/fixtures/` — contain realistic-but-fake credential patterns (fake tokens only)
- **Coverage target**: 100% line coverage via `cargo-llvm-cov`
- **CI enforcement**: `cargo test --all`, `cargo clippy -- -D warnings`, `cargo fmt --check`
- **No false positives rule**: every fixture tested both ways (should-detect AND should-not-detect)
---
## Build Sequence
1. `Cargo.toml` + dependencies
2. `src/finding.rs` + `src/report.rs` — core types, scoring, grading
3. `src/paths.rs` — framework-agnostic path resolution + auto-detection
4. `src/scanner/mod.rs` — `Scanner` trait + parallel orchestrator
5. Scanners in value order:
- `secrets.rs` — all AI provider + infra key patterns
- `permissions.rs` — file permission audit
- `config.rs` — settings/permissions analysis
- `network.rs` — MCP/endpoint security
- `hooks.rs` — hook injection analysis
- `dependencies.rs` — plugin supply chain
- `history.rs` — data retention + exposure
6. `src/output/terminal.rs` + `src/output/json.rs`
7. `src/cli.rs` + `src/main.rs`
8. `tests/fixtures/` + unit + integration tests for every scanner
9. `README.md`, `CONTRIBUTING.md`, `Makefile`, CI workflows (`.github/workflows/`)
---
## Verification
```bash
# Build and test
cargo build
cargo test --all
cargo clippy -- -D warnings
cargo fmt --check
# Coverage
cargo llvm-cov --all-features --workspace
# Manual smoke test
./target/debug/ocls ~/.claude
./target/debug/ocls --json | jq .
./target/debug/ocls --min-severity critical -v
# Run the test suite serially, e.g. when tests share on-disk fixtures
cargo test -- --test-threads=1
```