agent-shield 0.8.0

Security scanner for AI agent extensions — offline-first, multi-framework, SARIF output
Documentation
# AGENTS.md

This file provides guidance to Codex when working with this repository.

## Project Overview

**AgentShield** is a Rust-based, offline-first security scanner for AI agent extensions
(MCP servers, OpenClaw skills, LangChain tools). It produces SARIF output compatible
with GitHub Code Scanning.

## Repository Structure

```
agentshield/
├── src/
│   ├── lib.rs                    # Public API: scan(), render_report()
│   ├── error.rs                  # ShieldError (thiserror)
│   ├── bin/cli.rs                # Clap CLI: scan, list-rules, init
│   ├── ir/                       # Intermediate Representation (ScanTarget)
│   │   ├── mod.rs                # ScanTarget, Framework, SourceFile, ArgumentSource
│   │   ├── tool_surface.rs       # Tool definitions, permissions
│   │   ├── execution_surface.rs  # Commands, file IO, network IO
│   │   ├── data_surface.rs       # Sources, sinks, taint paths
│   │   ├── dependency_surface.rs # Dependencies, lockfiles
│   │   └── provenance_surface.rs # Author, repo, license
│   ├── adapter/                  # Framework → IR (3-phase pipeline)
│   │   ├── mod.rs                # Adapter trait, auto_detect_and_load(root, ignore_tests)
│   │   ├── mcp.rs                # MCP server adapter + is_test_file() + shared helpers
│   │   ├── openclaw.rs           # OpenClaw SKILL.md adapter
│   │   ├── crewai.rs             # CrewAI adapter (BaseTool, @tool)
│   │   └── langchain.rs          # LangChain adapter (@tool, BaseTool, langgraph)
│   ├── parser/                   # Language parsers
│   │   ├── mod.rs                # Parser trait, ParsedFile, FunctionDef, CallSite
│   │   ├── python.rs             # tree-sitter Python + regex patterns
│   │   ├── typescript.rs         # tree-sitter TypeScript/TSX + regex fallback
│   │   ├── shell.rs              # Regex-based shell parser
│   │   └── json_schema.rs        # JSON Schema → ToolSurface
│   ├── analysis/                 # Static analysis
│   │   ├── mod.rs                # Module exports
│   │   ├── capability.rs         # Capability escalation scoring
│   │   ├── cross_file.rs         # Cross-file sanitizer-aware validation (v0.2.2)
│   │   └── supply_chain.rs       # Typosquat detection
│   ├── rules/                    # Detection engine
│   │   ├── mod.rs                # RuleEngine, Detector trait
│   │   ├── finding.rs            # Finding, Severity, Evidence structs
│   │   ├── registry.rs           # Rule metadata registry
│   │   ├── policy.rs             # Policy evaluation (.agentshield.toml)
│   │   └── builtin/              # 12 built-in detectors (SHIELD-001..012)
│   ├── output/                   # Report formatters
│   │   ├── mod.rs                # OutputFormat enum, render()
│   │   ├── console.rs            # Plain text
│   │   ├── json.rs               # JSON
│   │   ├── sarif.rs              # SARIF 2.1.0
│   │   └── html.rs               # Self-contained HTML
│   └── config/                   # .agentshield.toml parsing (policy + scan sections)
├── tests/fixtures/               # Test fixtures (safe + vulnerable)
│   ├── mcp_servers/
│   │   ├── safe_calculator/      # Zero-finding baseline
│   │   ├── safe_filesystem/      # Cross-file validation test (v0.2.2)
│   │   ├── vuln_cmd_inject/      # SHIELD-001 true positive
│   │   ├── vuln_ssrf/            # SHIELD-003 true positive
│   │   └── vuln_cred_exfil/      # SHIELD-002 true positive
│   ├── crewai_project/           # CrewAI adapter test (v0.2.4)
│   └── langchain_project/       # LangChain adapter test (v0.2.4)
├── vscode/                       # VS Code extension (v0.1.0)
│   ├── package.json              # Extension manifest
│   ├── tsconfig.json             # TypeScript config
│   └── src/                      # Extension source (TypeScript)
│       ├── extension.ts          # Activate, commands, auto-scan
│       ├── scanner.ts            # Spawn binary, parse JSON
│       ├── diagnostics.ts        # Finding → vscode.Diagnostic
│       └── types.ts              # JSON interfaces (mirrors Rust)
├── .github/workflows/
│   ├── ci.yml                    # Test + clippy + fmt + smoke
│   └── release.yml               # 5-platform binary builds
└── action.yml                    # GitHub Action (composite)
```

## Common Commands

```bash
# Build
cargo build --release

# Test (95 tests)
cargo test

# Lint
cargo clippy -- -D warnings
cargo fmt --check

# Run CLI
cargo run -- scan tests/fixtures/mcp_servers/vuln_cmd_inject
cargo run -- scan . --ignore-tests --format html --output report.html
cargo run -- list-rules
```

## Architecture Principles

1. **Adapters produce IR, detectors consume IR.** Adding a new framework never changes any detector.
2. **All adapters run.** `auto_detect_and_load()` runs every matching adapter, not just the first.
3. **ArgumentSource is the taint abstraction.** Detectors check `is_tainted()` — no full dataflow needed.
4. **Policy is separate from detection.** Detectors always run; policy decides what to report and whether to fail.
5. **Cross-file analysis runs between parsing and detection.** Downgrades taint for functions that only receive sanitized input.

## Key Types

- `ScanTarget` — unified IR with 5 surfaces (tool, execution, data, dependency, provenance)
- `Finding` — detector output with severity, confidence, location, evidence, remediation
- `ArgumentSource``Literal` (safe), `Parameter` (tainted), `EnvVar`, `Interpolated`, `Unknown`, `Sanitized` (safe, v0.2.2)
- `Detector` trait — `metadata() -> RuleMetadata`, `run(&ScanTarget) -> Vec<Finding>`
- `PolicyVerdict` — pass/fail with threshold and highest severity
- `ScanConfig``[scan]` config section with `ignore_tests` bool
- `ParsedFile` — parser output with `commands`, `file_operations`, `network_operations`, `function_defs`, `call_sites`, `sanitized_vars`
- `FunctionDef` — extracted function definition with name, params, `is_exported`
- `CallSite` — function call with callee name, classified arguments, caller context

## Adapter Pipeline (3-phase, v0.2.2)

Adapters use a 3-phase pipeline:

```
Phase 1: Parse     — each source file → ParsedFile (with FunctionDef, CallSite, sanitized_vars)
Phase 2: Analyze   — apply_cross_file_sanitization() downgrades tainted params to Sanitized
Phase 3: Merge     — combine all ParsedFiles into ScanTarget surfaces
```

This eliminates false positives from internal helpers that receive already-validated input:

```typescript
// index.ts — handler validates input
const validPath = await validatePath(args.path);  // sanitizer detected
const content = await readFileContent(validPath);  // CallSite with Sanitized arg

// operations.ts — helper uses validated input
export async function readFileContent(filePath: string) {
    return fs.readFile(filePath, 'utf-8');  // Parameter downgraded → no SHIELD-004
}
```

## Cross-File Analysis (`src/analysis/cross_file.rs`)

The `apply_cross_file_sanitization()` function:

1. **Phase 1:** Builds function def map (`name → file_index, params, is_exported`)
2. **Phase 2:** Builds call-site map (`callee → Vec<argument_sources>`)
3. **Phase 3:** For each function, checks if ALL call sites pass safe args (Literal or Sanitized) per parameter
4. **Phase 4:** If all-safe, downgrades matching `ArgumentSource::Parameter` to `Sanitized` in the callee's operations

**Conservative rules:**
- Exported functions with zero discovered call sites stay tainted
- If ANY call site passes a tainted argument, the parameter stays tainted
- Only one level deep (caller → callee, not recursive)

**Sanitizer registry** (`is_sanitizer()`): recognizes `validatePath`, `path.resolve`, `os.path.realpath`, `parseInt`, `URL.parse`, and pattern-based matches like `validate*Path`, `sanitize*`.

## Test File Exclusion (`--ignore-tests`)

The `--ignore-tests` flag skips test files at the file-walking stage (before parsing). Available via:
- **CLI:** `agentshield scan . --ignore-tests`
- **Config:** `[scan] ignore_tests = true` in `.agentshield.toml`
- **GitHub Action:** `ignore-tests: true` input
- **Library:** `ScanOptions { ignore_tests: true, .. }`

CLI flag overrides config (`options.ignore_tests || config.scan.ignore_tests`).

`is_test_file()` in `src/adapter/mcp.rs` matches:
- Directories: `test/`, `tests/`, `__tests__/`, `__pycache__/`
- Suffixes: `.test.{ts,js,tsx,jsx,py}`, `.spec.{ts,js,tsx,jsx}`
- Prefixes: `test_*.py` (pytest)
- Config: `conftest.py`, `jest.config.*`, `vitest.config.*`, `pytest.ini`, `setup.cfg`

## Adding a New Detector

1. Create `src/rules/builtin/your_detector.rs`
2. Implement `Detector` trait (`metadata()` + `run()`)
3. Register in `src/rules/builtin/mod.rs``all_detectors()`
4. Add tests in the same file
5. Add fixture in `tests/fixtures/` if applicable
6. Run `cargo test && cargo clippy -- -D warnings`

## Adding a New Adapter

1. Create `src/adapter/your_framework.rs`
2. Implement `Adapter` trait (`framework()`, `detect()`, `load()`)
3. Register in `src/adapter/mod.rs``all_adapters()`
4. `detect()` checks for framework-specific files
5. `load()` uses the 3-phase pipeline (parse → cross-file analysis → merge)
6. Reuse shared helpers from `mcp.rs`: `collect_source_files()`, `parse_dependencies()`, `parse_provenance()`

**Existing adapters:** MCP (`mcp.rs`), OpenClaw (`openclaw.rs`), CrewAI (`crewai.rs`), LangChain (`langchain.rs`)

## Conventions

- `thiserror` for error types, `?` operator everywhere
- No `unwrap()` in production paths
- tree-sitter for AST parsing, regex for pattern matching and fallback
- Tests use real fixture files under `tests/fixtures/`
- Conventional Commits for git messages
- Parsers extract `FunctionDef`, `CallSite`, and `sanitized_vars` for cross-file analysis
- `ArgumentSource::Sanitized` is the safe variant for cross-file validated params — `is_tainted()` returns `false`
- v0.2.3 release has 5-platform binaries: https://github.com/limaronaldo/agentshield/releases/tag/v0.2.3
- PR inline annotations verified via [agentshield-test PR #1]https://github.com/limaronaldo/agentshield-test/pull/1 (IBVI-488)

## Version History

| Version | Tests | Key Feature |
|---------|-------|-------------|
| 0.1.0 | 46 | 12 detectors, Python parser, MCP/OpenClaw adapters |
| 0.2.0 | 69 | TypeScript tree-sitter parser, Homebrew, GitHub Action |
| 0.2.1 | 69 | Async HTTP detection, GitPython, typosquat allowlist, Marketplace |
| 0.2.2 | 83 | Cross-file validation tracking (IBVI-482) |
| 0.2.3 | 83 | `--ignore-tests` flag, `[scan]` config section, 5-platform release, PR annotations verified |
| 0.2.4 | 95 | CrewAI + LangChain adapters (IBVI-486, -487) — 4 adapters total, shared helpers |