skill-veil-core 0.1.0

Overview

skill-veil is an open-source static analysis and policy tool for the agent extension supply chain.

It helps answer a narrow but useful operational question:

Should this skill, prompt pack, instruction file, MCP manifest, or related artifact be allowed, reviewed, or blocked before it lands in a repo or CI pipeline?

It is strongest as a static security and policy layer, not as a universal malware engine.

Key Features

Feature	Description
Agent Extension Coverage	First-class support for `SKILL.md`, `AGENTS.md`, `CLAUDE.md`, `SYSTEM.md`, prompt packs, and MCP manifests
Artifact Analysis	Inspects referenced scripts, manifests, lockfiles, Docker artifacts, and operational configs
Policy Engine	`log`, `require_approval`, `block` with profiles, waivers, baselines, and overrides
CI-Friendly Output	Text, JSON, SARIF, SHIELD, diff mode, compact CI summary, and PR gating support
External Rule Packs	Versioned `official` and `community` rule packs with fixtures and validation
Benchmarking	Labeled corpus, confidence calibration, threshold tuning, and release history dashboard

What It Detects

Behavior        Remote execution, install hooks, deferred execution, persistence
Supply Chain    Unpinned dependencies, missing lockfiles, remote MCP endpoints
Prompt Risk     Persistent instruction tampering, cognitive rootkits, prompt packs
Tooling Risk    Tool abuse, autonomy escalation, approval bypass patterns
Runtime Risk    Privileged containers, host mounts, process execution, secret access
Artifacts       package.json, requirements.txt, pyproject.toml, Cargo.toml,
                Dockerfile, docker-compose, lockfiles, Makefile, .npmrc, pip.conf

Installation

From Source

git clone https://github.com/seifreed/skill-veil.git
cd skill-veil
cargo install --path crates/skill-veil-cli

From a GitHub Release

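# Example: Linux x86_64 release archive
tar -xzf skill-veil-linux-x86_64.tar.gz
# install does not create missing parent directories, so create the target first
mkdir -p "$HOME/.local/bin"
install -m 0755 skill-veil "$HOME/.local/bin/skill-veil"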

Full installation notes: docs/installation.md
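
The release install above places the binary in $HOME/.local/bin, which only helps if that directory is on PATH. A minimal sketch of a check, assuming a POSIX shell; path_contains is a hypothetical helper name and is not part of skill-veil:

```shell
# path_contains DIR PATHSTRING
# Succeeds when DIR is one of the colon-separated entries of PATHSTRING.
# Hypothetical helper for illustration; skill-veil does not provide it.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Warn when the install directory used above is not on PATH.
if ! path_contains "$HOME/.local/bin" "$PATH"; then
  echo "note: add $HOME/.local/bin to PATH so the shell can find skill-veil" >&2
fi
```

Wrapping both strings in colons makes the match apply to whole entries only, so /opt does not match a PATH that merely contains /opt/tools.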

Quick Start

# Scan a strict entrypoint
skill-veil scan-file examples/malicious-skill/SKILL.md

# Scan a package with manifests and related artifacts
skill-veil scan-package examples/manifest-package --format text

# Scan agent-extension targets beyond SKILL.md
skill-veil scan-file examples/agent-instructions/AGENTS.md
skill-veil scan-package examples/prompt-pack
skill-veil scan-package examples/mcp-server

Usage

Command Line Interface

# Auto scan
skill-veil scan ./examples

# Strict explicit-entrypoint scan
skill-veil scan-file examples/safe-skill/SKILL.md

# Package scan
skill-veil scan-package . --format json --output current.json

# Dataset / marketplace / monorepo mode
skill-veil scan-dataset ./examples --preset ci --format text

Common Commands

Command	Description
`scan`	Auto-discover and scan files or directories
`scan-file`	Scan a strict explicit entrypoint
`scan-package`	Scan a package without promoting docs to entrypoints
`scan-dataset`	Scan many packages in a repo, dataset, or marketplace mirror
`benchmark`	Run the labeled benchmark corpus
`baseline create`	Create a baseline from a JSON report
`baseline update`	Update a baseline safely
`waivers validate`	Validate waiver configuration
`diff`	Compare two JSON reports with baseline/waiver awareness
`rules validate`	Validate external rule packs
`rules test`	Test one rule against inline content
`rules test-pack`	Run pack fixtures
`rules pack-info`	Summarize external rule packs
`policy validate`	Validate a policy file

Useful Options

Option	Description
`--format text/json/sarif/shield`	Output format
`--preset local/ci/strict/enterprise`	Apply output and policy presets
`--quiet-summary`	Compact text output
`--explain-policy`	Focus on policy reasoning instead of finding details
`--baseline`	Accepted findings baseline
`--waivers`	Waiver file
`--policy`	Policy file
`--ci-summary`	Compact diff summary for CI
`--fail-on <mode>`	CI diff failure mode (`new-active` or `new-blocking`)
`--dashboard-output`	Write benchmark history dashboard

Examples

Review a suspicious package

skill-veil scan-package examples/suspicious-skill --format text

Generate a report for CI

skill-veil scan-package . --preset ci --format json --output current.json
skill-veil scan-package . --preset ci --format sarif --output current.sarif

Baseline + diff workflow

skill-veil baseline create current.json --output .skill-veil/baseline.json
skill-veil diff prev.json current.json --baseline .skill-veil/baseline.json --ci-summary --fail-on new-active

Benchmark with history and dashboard

skill-veil benchmark benchmarks/corpus.yaml \
  --format json \
  --output benchmarks/history/latest.json \
  --history-file benchmarks/history/releases.json \
  --release-id local-dev \
  --dashboard-output benchmarks/history/dashboard.md

Rule pack development

skill-veil rules validate --rules-dir rules/official
skill-veil rules test-pack --rules-dir rules/official --fixtures rules/fixtures/behavioral.yaml
skill-veil rules pack-info --rules-dir rules/official
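
The three rule pack commands above can be chained so that a later step only runs when the earlier one succeeded. This is a sketch under assumptions: SKILL_VEIL and check_rule_pack are illustrative names, and each subcommand is assumed to exit with a nonzero status when it finds a problem.

```shell
# Binary to invoke; override SKILL_VEIL to point at a local build (illustrative variable).
SKILL_VEIL="${SKILL_VEIL:-skill-veil}"

# check_rule_pack RULES_DIR FIXTURES_FILE
# Hypothetical wrapper: validate the pack, run its fixtures, then print pack info.
# Assumes a nonzero exit status from any step means that step failed.
check_rule_pack() {
  rules_dir="$1"
  fixtures="$2"
  "$SKILL_VEIL" rules validate --rules-dir "$rules_dir" &&
    "$SKILL_VEIL" rules test-pack --rules-dir "$rules_dir" --fixtures "$fixtures" &&
    "$SKILL_VEIL" rules pack-info --rules-dir "$rules_dir"
}
```

A call matching the commands above would be: check_rule_pack rules/official rules/fixtures/behavioral.yaml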

Optional YARA support

cargo run -p skill-veil --features yara -- rules validate --rules-dir rules/official

YARA usage notes and an example rule live in:

External dataset validation

For marketplace mirrors or local corpora that are intentionally kept out of Git, see docs/dataset-validation.md.

Curated example packages

safe skill: examples/safe-skill/
suspicious skill: examples/suspicious-skill/
malicious skill: examples/malicious-skill/
manifest-heavy package: examples/manifest-package/
referenced script package: examples/referenced-script-package/
agent instructions: examples/agent-instructions/
prompt pack: examples/prompt-pack/
MCP manifest: examples/mcp-server/

Daily analyst triage

skill-veil scan-dataset ./mirror \
  --dataset-view verdicts \
  --analyst-summary \
  --preset local \
  --format text

That view is intentionally short and stable for daily review. Each package is summarized by:

package id
verdict
package health
blast radius
top rule
strongest scope/reason

Use Cases

1. Review a third-party skill before installing it

Use this when someone shares a SKILL.md, AGENTS.md, or similar entrypoint and you want a fast local decision.

skill-veil scan-file path/to/SKILL.md --format text

What you get:

findings grouped by severity and category
a final action: log, require_approval, or block
policy escalation reasons if the artifact implies extra blast radius
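
The three final actions listed above map naturally onto a review decision. The sketch below takes the action as a plain string, because how it is read from a report is not specified here. gate_for_action, its output wording, and its return codes are all assumptions made for illustration.

```shell
# gate_for_action ACTION
# Hypothetical helper; skill-veil does not ship it.
# Maps one of the three documented final actions to a gate decision.
gate_for_action() {
  case "$1" in
    log)              echo "pass" ;;
    require_approval) echo "hold for manual review" ;;
    block)            echo "fail" ;;
    *)                echo "unknown action: $1" >&2; return 2 ;;
  esac
}
```

Rejecting unknown strings explicitly keeps a typo or a future action name from being treated as a pass.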

2. Review a whole package, not only the root document

Use this when a skill repo also contains manifests, install hooks, scripts, or container files.

skill-veil scan-package /path/to/repo --format text

This is the most important mode for real reviews because it inspects:

the explicit entrypoint
referenced scripts
manifests and lockfiles
Docker and runtime artifacts

3. Scan agent instruction files and prompt packs

Use this when the risky part is not a classic skill but a persistent instruction surface.

skill-veil scan-file examples/agent-instructions/AGENTS.md
skill-veil scan-package examples/prompt-pack

This is useful for:

persistent prompt tampering
cognitive rootkits
approval bypass patterns
prompt-pack review before publishing or importing

4. Review an MCP manifest before enabling a server

Use this when you want to inspect an MCP server descriptor for remote connectivity, command execution, or tool-scope concerns.

skill-veil scan-package examples/mcp-server --format json

5. Add a CI gate to block only new active findings

Use this when you already have accepted debt and only want to stop regressions.

skill-veil scan-package . --preset ci --format json --output current.json
skill-veil diff prev.json current.json --baseline .skill-veil/baseline.json --ci-summary --fail-on new-active

This is the practical workflow for teams because it separates:

existing accepted findings
waived findings
new active findings
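
The two commands in this workflow can be wrapped in one function for a CI step. This is a minimal sketch, not a supported integration: SKILL_VEIL, PREV_REPORT, and ci_gate are illustrative names, and it assumes that a nonzero exit status from either command means the step failed.

```shell
# Illustrative variables; neither is read by skill-veil itself.
SKILL_VEIL="${SKILL_VEIL:-skill-veil}"
PREV_REPORT="${PREV_REPORT:-prev.json}"

# ci_gate: scan the current tree, then fail only on new active findings.
ci_gate() {
  # Produce the report for the current tree; stop if the scan itself fails.
  "$SKILL_VEIL" scan-package . --preset ci --format json --output current.json || return 1

  # Without an earlier report there is nothing to compare against.
  if [ ! -f "$PREV_REPORT" ]; then
    echo "no previous report at $PREV_REPORT; skipping diff" >&2
    return 0
  fi

  # The function's exit status is the status of the diff command.
  "$SKILL_VEIL" diff "$PREV_REPORT" current.json \
    --baseline .skill-veil/baseline.json \
    --ci-summary \
    --fail-on new-active
}
```

Skipping the diff on a first run is a design choice of this sketch; a stricter pipeline could treat a missing previous report as an error instead.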

6. Manage accepted risk with baseline and waivers

Use this when some findings are known and reviewed, but you still want the tool to stay strict about new ones.

skill-veil baseline create current.json --output .skill-veil/baseline.json
skill-veil waivers validate .skill-veil/waivers.yaml
skill-veil scan-package . --baseline .skill-veil/baseline.json --waivers .skill-veil/waivers.yaml
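
On a first run the baseline file does not exist yet. The sketch below creates it once and leaves an existing baseline alone; SKILL_VEIL and ensure_baseline are hypothetical names, and the report path is passed in by the caller.

```shell
# Binary to invoke (illustrative variable, not read by skill-veil).
SKILL_VEIL="${SKILL_VEIL:-skill-veil}"

# ensure_baseline REPORT_JSON
# Hypothetical helper: create .skill-veil/baseline.json from a JSON report
# only when no baseline exists yet.
ensure_baseline() {
  baseline=".skill-veil/baseline.json"
  # Keep an existing, already reviewed baseline untouched.
  if [ -f "$baseline" ]; then
    echo "baseline already present: $baseline"
    return 0
  fi
  mkdir -p .skill-veil
  "$SKILL_VEIL" baseline create "$1" --output "$baseline"
}
```

Changes to an existing baseline are left to the baseline update command listed under Common Commands.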

7. Scan a catalog, dataset, or marketplace mirror

Use this when you have many packages and want aggregate review instead of single-file analysis.

skill-veil scan-dataset ./examples --preset ci --format text

This is the right mode for:

internal marketplaces
downloaded skill corpora
large monorepos of agent extensions

8. Measure whether the scanner got better or worse

Use this when changing rules, scoring, or analyzers.

skill-veil benchmark benchmarks/corpus.yaml \
  --format json \
  --output benchmarks/history/latest.json \
  --history-file benchmarks/history/releases.json \
  --release-id local-dev \
  --dashboard-output benchmarks/history/dashboard.md

This tells you:

precision and recall
false positive rate
exact label accuracy
confidence calibration
threshold recommendations
release-to-release trend

Output Formats

Format	Use Case
`text`	Local review
`json`	Automation, baselines, diff, dashboards
`sarif`	GitHub Code Scanning
`shield`	Policy-oriented markdown

Benchmarking

The repository ships with a labeled benchmark corpus and release history.

Current benchmark reporting includes:

precision
recall
false positive rate
accuracy
exact label accuracy
TP / FP / TN / FN
corpus coverage by label and focus category
confidence calibration by evidence, category, and signal pair
threshold recommendations
markdown dashboard for release-to-release comparison

Methodology: docs/benchmark-methodology.md

Rule Packs

External versioned packs under rules/official/ are the default rule source. Embedded rules are only a fallback.

Rule pack docs:

Documentation

Contributing

Contributions are welcome.

Start here:

License

This project is licensed under the MIT License. See LICENSE.

Attribution:

Repository: github.com/seifreed/skill-veil