Overview
skill-veil is an open source static analysis and policy tool for the agent extension supply chain.
It helps answer a narrow but useful operational question:
should this skill, prompt pack, instruction file, MCP manifest, or related artifact be allowed, reviewed, or blocked before it lands in a repo or CI pipeline?
It is strongest as a static security and policy layer, not as a universal malware engine.
Key Features
| Feature | Description |
|---|---|
| Agent Extension Coverage | First-class support for SKILL.md, AGENTS.md, CLAUDE.md, SYSTEM.md, prompt packs, and MCP manifests |
| Artifact Analysis | Inspects referenced scripts, manifests, lockfiles, Docker artifacts, and operational configs |
| Policy Engine | log, require_approval, block with profiles, waivers, baselines, and overrides |
| CI-Friendly Output | Text, JSON, SARIF, SHIELD, diff mode, compact CI summary, and PR gating support |
| External Rule Packs | Versioned official and community rule packs with fixtures and validation |
| Benchmarking | Labeled corpus, confidence calibration, threshold tuning, and release history dashboard |
| VirusTotal Integration | Bulk download, report caching, and cross-check between skill-veil verdicts and VT Code Insight |
| LLM Enrichment | Optional third scoring engine across Ollama, LM Studio, OpenAI, Anthropic, and Ollama Cloud |
| Inline Suppressions | # skill-veil:ignore, nosem, and nosemgrep markers with optional rule-id and reason |
| Unified Config | Single ~/.skill-veil.toml for VT and LLM providers; per-flag overrides on the CLI |
What It Detects
| Category | Examples |
|---|---|
| Behavior | Remote execution, install hooks, deferred execution, persistence |
| Supply Chain | Unpinned dependencies, missing lockfiles, remote MCP endpoints |
| Prompt Risk | Persistent instruction tampering, cognitive rootkits, prompt packs |
| Tooling Risk | Tool abuse, autonomy escalation, approval bypass patterns |
| Runtime Risk | Privileged containers, host mounts, process execution, secret access |
| Artifacts | package.json, requirements.txt, pyproject.toml, Cargo.toml, Dockerfile, docker-compose, lockfiles, Makefile, .npmrc, pip.conf |
Why a dedicated scanner for agent skills?
Generic malware scanners (VirusTotal, ClamAV, YARA-on-binaries) are designed for executables, archives, and URL/network reputation. Agent skills are markdown manifests where the malicious payload is prose: natural-language instructions that tell the agent to read credential files, persist across sessions, fetch remote "instructions" to execute, or bypass approval flows.
The skill-veil rule pack targets that surface:
| Threat class | skill-veil signals (examples) |
|---|---|
| Prompt injection (multilingual) | OFFICIAL_PROMPT_TAMPERING_OVERRIDE_*, XML interaction-config |
| Autonomy bypass | unbounded loops, "without confirmation" idioms (EN/PT/ES) |
| Persistence | cron / heartbeat / callback to remote URL |
| Credential exposure | reads of ~/.ssh, ~/.aws, .env, browser cookies |
| Remote instruction download | multi-section fetch + execute |
| Agent neutralization | rewrites of agent config to invalid endpoints |
| Hostile narrative | ransom protocols, coercive framings |
Benchmark on the VT-flagged corpus
We ran skill-veil over 2976 skills VirusTotal had labelled malicious
(corpus and SHAs in benchmarks/vt-corpus.yaml). Treating VT's labels
as ground truth, skill-veil reaches 91.73% recall at 100%
precision (zero false positives on this corpus —
2730 TP / 246 FN / 0 FP).
For the residual false-negative bucket we ran a strict multi-provider
LLM cross-check (Grok + OpenAI, default models grok-4-fast and
gpt-4o-mini). A sample is treated as a VT mislabel only when all
of the following hold:
- Both providers return verdict == benign.
- Both providers' confidence is ≥ 0.85.
- At least one provider's confidence is ≥ 0.90.
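The three conditions above can be sketched as a small predicate. This is only an illustration of the stated rule, not skill-veil's actual implementation; the `ProviderVerdict` type, its field names, and the function name are hypothetical.

```python
from dataclasses import dataclass

# Thresholds taken from the consensus rule described above.
CONFIDENCE_FLOOR = 0.85   # every provider must reach this
STRONG_CONFIDENCE = 0.90  # at least one provider must reach this


@dataclass(frozen=True)
class ProviderVerdict:
    """Hypothetical record of one LLM provider's opinion on a sample."""
    provider: str      # e.g. "grok" or "openai"
    verdict: str       # "benign" or "malicious"
    confidence: float  # 0.0 to 1.0


def is_vt_mislabel(verdicts: list[ProviderVerdict]) -> bool:
    """Return True only when every consensus condition holds."""
    if len(verdicts) < 2:
        # The rule is defined for a multi-provider check.
        return False
    all_benign = all(v.verdict == "benign" for v in verdicts)
    all_above_floor = all(v.confidence >= CONFIDENCE_FLOOR for v in verdicts)
    one_strong = any(v.confidence >= STRONG_CONFIDENCE for v in verdicts)
    return all_benign and all_above_floor and one_strong
```

A single disagreeing provider, or a single confidence below the floor, is enough to reject the override, which is what makes the check strict.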
Of 246 samples submitted, 36 passed consensus (e.g., chart-image,
mineru-pdf-style helpers); 210 were rejected (203 had at least one
provider disagree, 6 were below the confidence floor, 1 was a
binary-disguised file the LLMs could not analyse). Treating the 36
passing samples as VT mislabels lifts recall to 92.86% at the same
100% precision. Each override carries its per-provider verdicts,
confidences, and timestamps in
benchmarks/vt-baseline-overrides.yaml; the full audit including
rejected samples is in benchmarks/multi-llm-audit.yaml.
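The recall figures can be reproduced from the counts quoted above. The sketch below assumes that treating a sample as a VT mislabel removes it from the false-negative count (and therefore from the denominator); that reading is an inference from the reported numbers, not something the benchmark files are quoted as stating.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


TP, FN = 2730, 246      # counts reported for the VT-flagged corpus
CONSENSUS_MISLABELS = 36  # samples that passed the multi-provider check

baseline_recall = recall(TP, FN)
# Assumption: a mislabelled sample is no longer a miss, so it leaves FN.
adjusted_recall = recall(TP, FN - CONSENSUS_MISLABELS)

print(f"{baseline_recall:.2%}")  # 91.73%
print(f"{adjusted_recall:.2%}")  # 92.86%
```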
A previous single-LLM pass (lmstudio only) accepted 131 of those 246 samples. Roughly three-quarters of that set did not survive the multi-provider consensus — a useful reminder that one model's opinion is not ground truth.
We are not claiming skill-veil outperforms VirusTotal. The two tools answer different questions:
- VirusTotal aggregates dozens of AV engines and network/URL signals — strongest on binary reputation, supply-chain, and IOC correlation.
- skill-veil reads the manifest prose itself — strongest on prompt-layer attacks that don't show up in static binary scanners.
A sufficiently adversarial skill could craft prose that fools both
engines, which is why benchmarks/CLAUDE.md requires human review
for any override touching secrets, credentials, or remote execution.
Use them together, not as substitutes.
Installation
From Source
From a GitHub Release
# Example
Full installation notes: docs/installation.md
Quick Start
# Scan a strict entrypoint
# Scan a package with manifests and related artifacts
# Scan agent-extension targets beyond SKILL.md
Usage
Command Line Interface
# Auto scan
# Strict explicit-entrypoint scan
# Package scan
# Dataset / marketplace / monorepo mode
Common Commands
| Command | Description |
|---|---|
| scan | Auto-discover and scan files or directories |
| scan-file | Scan a strict explicit entrypoint |
| scan-package | Scan a package without promoting docs to entrypoints |
| scan-dataset | Scan many packages in a repo, dataset, or marketplace mirror |
| benchmark | Run the labeled benchmark corpus |
| baseline create | Create a baseline from a JSON report |
| baseline update | Update a baseline safely |
| waivers validate | Validate waiver configuration |
| diff | Compare two JSON reports with baseline/waiver awareness |
| rules validate | Validate external rule packs |
| rules test | Test one rule against inline content |
| rules test-pack | Run pack fixtures |
| rules pack-info | Summarize external rule packs |
| policy validate | Validate a policy file |
| vt download | Bulk-download a corpus from VirusTotal Intelligence with cached reports |
| vt report | Fetch and cache the VT report for a single hash |
| vt cross-check | Compare skill-veil verdicts against VT Code Insight on a downloaded corpus |
Useful Options
| Option | Description |
|---|---|
| --format text/json/sarif/shield | Output format |
| --preset local/ci/strict/enterprise | Apply output and policy presets |
| --quiet-summary | Compact text output |
| --explain-policy | Focus on policy reasoning instead of finding details |
| --baseline | Accepted findings baseline |
| --waivers | Waiver file |
| --policy | Policy file |
| --ci-summary | Compact diff summary for CI |
| --fail-on <mode> | CI diff failure mode (new-active or new-blocking) |
| --dashboard-output | Write benchmark history dashboard |
| --no-vt-enrich | Skip VT enrichment even when ~/.skill-veil.toml provides an apikey |
| --no-llm-enrich | Skip LLM enrichment even when an [llm] section is configured |
| --llm-provider <name> | Override the active LLM provider for one scan (ollama, lmstudio, openai, anthropic, ollama-cloud) |
| --cache-dir | Override the base directory for VT and LLM enrichment caches |
Examples
Review a suspicious package
Generate a report for CI
Baseline + diff workflow
Benchmark with history and dashboard
Rule pack development
VirusTotal corpus and cross-check
# One-time setup: ~/.skill-veil.toml
# [vt]
# apikey = "..."
# Download a labeled corpus from VT Intelligence (reports + samples).
# Pull a single VT report into the cache.
# Compare skill-veil verdicts against VT Code Insight for a downloaded corpus.
LLM enrichment as a third scoring engine
# Add to ~/.skill-veil.toml:
# [llm]
# provider = "ollama"
#
# [llm.ollama]
# model = "llama3.1:8b"
# # base_url = "http://127.0.0.1:11434" # optional
# Enrichment runs automatically alongside the rule + verdict engines.
# Override provider for a single run without touching the config.
# Skip enrichment entirely (CI runs that should not depend on a network model).
Supported providers out of the box: Ollama, LM Studio, OpenAI,
Anthropic, and Ollama Cloud. Each provider exposes its own section in
~/.skill-veil.toml ([llm.ollama], [llm.openai], etc.) for model name,
optional base URL, and provider-specific parameters.
Inline suppressions in scanned content
skill-veil also recognises nosem, nosem-next-line, nosemgrep, and
nosemgrep-next-line for compatibility with existing toolchains. An optional
because: / reason: clause is captured in the finding metadata so reviewers
can audit waivers later.
Optional YARA support
YARA usage notes and an example rule live in:
External dataset validation
For marketplace mirrors or local corpora that are intentionally kept out of Git:
Curated example packages
- safe skill: examples/safe-skill/
- suspicious skill: examples/suspicious-skill/
- malicious skill: examples/malicious-skill/
- manifest-heavy package: examples/manifest-package/
- referenced script package: examples/referenced-script-package/
- agent instructions: examples/agent-instructions/
- prompt pack: examples/prompt-pack/
- MCP manifest: examples/mcp-server/
Daily analyst triage
That view is intentionally short and stable for daily review:
- package id
- verdict
- package health
- blast radius
- top rule
- strongest scope/reason
Use Cases
1. Review a third-party skill before installing it
Use this when someone shares a SKILL.md, AGENTS.md, or similar entrypoint
and you want a fast local decision.
What you get:
- findings grouped by severity and category
- a final action: log, require_approval, or block
- policy escalation reasons if the artifact implies extra blast radius
2. Review a whole package, not only the root document
Use this when a skill repo also contains manifests, install hooks, scripts, or container files.
This is the most important mode for real reviews because it inspects:
- the explicit entrypoint
- referenced scripts
- manifests and lockfiles
- Docker and runtime artifacts
3. Scan agent instruction files and prompt packs
Use this when the risky part is not a classic skill but a persistent instruction surface.
This is useful for:
- persistent prompt tampering
- cognitive rootkits
- approval bypass patterns
- prompt-pack review before publishing or importing
4. Review an MCP manifest before enabling a server
Use this when you want to inspect an MCP server descriptor for remote connectivity, command execution, or tool-scope concerns.
5. Add a CI gate to block only new active findings
Use this when you already have accepted debt and only want to stop regressions.
This is the practical workflow for teams because it separates:
- existing accepted findings
- waived findings
- new active findings
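The three-way split above can be sketched as follows. Everything in this example is hypothetical: the `Finding` fields, the fingerprint-based matching, the precedence of waivers over the baseline, and the reading of new-blocking as "new findings whose action is block" are assumptions made for illustration, not skill-veil's report schema or diff logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record."""
    fingerprint: str  # assumed stable identifier for a finding
    rule_id: str
    action: str       # "log", "require_approval", or "block"


def partition(findings, baseline, waived):
    """Split findings into accepted, waived, and new active buckets."""
    buckets = {"accepted": [], "waived": [], "new_active": []}
    for finding in findings:
        if finding.fingerprint in waived:
            buckets["waived"].append(finding)
        elif finding.fingerprint in baseline:
            buckets["accepted"].append(finding)
        else:
            buckets["new_active"].append(finding)
    return buckets


def should_fail(buckets, fail_on="new-active"):
    """Decide whether the CI gate fails for the given failure mode."""
    new = buckets["new_active"]
    if fail_on == "new-blocking":
        return any(f.action == "block" for f in new)
    return bool(new)  # "new-active": any new finding fails the gate


findings = [
    Finding("a1", "RULE_A", "log"),
    Finding("b2", "RULE_B", "block"),
    Finding("c3", "RULE_C", "require_approval"),
]
buckets = partition(findings, baseline={"a1"}, waived={"b2"})
print(should_fail(buckets, "new-active"))    # True
print(should_fail(buckets, "new-blocking"))  # False
```

In this toy run only the third finding is new, so the gate fails in new-active mode but passes in new-blocking mode because that finding does not block.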
6. Manage accepted risk with baseline and waivers
Use this when some findings are known and reviewed, but you still want the tool to stay strict about new ones.
7. Scan a catalog, dataset, or marketplace mirror
Use this when you have many packages and want aggregate review instead of single-file analysis.
This is the right mode for:
- internal marketplaces
- downloaded skill corpora
- large monorepos of agent extensions
8. Measure whether the scanner got better or worse
Use this when changing rules, scoring, or analyzers.
This tells you:
- precision and recall
- false positive rate
- exact label accuracy
- confidence calibration
- threshold recommendations
- release-to-release trend
Output Formats
| Format | Use Case |
|---|---|
| text | Local review |
| json | Automation, baselines, diff, dashboards |
| sarif | GitHub Code Scanning |
| shield | Policy-oriented markdown |
Benchmarking
The repository ships with a labeled benchmark corpus and release history.
Current benchmark reporting includes:
- precision
- recall
- false positive rate
- accuracy
- exact label accuracy
- TP / FP / TN / FN
- corpus coverage by label and focus category
- confidence calibration by evidence, category, and signal pair
- threshold recommendations
- markdown dashboard for release-to-release comparison
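For readers who want the headline metrics spelled out, the sketch below computes them from a confusion matrix using the standard definitions. The function name and the choice to return `None` for an undefined ratio are assumptions of this example; docs/benchmark-methodology.md remains the reference for how skill-veil reports them.

```python
def benchmark_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics; None when a ratio is undefined."""
    def ratio(numerator: int, denominator: int):
        return numerator / denominator if denominator else None

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Counts from the VT-flagged corpus: every sample is labelled malicious,
# so there are no negatives and TN is 0.
metrics = benchmark_metrics(tp=2730, fp=0, tn=0, fn=246)
print(metrics["precision"])            # 1.0
print(round(metrics["recall"], 4))     # 0.9173
print(metrics["false_positive_rate"])  # None
```

The false positive rate comes back undefined for that corpus because it contains no benign samples, which is one reason the labeled benchmark corpus tracks TN and coverage by label alongside precision and recall.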
Methodology: docs/benchmark-methodology.md
Rule Packs
External versioned packs under rules/official/ are the primary default rule
source. Embedded rules are a fallback only.
Rule pack docs:
Documentation
- docs/architecture.md
- docs/changelog.md
- docs/roadmap.md
- docs/threat-model.md
- docs/usage-local.md
- docs/usage-ci.md
- docs/agent-extensions.md
- docs/policy-model.md
- docs/policy-presets.md
- docs/finding-model.md
- docs/verdict-model.md
- docs/analyst-interpretation.md
- docs/json-report-schema-v3.md
- docs/artifact-analysis.md
- docs/release-process.md
Contributing
Contributions are welcome.
Start here:
Support the Project
If skill-veil is useful to you, consider supporting its maintenance:
License
This project is licensed under the MIT License. See LICENSE.
Attribution:
- Repository: github.com/seifreed/skill-veil