skill-veil-core 0.1.3

Core library for skill-veil behavioral analysis

Overview

skill-veil is an open source static analysis and policy tool for the agent extension supply chain.

It helps answer a narrow but useful operational question:

Should this skill, prompt pack, instruction file, MCP manifest, or related artifact be allowed, reviewed, or blocked before it lands in a repo or CI pipeline?

It is strongest as a static security and policy layer, not as a universal malware engine.

Key Features

Feature                   Description
Agent Extension Coverage  First-class support for SKILL.md, AGENTS.md, CLAUDE.md, SYSTEM.md, prompt packs, and MCP manifests
Artifact Analysis         Inspects referenced scripts, manifests, lockfiles, Docker artifacts, and operational configs
Policy Engine             log, require_approval, block with profiles, waivers, baselines, and overrides
CI-Friendly Output        Text, JSON, SARIF, SHIELD, diff mode, compact CI summary, and PR gating support
External Rule Packs       Versioned official and community rule packs with fixtures and validation
Benchmarking              Labeled corpus, confidence calibration, threshold tuning, and release history dashboard
VirusTotal Integration    Bulk download, report caching, and cross-check between skill-veil verdicts and VT Code Insight
LLM Enrichment            Optional third scoring engine across Ollama, LM Studio, OpenAI, Anthropic, and Ollama Cloud
Inline Suppressions       # skill-veil:ignore, nosem, and nosemgrep markers with optional rule-id and reason
Unified Config            Single ~/.skill-veil.toml for VT and LLM providers; per-flag overrides on the CLI

What It Detects

Behavior        Remote execution, install hooks, deferred execution, persistence
Supply Chain    Unpinned dependencies, missing lockfiles, remote MCP endpoints
Prompt Risk     Persistent instruction tampering, cognitive rootkits, prompt packs
Tooling Risk    Tool abuse, autonomy escalation, approval bypass patterns
Runtime Risk    Privileged containers, host mounts, process execution, secret access
Artifacts       package.json, requirements.txt, pyproject.toml, Cargo.toml,
                Dockerfile, docker-compose, lockfiles, Makefile, .npmrc, pip.conf

Why a dedicated scanner for agent skills?

Generic malware scanners (VirusTotal, ClamAV, YARA-on-binaries) are designed for executables, archives, and URL/network reputation. Agent skills are markdown manifests where the malicious payload is prose — natural-language instructions that read credential files, persist across sessions, fetch remote "instructions" to execute, or bypass approval flows.

Skill-veil's rule pack targets that surface:

Threat class                     Skill-veil signals (examples)
Prompt injection (multilingual)  OFFICIAL_PROMPT_TAMPERING_OVERRIDE_*, XML interaction-config
Autonomy bypass                  unbounded loops, "without confirmation" idioms (EN/PT/ES)
Persistence                      cron / heartbeat / callback to remote URL
Credential exposure              reads of ~/.ssh, ~/.aws, .env, browser cookies
Remote instruction download      multi-section fetch + execute
Agent neutralization             rewrites of agent config to invalid endpoints
Hostile narrative                ransom protocols, coercive framings

Benchmark on the VT-flagged corpus

We ran skill-veil over 2976 skills VirusTotal had labelled malicious (corpus and SHAs in benchmarks/vt-corpus.yaml). Treating VT's labels as ground truth, skill-veil reaches 91.73% recall at 100% precision (2730 TP / 246 FN / 0 FP). Every sample in this corpus carries a malicious label, so the zero false positives apply to this corpus only and do not measure behavior on benign skills.

For the residual false-negative bucket we ran a strict multi-provider LLM cross-check (Grok + OpenAI, default models grok-4-fast and gpt-4o-mini). A sample is treated as a VT mislabel only when all of the following hold:

  1. Both providers return verdict == benign.
  2. Both providers' confidence is ≥ 0.85.
  3. At least one provider's confidence is ≥ 0.90.
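
The three conditions above can be expressed as a small predicate. This is a minimal sketch, not skill-veil's implementation: the `ProviderVerdict` type, its field names, and the function name are hypothetical, and only the thresholds (0.85 and 0.90) and the `benign` verdict string come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderVerdict:
    """Hypothetical container for one provider's answer on one sample."""
    provider: str      # e.g. "grok" or "openai"
    verdict: str       # "benign" or "malicious"
    confidence: float  # 0.0 to 1.0


def passes_consensus(a: ProviderVerdict, b: ProviderVerdict,
                     floor: float = 0.85, strong: float = 0.90) -> bool:
    """Return True only when all three consensus conditions hold."""
    both_benign = a.verdict == "benign" and b.verdict == "benign"
    both_above_floor = min(a.confidence, b.confidence) >= floor
    one_strong = max(a.confidence, b.confidence) >= strong
    return both_benign and both_above_floor and one_strong
```

Note that a pair at 0.86 and 0.88 is rejected even though both providers agree: neither reaches the 0.90 bar required by the third condition.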

Of 246 samples submitted, 36 passed consensus (e.g., chart-image, mineru-pdf-style helpers); 210 were rejected (203 had at least one provider disagree, 6 were below the confidence floor, 1 was a binary-disguised file the LLMs could not analyse). Treating the 36 passing samples as VT mislabels lifts recall to 92.86% at the same 100% precision. Each override carries its per-provider verdicts, confidences, and timestamps in benchmarks/vt-baseline-overrides.yaml; the full audit including rejected samples is in benchmarks/multi-llm-audit.yaml.
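
The recall figures can be reproduced from the counts quoted above. The sketch below assumes the adjustment works by dropping the 36 consensus-approved samples from the malicious ground truth (they leave the false-negative bucket); that reading is an interpretation, chosen because it reproduces the quoted 92.86%.

```python
def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn)


TP, FN = 2730, 246      # counts quoted in the text
MISLABELS = 36          # samples that passed the multi-provider consensus
REJECTED = 203 + 6 + 1  # disagreement + low confidence + unanalysable file

# The rejection buckets and the consensus passes account for every false negative.
assert MISLABELS + REJECTED == FN

raw = recall(TP, FN)
adjusted = recall(TP, FN - MISLABELS)  # assumed: mislabels leave the ground truth

print(f"raw recall:      {raw:.2%}")       # 91.73%
print(f"adjusted recall: {adjusted:.2%}")  # 92.86%
```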

A previous single-LLM pass (lmstudio only) accepted 131 of those 246 samples. Roughly three-quarters of that set did not survive the multi-provider consensus — a useful reminder that one model's opinion is not ground truth.
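
As a quick check on "roughly three-quarters", the sketch below assumes that all 36 consensus-approved samples were also among the 131 accepted by the single-LLM pass. The text does not state that overlap explicitly, so treat the result as an approximation.

```python
SINGLE_LLM_ACCEPTED = 131  # accepted by the earlier lmstudio-only pass
CONSENSUS_ACCEPTED = 36    # accepted by the multi-provider consensus

# Assumption: the consensus set is a subset of the single-LLM set.
dropped = SINGLE_LLM_ACCEPTED - CONSENSUS_ACCEPTED
dropped_share = dropped / SINGLE_LLM_ACCEPTED

print(f"{dropped} of {SINGLE_LLM_ACCEPTED} did not survive ({dropped_share:.1%})")
```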

We are not claiming skill-veil outperforms VirusTotal. The two tools answer different questions:

  • VirusTotal aggregates dozens of AV engines and network/URL signals — strongest on binary reputation, supply-chain, and IOC correlation.
  • skill-veil reads the manifest prose itself — strongest on prompt-layer attacks that don't show up in static binary scanners.

A sufficiently adversarial skill could craft prose that fools both engines, which is why benchmarks/CLAUDE.md requires human review for any override touching secrets, credentials, or remote execution.

Use them together, not as substitutes.


Installation

From Source

git clone https://github.com/seifreed/skill-veil.git
cd skill-veil
cargo install --path crates/skill-veil-cli

From a GitHub Release

# Example
tar -xzf skill-veil-linux-x86_64.tar.gz
install -m 0755 skill-veil "$HOME/.local/bin/skill-veil"

Full installation notes: docs/installation.md


Quick Start

# Scan a strict entrypoint
skill-veil scan-file examples/malicious-skill/SKILL.md

# Scan a package with manifests and related artifacts
skill-veil scan-package examples/manifest-package --format text

# Scan agent-extension targets beyond SKILL.md
skill-veil scan-file examples/agent-instructions/AGENTS.md
skill-veil scan-package examples/prompt-pack
skill-veil scan-package examples/mcp-server

Usage

Command Line Interface

# Auto scan
skill-veil scan ./examples

# Strict explicit-entrypoint scan
skill-veil scan-file examples/safe-skill/SKILL.md

# Package scan
skill-veil scan-package . --format json --output current.json

# Dataset / marketplace / monorepo mode
skill-veil scan-dataset ./examples --preset ci --format text

Common Commands

Command           Description
scan              Auto-discover and scan files or directories
scan-file         Scan a strict explicit entrypoint
scan-package      Scan a package without promoting docs to entrypoints
scan-dataset      Scan many packages in a repo, dataset, or marketplace mirror
benchmark         Run the labeled benchmark corpus
baseline create   Create a baseline from a JSON report
baseline update   Update a baseline safely
waivers validate  Validate waiver configuration
diff              Compare two JSON reports with baseline/waiver awareness
rules validate    Validate external rule packs
rules test        Test one rule against inline content
rules test-pack   Run pack fixtures
rules pack-info   Summarize external rule packs
policy validate   Validate a policy file
vt download       Bulk-download a corpus from VirusTotal Intelligence with cached reports
vt report         Fetch and cache the VT report for a single hash
vt cross-check    Compare skill-veil verdicts against VT Code Insight on a downloaded corpus

Useful Options

Option                               Description
--format text/json/sarif/shield      Output format
--preset local/ci/strict/enterprise  Apply output and policy presets
--quiet-summary                      Compact text output
--explain-policy                     Focus on policy reasoning instead of finding details
--baseline                           Accepted findings baseline
--waivers                            Waiver file
--policy                             Policy file
--ci-summary                         Compact diff summary for CI
--fail-on <mode>                     CI diff failure mode (new-active or new-blocking)
--dashboard-output                   Write benchmark history dashboard
--no-vt-enrich                       Skip VT enrichment even when ~/.skill-veil.toml provides an apikey
--no-llm-enrich                      Skip LLM enrichment even when an [llm] section is configured
--llm-provider <name>                Override the active LLM provider for one scan (ollama, lmstudio, openai, anthropic, ollama-cloud)
--cache-dir                          Override the base directory for VT and LLM enrichment caches

Examples

Review a suspicious package

skill-veil scan-package examples/suspicious-skill --format text

Generate a report for CI

skill-veil scan-package . --preset ci --format json --output current.json
skill-veil scan-package . --preset ci --format sarif --output current.sarif
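
If the two report commands above are driven from a CI helper script, a thin wrapper might look like the following. It is a sketch under assumptions: the function names are hypothetical, and only the subcommand and flags already shown above are used. Exit-code semantics are not documented in this section, so the wrapper returns the codes without interpreting them.

```python
import subprocess


def scan_package_cmd(target: str, fmt: str, output: str, preset: str = "ci") -> list[str]:
    """Build the argv for `skill-veil scan-package` from flags shown in this README."""
    return ["skill-veil", "scan-package", target,
            "--preset", preset, "--format", fmt, "--output", output]


def ci_report_cmds(target: str = ".") -> list[list[str]]:
    """One JSON report (for baselines and diff) and one SARIF report."""
    return [
        scan_package_cmd(target, "json", "current.json"),
        scan_package_cmd(target, "sarif", "current.sarif"),
    ]


def run_all(cmds: list[list[str]]) -> list[int]:
    """Run each command and return raw exit codes (meaning left to the caller)."""
    return [subprocess.run(cmd, check=False).returncode for cmd in cmds]
```

Building the argv as a list rather than a shell string avoids quoting problems when the target path contains spaces.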

Baseline + diff workflow

skill-veil baseline create current.json --output .skill-veil/baseline.json
skill-veil diff prev.json current.json --baseline .skill-veil/baseline.json --ci-summary --fail-on new-active

Benchmark with history and dashboard

skill-veil benchmark benchmarks/corpus.yaml \
  --format json \
  --output benchmarks/history/latest.json \
  --history-file benchmarks/history/releases.json \
  --release-id local-dev \
  --dashboard-output benchmarks/history/dashboard.md

Rule pack development

skill-veil rules validate --rules-dir rules/official
skill-veil rules test-pack --rules-dir rules/official --fixtures rules/fixtures/behavioral.yaml
skill-veil rules pack-info --rules-dir rules/official

VirusTotal corpus and cross-check

# One-time setup: ~/.skill-veil.toml
# [vt]
# apikey = "..."

# Download a labeled corpus from VT Intelligence (reports + samples).
skill-veil vt download \
  --query 'entity:file has:codeinsight codeinsight_verdict:malicious' \
  --dest data --limit 200

# Pull a single VT report into the cache.
skill-veil vt report deadbeef0123...0123

# Compare skill-veil verdicts against VT Code Insight for a downloaded corpus.
skill-veil vt cross-check --dir data --format markdown --only-mismatches

LLM enrichment as a third scoring engine

# Add to ~/.skill-veil.toml:
# [llm]
# provider = "ollama"
#
# [llm.ollama]
# model = "llama3.1:8b"
# # base_url = "http://127.0.0.1:11434"   # optional

# Enrichment runs automatically alongside the rule + verdict engines.
skill-veil scan-package examples/manifest-package --format json --output current.json

# Override provider for a single run without touching the config.
skill-veil scan-package . --llm-provider openai

# Skip enrichment entirely (CI runs that should not depend on a network model).
skill-veil scan-package . --no-vt-enrich --no-llm-enrich

Supported providers out of the box: Ollama, LM Studio, OpenAI, Anthropic, and Ollama Cloud. Each provider exposes its own section in ~/.skill-veil.toml ([llm.ollama], [llm.openai], etc.) for model name, optional base URL, and provider-specific parameters.

Inline suppressions in scanned content

# skill-veil:ignore SKILL_REMOTE_EXEC_CURL_BASH because: vendor install script reviewed manually
curl -sSL https://example.com/install.sh | bash

skill-veil also recognises nosem, nosem-next-line, nosemgrep, and nosemgrep-next-line for compatibility with existing toolchains. An optional because: / reason: clause is captured in the finding metadata so reviewers can audit waivers later.
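
For readers who want to audit suppressions outside the tool, the marker syntax described above can be approximated with a regular expression. This is a hypothetical parser written for illustration; skill-veil's actual matching rules (allowed rule-id characters, comment styles, how next-line variants are applied) may differ.

```python
import re

# Longer marker names come first so "nosemgrep-next-line" is not read as "nosemgrep".
SUPPRESSION = re.compile(
    r"(?<!\w)(?P<marker>skill-veil:ignore|nosemgrep-next-line|nosemgrep"
    r"|nosem-next-line|nosem)\b"
    r"(?:[ \t]+(?!because:|reason:)(?P<rule_id>[A-Za-z0-9_.-]+))?"  # optional rule id
    r"(?:[ \t]+(?:because|reason):[ \t]*(?P<reason>.+))?"            # optional reason
)


def parse_suppression(line: str):
    """Return marker, rule_id, and reason for a suppression comment, else None."""
    match = SUPPRESSION.search(line)
    if match is None:
        return None
    reason = match.group("reason")
    return {
        "marker": match.group("marker"),
        "rule_id": match.group("rule_id"),
        "reason": reason.strip() if reason else None,
    }


example = ("# skill-veil:ignore SKILL_REMOTE_EXEC_CURL_BASH "
           "because: vendor install script reviewed manually")
print(parse_suppression(example))
```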

Optional YARA support

cargo run -p skill-veil --features yara -- rules validate --rules-dir rules/official

YARA usage notes and an example rule live in:

External dataset validation

For marketplace mirrors or local corpora that are intentionally kept out of Git:

Curated example packages

  • safe skill: examples/safe-skill/
  • suspicious skill: examples/suspicious-skill/
  • malicious skill: examples/malicious-skill/
  • manifest-heavy package: examples/manifest-package/
  • referenced script package: examples/referenced-script-package/
  • agent instructions: examples/agent-instructions/
  • prompt pack: examples/prompt-pack/
  • MCP manifest: examples/mcp-server/

Daily analyst triage

skill-veil scan-dataset ./mirror \
  --dataset-view verdicts \
  --analyst-summary \
  --preset local \
  --format text

That view is intentionally short and stable for daily review:

  • package id
  • verdict
  • package health
  • blast radius
  • top rule
  • strongest scope/reason

Use Cases

1. Review a third-party skill before installing it

Use this when someone shares a SKILL.md, AGENTS.md, or similar entrypoint and you want a fast local decision.

skill-veil scan-file path/to/SKILL.md --format text

What you get:

  • findings grouped by severity and category
  • a final action: log, require_approval, or block
  • policy escalation reasons if the artifact implies extra blast radius

2. Review a whole package, not only the root document

Use this when a skill repo also contains manifests, install hooks, scripts, or container files.

skill-veil scan-package /path/to/repo --format text

This is the most important mode for real reviews because it inspects:

  • the explicit entrypoint
  • referenced scripts
  • manifests and lockfiles
  • Docker and runtime artifacts

3. Scan agent instruction files and prompt packs

Use this when the risky part is not a classic skill but a persistent instruction surface.

skill-veil scan-file examples/agent-instructions/AGENTS.md
skill-veil scan-package examples/prompt-pack

This is useful for:

  • persistent prompt tampering
  • cognitive rootkits
  • approval bypass patterns
  • prompt-pack review before publishing or importing

4. Review an MCP manifest before enabling a server

Use this when you want to inspect an MCP server descriptor for remote connectivity, command execution, or tool-scope concerns.

skill-veil scan-package examples/mcp-server --format json

5. Add a CI gate to block only new active findings

Use this when you already have accepted debt and only want to stop regressions.

skill-veil scan-package . --preset ci --format json --output current.json
skill-veil diff prev.json current.json --baseline .skill-veil/baseline.json --ci-summary --fail-on new-active

This is the practical workflow for teams because it separates:

  • existing accepted findings
  • waived findings
  • new active findings
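
The separation above can be pictured as a simple bucketing step followed by a failure decision. The sketch is illustrative only: the `Finding` type, the fingerprint-based identity, the bucket names, and the reading of `new-blocking` as "a new finding whose action is block" are all assumptions, not the tool's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    fingerprint: str  # hypothetical stable identifier for a finding
    action: str       # "log", "require_approval", or "block"


def classify(current, previous, baseline, waived):
    """Sort current findings into accepted, waived, and new active buckets.

    `previous`, `baseline`, and `waived` are sets of fingerprints.
    """
    buckets = {"accepted": [], "waived": [], "new_active": []}
    known = previous | baseline
    for finding in current:
        if finding.fingerprint in waived:
            buckets["waived"].append(finding)
        elif finding.fingerprint in known:
            buckets["accepted"].append(finding)
        else:
            buckets["new_active"].append(finding)
    return buckets


def should_fail(buckets, fail_on: str) -> bool:
    """Decide whether the gate fails for a --fail-on style mode (assumed semantics)."""
    new = buckets["new_active"]
    if fail_on == "new-active":
        return bool(new)
    if fail_on == "new-blocking":
        return any(f.action == "block" for f in new)
    raise ValueError(f"unknown fail-on mode: {fail_on}")
```

Under these assumptions, a newly introduced `require_approval` finding fails the gate in `new-active` mode but passes in `new-blocking` mode.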

6. Manage accepted risk with baseline and waivers

Use this when some findings are known and reviewed, but you still want the tool to stay strict about new ones.

skill-veil baseline create current.json --output .skill-veil/baseline.json
skill-veil waivers validate .skill-veil/waivers.yaml
skill-veil scan-package . --baseline .skill-veil/baseline.json --waivers .skill-veil/waivers.yaml

7. Scan a catalog, dataset, or marketplace mirror

Use this when you have many packages and want aggregate review instead of single-file analysis.

skill-veil scan-dataset ./examples --preset ci --format text

This is the right mode for:

  • internal marketplaces
  • downloaded skill corpora
  • large monorepos of agent extensions

8. Measure whether the scanner got better or worse

Use this when changing rules, scoring, or analyzers.

skill-veil benchmark benchmarks/corpus.yaml \
  --format json \
  --output benchmarks/history/latest.json \
  --history-file benchmarks/history/releases.json \
  --release-id local-dev \
  --dashboard-output benchmarks/history/dashboard.md

This tells you:

  • precision and recall
  • false positive rate
  • exact label accuracy
  • confidence calibration
  • threshold recommendations
  • release-to-release trend

Output Formats

Format  Use Case
text    Local review
json    Automation, baselines, diff, dashboards
sarif   GitHub Code Scanning
shield  Policy-oriented markdown

Benchmarking

The repository ships with a labeled benchmark corpus and release history.

Current benchmark reporting includes:

  • precision
  • recall
  • false positive rate
  • accuracy
  • exact label accuracy
  • TP / FP / TN / FN
  • corpus coverage by label and focus category
  • confidence calibration by evidence, category, and signal pair
  • threshold recommendations
  • markdown dashboard for release-to-release comparison
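
Several of the metrics listed above follow directly from the TP / FP / TN / FN counts. The sketch below uses the standard definitions; the function name and the choice to return `None` for an undefined ratio are illustrative and not taken from skill-veil's benchmark code.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts."""

    def ratio(numerator: int, denominator: int):
        # A ratio is undefined when its denominator is zero.
        return numerator / denominator if denominator else None

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# A corpus with no benign samples leaves the false positive rate undefined.
print(confusion_metrics(tp=2730, fp=0, tn=0, fn=246))
```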

Methodology: docs/benchmark-methodology.md


Rule Packs

External versioned packs under rules/official/ are the primary default rule source. Embedded rules are a fallback only.

Rule pack docs:


Documentation


Contributing

Contributions are welcome.

Start here:


Support the Project

If skill-veil is useful to you, consider supporting its maintenance:


License

This project is licensed under the MIT License. See LICENSE.

Attribution: