signet-eval
Deterministic policy enforcement for AI agent tool calls. Every action an agent proposes passes through user-defined rules before execution. No LLM in the authorization path. No prompt injection surface. 25ms end-to-end.
Install
Quick Start
1. Hook into Claude Code — add to ~/.claude/settings.json:
2. Done. Every tool call now passes through policy evaluation. The default policy blocks destructive operations, protects its own configuration, and allows everything else.
3. (Optional) Customize — talk to Claude with the MCP server:
Then say: "Add a $50 limit for amazon orders" or "Block all rm commands".
Default Policy
Self-protection rules are locked — they cannot be removed, edited, or reordered by the AI agent, even through the MCP management server. This prevents the agent from disabling its own guardrails.
| Action | Decision | Locked |
|---|---|---|
Write/Edit/Bash touching .signet/ |
deny | yes |
Write/Edit/Bash touching signet-eval binary |
deny | yes |
Write/Edit settings.json / settings.local.json |
ask | yes |
Bash kill/pkill/killall + signet |
deny | yes |
| Edit/Write/NotebookEdit without recent plan | ask | |
| Edit/Write on core/DSL/schema paths | ask | |
rm, rmdir |
deny | |
git push --force |
ask | |
mkfs, format, dd if= |
deny | |
curl | sh, wget | sh |
deny | |
| Everything else | allow |
Custom Policy
Edit ~/.signet/policy.yaml:
version: 1
default_action: ALLOW
rules:
- name: block_rm
tool_pattern: ".*"
conditions:
action: DENY
reason: "File deletion blocked"
- name: books_limit
tool_pattern: ".*purchase.*"
conditions:
- "param_eq(category, 'books')"
- "spend_plus_amount_gt('books', amount, 200)"
action: DENY
reason: "Books spending limit ($200) exceeded"
- name: protect_my_config
tool_pattern: ".*"
conditions:
action: ASK
locked: true
reason: "System config changes require confirmation"
Rules are evaluated in order — first match wins. Multiple conditions on a rule are AND'd. Rules with locked: true cannot be modified through the MCP management server.
Condition Functions
| Function | Description | Example |
|---|---|---|
contains(parameters, 'X') |
Tool input contains string | contains(parameters, 'rm ') |
any_of(parameters, 'X', 'Y') |
Any string present | any_of(parameters, 'mkfs', 'format') |
param_eq(field, 'value') |
Field equals value | param_eq(category, 'books') |
param_ne(field, 'value') |
Field not equal | param_ne(role, 'admin') |
param_gt(field, N) |
Field > number | param_gt(amount, 100) |
param_lt(field, N) |
Field < number | param_lt(amount, 5) |
param_contains(field, 'X') |
Field contains substring | param_contains(command, 'sudo') |
matches(field, 'regex') |
Field matches regex | matches(file_path, '\\.env$') |
has_credential('name') |
Credential exists in vault | has_credential('cc_visa') |
spend_gt('cat', N) |
Session spend > limit | spend_gt('books', 200) |
spend_plus_amount_gt('cat', field, N) |
Spend + this amount > limit | spend_plus_amount_gt('books', amount, 200) |
not(condition) |
Negate condition | not(param_eq(format, 'json')) |
or(A || B) |
Either condition | or(contains(parameters, '-f') || contains(parameters, '--force')) |
has_recent_action('search', N) |
Recent allowed action matches in tool name or detail; pipe-delimited OR | has_recent_action('EnterPlanMode|TaskCreate', 500) |
true / false |
Literal | true |
Encrypted Vault
Three-tier encrypted storage with passphrase-derived key hierarchy (Argon2id + AES-256-GCM):
| Tier | Encryption | Contents |
|---|---|---|
| 1 | None | Action log, spending ledger |
| 2 | Session key | Session state |
| 3 | Compartment key | CC numbers, API tokens, secrets |
Credentials support scoped access via request_capability: domain restrictions, purpose constraints, per-use amount caps, and one-time tokens that auto-invalidate after a single use.
Spending limits use the vault ledger — each tool call that spends money is logged, and spend_plus_amount_gt() checks cumulative totals before allowing the next purchase.
Self-Protection
signet-eval ships with four locked rules that prevent an AI agent from disabling its own policy enforcement:
- protect_signet_dir — Denies any Write, Edit, or Bash command touching
.signet/(policy files, vault, HMAC) - protect_signet_binary — Denies tampering with the
signet-evalbinary itself - protect_hook_config — Requires user confirmation before modifying
settings.json(where the hook is configured) - protect_signet_process — Denies kill/pkill/killall commands targeting signet processes
These rules are:
- Locked — MCP tools refuse to remove, edit, or reorder them
- Position-protected — Unlocked rules cannot be reordered above locked rules (first-match-wins)
- Hardcoded in defaults — If the policy file is corrupted or missing, the binary falls back to hardcoded defaults that include self-protection
- HMAC-backed — Direct file edits break the policy signature, triggering fallback to safe defaults
MCP Management Server
Manage policies conversationally through Claude:
| Tool | Purpose |
|---|---|
signet_list_rules |
Show all rules with locked status |
signet_add_rule |
Add a new rule (appended after locked rules) |
signet_remove_rule |
Remove a rule (refuses on locked rules) |
signet_edit_rule |
Modify rule properties (refuses on locked rules) |
signet_reorder_rule |
Move a rule (refuses on locked, prevents placing above locked) |
signet_set_limit |
Set a spending limit for a category |
signet_test |
Test a tool call against the current policy |
signet_validate |
Check policy for errors |
signet_condition_help |
Show available condition functions |
signet_status |
Vault status, spending totals, credential count |
signet_recent_actions |
Show recent action log |
signet_store_credential |
Store a Tier 3 credential |
signet_use_credential |
Request a credential through capability constraints |
signet_list_credentials |
List credential names |
signet_delete_credential |
Delete a credential |
signet_sign_policy |
HMAC-sign the policy file |
signet_reset_session |
Clear spending counters |
All mutating operations auto-sign the policy when the vault is available.
MCP Proxy
Wrap upstream MCP servers with policy enforcement. The agent connects to the proxy, never directly to servers. Policy is hot-reloaded on every call.
# Configure upstream servers
# Register proxy with Claude Code
All Commands
| Command | Purpose |
|---|---|
signet-eval |
Hook evaluation (default, 25ms) |
signet-eval init |
Write default policy with locked self-protection rules |
signet-eval rules |
Show current policy rules (locked rules tagged) |
signet-eval validate |
Check policy for errors |
signet-eval test '<json>' |
Test a tool call against policy |
signet-eval setup |
Create encrypted vault |
signet-eval unlock |
Refresh vault session |
signet-eval status |
Vault status and spending |
signet-eval store <name> <value> |
Store Tier 3 credential |
signet-eval delete <name> |
Delete a credential |
signet-eval log |
Recent action log |
signet-eval reset-session |
Clear spending counters |
signet-eval sign |
HMAC-sign policy file |
signet-eval serve |
MCP management server (17 tools) |
signet-eval proxy |
MCP proxy for upstream servers |
Performance
| Metric | Value |
|---|---|
| Hook eval (end-to-end) | 25ms — process spawn, stdin, JSON parse, policy load, eval, response |
| In-process policy eval | 14–63μs — 14μs deny, 21μs ask, 63μs spending check |
| CLI validate / rules | 8ms |
| Binary size | 6.2MB (stripped, LTO) |
Architecture
signet-eval is the enforcement layer of the Signet personal sovereign agent stack. The core principle: the authorization layer must not be an LLM. It processes structured data only — regex, comparisons, and vault queries. No natural language, no context window, no persuasion surface. A rule either matches or it doesn't.
Agent proposes action -> signet-eval evaluates policy -> allow / deny / ask
(deterministic, 25ms, no NLP)
Security Model
signet-eval is a seatbelt, not a cage. Understanding what it does and doesn't protect against is critical.
What it protects against
A cooperative agent making mistakes. Claude Code follows the hook protocol — it calls signet-eval before every tool use and respects the allow/deny/ask response. Within that protocol, signet-eval reliably enforces policy:
- Blocks destructive commands (
rm,mkfs, piped remote execution) - Enforces spending limits across tool calls
- Gates credential access with domain, purpose, and amount constraints
- Requires user confirmation for sensitive operations (
git push --force) - Prevents casual policy tampering via locked rules and MCP guards
This is the normal operating mode. The agent isn't trying to escape — it's trying to be helpful and the policy keeps it in bounds.
What it does NOT protect against
An adversarial agent with shell access. signet-eval runs at the same privilege level as the agent it constrains. Any process with the same UID can:
- Use shell indirection to evade string matching (
alias,eval,base64 -d | sh,python3 -c "...") - Create symlinks to protected paths under innocent names
- Read the session key file and forge HMAC signatures
- Replace the signet-eval binary itself
- Modify the hook configuration in
settings.json - Use
LD_PRELOAD,ptrace, or process injection
No amount of pattern matching fixes this. Adding rules for ln invites alias. Blocking alias invites eval. Blocking eval invites python3 -c. This is an infinite regress — the fundamental issue is privilege level, not pattern coverage.
Where real containment lives
Protecting against a truly adversarial agent requires OS-level controls that operate above the agent's privilege level:
- Separate user account — run the agent as a restricted user with no write access to policy files
- Container/sandbox — Docker, Flatpak, or
bubblewrapwith a read-only mount for~/.signet/ - Stripped
$PATH— removeln,alias, and other indirection tools from the agent's environment - seccomp / AppArmor / SELinux — mandatory access controls that the agent cannot override
- Virtual environment — run Claude Code in a constrained environment with limited filesystem access
signet-eval is the policy layer within such a setup. It handles the "what should this agent be allowed to do" question with clear, auditable rules. The OS handles the "can this agent circumvent the policy" question. Neither replaces the other.
Defense in depth
The layers work together:
| Layer | Protects against | Mechanism |
|---|---|---|
| String matching | Obvious mistakes, clear UX | Regex, substring, word-boundary conditions |
| Locked rules | Casual MCP-based policy tampering | Immutable rules, position protection |
| HMAC signing | Out-of-band file modification | Cryptographic integrity verification |
| OS controls | Privilege escalation, shell indirection | Sandboxing, RBAC, separate users |
Without OS controls, signet-eval is a speed bump, not a wall. With them, it's the policy engine inside a secure perimeter.
License
MIT