pmcp-code-mode
Code Mode validation and execution framework for MCP servers built on the PMCP SDK.
Enables LLM-generated code (GraphQL, JavaScript, SQL, MCP compositions) to be validated, explained, and executed with HMAC-signed approval tokens that cryptographically bind code to its validation result.
Status: v0.1.0, migrated from pmcp-run/built-in/shared/pmcp-code-mode into the SDK workspace in Phase 67.1. The public API is stabilizing; feedback is welcome before the 1.0 contract is locked.
How It Works
            ┌──────────────┐
            │  LLM Client  │
            └──────┬───────┘
                   │
    1. describe_schema()  <- schema exposed per exposure policy
                   │
    2. LLM generates code (GraphQL, JS, SQL, MCP composition)
                   │
    3. validate_code(code) ──────────────────────┐
                   │                             │
         ┌─────────▼──────────┐                  │
         │ ValidationPipeline │                  │
         │  ┌───────────────┐ │     ┌────────────▼────────────┐
         │  │ Parse         │ │     │ PolicyEvaluator (Cedar, │
         │  │ Security scan │ │────>│ AVP, or custom)         │
         │  │ Explain       │ │     └─────────────────────────┘
         │  │ HMAC sign     │ │
         │  └───────────────┘ │
         └─────────┬──────────┘
                   │
    approval_token (HMAC-SHA256 signed)
                   │
    4. User reviews explanation, approves
                   │
    5. execute_code(code, token) ────────────────┐
                   │                             │
         ┌─────────▼──────────┐     ┌────────────▼──────┐
         │ Token verification │     │ CodeExecutor impl │
         │ (hash, expiry, sig)│────>│ (your backend)    │
         └────────────────────┘     └───────────────────┘
                   │
    execution result (JSON)
The token ensures that the exact code the user approved is what gets executed — any modification after validation invalidates the token.
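The binding idea can be sketched with the standard library alone. This is an illustration under loud assumptions, not the crate's implementation: `SketchToken`, `sign`, and `verify` are hypothetical names, and `DefaultHasher` stands in for HMAC-SHA256 only so the sketch compiles without dependencies. It is not cryptographically secure.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an approval token; not the crate's `ApprovalToken`.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchToken {
    pub code_hash: u64,
    pub signature: u64,
}

/// Hash of the code text. A real implementation would use SHA-256.
fn hash_code(code: &str) -> u64 {
    let mut h = DefaultHasher::new();
    code.hash(&mut h);
    h.finish()
}

/// Keyed hash stand-in. ASSUMPTION: a real implementation would use
/// HMAC-SHA256; `DefaultHasher` is NOT cryptographically secure.
fn keyed_hash(secret: &[u8], data: u64) -> u64 {
    let mut h = DefaultHasher::new();
    secret.hash(&mut h);
    data.hash(&mut h);
    h.finish()
}

/// Issue a token bound to the exact code that was validated.
pub fn sign(secret: &[u8], code: &str) -> SketchToken {
    let code_hash = hash_code(code);
    SketchToken {
        code_hash,
        signature: keyed_hash(secret, code_hash),
    }
}

/// Accept the token only if the submitted code hashes to the signed value
/// and the signature was produced with the same secret.
pub fn verify(secret: &[u8], code: &str, token: &SketchToken) -> bool {
    let code_hash = hash_code(code);
    code_hash == token.code_hash && keyed_hash(secret, code_hash) == token.signature
}
```

Calling `verify` with code that differs from what was passed to `sign`, or with a different secret, returns `false`, which is the property the token is meant to provide.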
Supported Languages
The language attribute on #[derive(CodeMode)] selects the validation path at compile time. Each language maps to a feature-gated validation method on ValidationPipeline:
| Language | Derive Attribute | Validation Method | Feature Required |
|---|---|---|---|
| GraphQL | "graphql" (default) | validate_graphql_query_async | (none) |
| JavaScript | "javascript" or "js" | validate_javascript_code | openapi-code-mode |
| SQL | "sql" | validate_sql_query | sql-code-mode |
| MCP | "mcp" | validate_mcp_composition | mcp-code-mode |
The CodeLanguage enum in pmcp_code_mode::types is the runtime representation of these values. Unknown language strings produce a compile error at macro expansion time.
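As a rough illustration of the mapping in the table, the sketch below resolves an attribute string to a language and to the documented validation method name. `SketchLanguage`, `parse_language`, and `validation_method` are hypothetical; the real enum lives in pmcp_code_mode::types and the derive macro performs this dispatch at compile time rather than at runtime.

```rust
/// Hypothetical mirror of the documented language values; the real
/// `CodeLanguage` enum may differ in shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchLanguage {
    GraphQL,
    JavaScript,
    Sql,
    Mcp,
}

/// Map an attribute string to a language. Unknown strings are an error here;
/// the derive macro reports the same case as a compile error instead.
pub fn parse_language(attr: &str) -> Result<SketchLanguage, String> {
    match attr {
        "graphql" => Ok(SketchLanguage::GraphQL),
        "javascript" | "js" => Ok(SketchLanguage::JavaScript),
        "sql" => Ok(SketchLanguage::Sql),
        "mcp" => Ok(SketchLanguage::Mcp),
        other => Err(format!("unknown code mode language: {:?}", other)),
    }
}

/// Validation method name listed in the table for each language.
pub fn validation_method(lang: SketchLanguage) -> &'static str {
    match lang {
        SketchLanguage::GraphQL => "validate_graphql_query_async",
        SketchLanguage::JavaScript => "validate_javascript_code",
        SketchLanguage::Sql => "validate_sql_query",
        SketchLanguage::Mcp => "validate_mcp_composition",
    }
}
```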
Quick Start
Minimal: Direct Pipeline Usage
All pipeline constructors return Result, so invalid configuration (such as an HMAC secret shorter than 16 bytes) is caught when the pipeline is built at startup rather than while handling a request.
use ;
let config = enabled;
let secret = new;
let pipeline = from_token_secret?;
let ctx = new;
let result = pipeline.validate_graphql_query?;
assert!;
assert!; // HMAC-signed token
With Policy Evaluator
Wire a policy evaluator (Cedar, AWS Verified Permissions, or custom) into the pipeline for authorization checks between parsing and token signing:
use ;
use Arc;
let config = enabled;
let secret = new;
let evaluator = new; // Use a real evaluator in production
let pipeline = with_policy_evaluator?;
The policy evaluator is stored as Arc<dyn PolicyEvaluator>, enabling shared ownership across handlers and async tasks.
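To show why a shared trait object is convenient here, the following sketch hands one evaluator to several threads through Arc. `SketchEvaluator`, `NoDeletes`, and `evaluate_concurrently` are hypothetical names chosen for illustration; the crate's PolicyEvaluator trait has its own signature, which may differ.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical evaluator trait. `Send + Sync` lets the trait object cross
/// thread boundaries behind an `Arc`.
pub trait SketchEvaluator: Send + Sync {
    fn allows(&self, operation: &str) -> bool;
}

/// Example policy: deny any operation whose name starts with "delete".
pub struct NoDeletes;

impl SketchEvaluator for NoDeletes {
    fn allows(&self, operation: &str) -> bool {
        !operation.starts_with("delete")
    }
}

/// Evaluate each operation on its own thread, sharing a single evaluator.
/// Results are returned in the same order as `operations`.
pub fn evaluate_concurrently(
    evaluator: Arc<dyn SketchEvaluator>,
    operations: &[&str],
) -> Vec<bool> {
    let handles: Vec<_> = operations
        .iter()
        .map(|op| {
            // Cloning the Arc copies a pointer, not the evaluator itself.
            let evaluator = Arc::clone(&evaluator);
            let op = op.to_string();
            thread::spawn(move || evaluator.allows(&op))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

A Box would give each handler exclusive ownership; Arc lets every generated handler hold the same evaluator without copying its state.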
With #[derive(CodeMode)] (Recommended)
The derive macro eliminates ~80 lines of boilerplate per server and supports all four languages. See the pmcp-code-mode-derive README for the full derive guide.
GraphQL server (default):
use ;
use CodeMode;
use Arc;
JavaScript/OpenAPI server (Cost Coach, etc.):
SQL server:
All derive-generated servers share the same pattern: the language attribute selects the parser, the context_from method binds tokens to real user identity, and CodeExecutor handles your backend-specific execution.
Field name convention: The derive macro identifies required fields by fixed names. Missing any field produces a compile error listing all absent fields.
| Field Name | Type | Purpose |
|---|---|---|
| code_mode_config | CodeModeConfig | Validation pipeline config |
| token_secret | TokenSecret | HMAC signing secret |
| policy_evaluator | Arc<impl PolicyEvaluator> | Authorization backend |
| code_executor | Arc<impl CodeExecutor> | Your execution backend |
Implementing CodeExecutor
This is the only trait you need to implement. The executor holds its own configuration (timeouts, limits, etc.) — CodeExecutor::execute() is intentionally kept simple:
use ;
use Value;
For GraphQL and SQL servers, you implement CodeExecutor directly — your executor calls your database or GraphQL backend.
For JavaScript/OpenAPI, SDK, and MCP servers, use the standard adapters instead of implementing CodeExecutor manually.
Standard Adapters (JS/SDK/MCP)
These adapters bridge the low-level execution traits to CodeExecutor, eliminating ~75 lines of manual handler boilerplate per server. Each compiles JavaScript code via PlanCompiler, executes via PlanExecutor, and logs execution metadata automatically.
JsCodeExecutor<H> — JavaScript + HTTP calls (Pattern B). Requires js-runtime feature.
use ;
// Your HttpExecutor implementation (e.g., CostExplorerHttpExecutor)
let http = new;
let config = default
.with_blocked_fields;
let code_executor = new;
// Pass as code_executor field in your #[derive(CodeMode)] struct
SdkCodeExecutor<S> — JavaScript + SDK operations (Pattern C). Requires js-runtime feature.
use ;
let sdk = new;
let config = default;
let code_executor = new;
McpCodeExecutor<M> — JavaScript + MCP tool composition (Pattern D). Requires mcp-code-mode feature.
use ;
let mcp = new;
let config = default;
let code_executor = new;
All three adapters:
- Create a fresh PlanCompiler + PlanExecutor per call (cheap, because your HttpExecutor/SdkExecutor/McpExecutor holds Arc'd state)
- Forward variables into the execution plan as args (available in JS code as the args variable)
- Log api_calls count and execution_time_ms via tracing::debug!
End-to-End: Cost Coach with Derive Macro
Before (manual handlers, ~75 lines):
// ... implement ToolHandler for both, wire manually ...
After (derive macro + adapter, 8 lines):
let http = new;
let code_executor = new;
let server = new;
let builder = server.register_code_mode_tools?;
Key Types
| Type | What It Does |
|---|---|
| ValidationPipeline | Orchestrates: parse -> policy check -> security analysis -> explanation -> token |
| CodeModeConfig | Controls what's allowed: mutations, introspection, blocked fields, max depth, TTL |
| CodeLanguage | Enum of supported languages: GraphQL, JavaScript, Sql, Mcp |
| PolicyEvaluator | Trait for pluggable authorization (Cedar, AWS Verified Permissions, custom) |
| CodeExecutor | Trait for executing validated code against your backend |
| JsCodeExecutor<H> | Standard adapter: HttpExecutor -> CodeExecutor (JS+HTTP, js-runtime feature) |
| SdkCodeExecutor<S> | Standard adapter: SdkExecutor -> CodeExecutor (JS+SDK, js-runtime feature) |
| McpCodeExecutor<M> | Standard adapter: McpExecutor -> CodeExecutor (JS+MCP, mcp-code-mode feature) |
| ExecutionConfig | JS execution limits: max_api_calls, timeout_seconds, max_loop_iterations, blocked fields |
| TokenSecret | Zeroizing HMAC secret, backed by secrecy::SecretBox<[u8]>, no Debug/Clone/Serialize |
| HmacTokenGenerator | Creates HMAC-SHA256 tokens binding code hash + context to approval |
| TokenError | Error type for constructor failures (e.g. HMAC secret too short) |
| ApprovalToken | Signed token: code hash, user ID, session ID, expiry, risk level, context hash |
| NoopPolicyEvaluator | Test-only evaluator that allows everything; NOT for production |
| ValidationResponse | Handler-level response wrapping ValidationResult + auto-approval, action, code hash |
| CodeModeHandler | Server-side handler trait with tool builder, pre-handle hooks, soft-disable |
Configuration
CodeModeConfig controls the validation pipeline behavior:
let config = CodeModeConfig ;
Query and Mutation Authorization
The pipeline enforces config-level authorization checks before policy evaluation:
- Mutation control: allow_mutations (global toggle), blocked_mutations (blocklist), allowed_mutations (allowlist). If allowed_mutations is non-empty, only listed mutations pass.
- Query control: blocked_queries (blocklist), allowed_queries (allowlist). Same allowlist-takes-precedence semantics as mutations.
- Policy evaluation: After config checks pass, PolicyEvaluator::evaluate_operation() runs (if configured) for fine-grained authorization.
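The mutation rules above could be read as the following decision order. This is a minimal sketch under assumptions: `MutationRules` and `mutation_permitted` are hypothetical, and it assumes that a non-empty allowlist fully replaces the blocklist check, which is one plausible reading of "allowlist takes precedence".

```rust
use std::collections::HashSet;

/// Hypothetical subset of the mutation settings named above; the real
/// `CodeModeConfig` has more fields and may use different types.
#[derive(Debug, Default)]
pub struct MutationRules {
    pub allow_mutations: bool,
    pub blocked_mutations: HashSet<String>,
    pub allowed_mutations: HashSet<String>,
}

/// Decide whether a mutation passes the config-level checks.
pub fn mutation_permitted(rules: &MutationRules, name: &str) -> bool {
    // 1. The global toggle rejects every mutation when off.
    if !rules.allow_mutations {
        return false;
    }
    // 2. ASSUMPTION: a non-empty allowlist is authoritative, so only
    //    listed mutations pass and the blocklist is not consulted.
    if !rules.allowed_mutations.is_empty() {
        return rules.allowed_mutations.contains(name);
    }
    // 3. Otherwise anything not on the blocklist passes.
    !rules.blocked_mutations.contains(name)
}
```

The same shape would apply to blocked_queries and allowed_queries, minus the global toggle.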
Feature Flags
| Feature | Default | What It Adds |
|---|---|---|
| (none) | yes | GraphQL validation via graphql-parser |
| openapi-code-mode | no | JavaScript/OpenAPI validation via SWC parser |
| js-runtime | no | JavaScript AST-based execution in pure Rust (implies openapi-code-mode) |
| sql-code-mode | no | SQL query validation and parameterization |
| mcp-code-mode | no | MCP-to-MCP tool composition (implies js-runtime) |
| cedar | no | Local Cedar policy evaluation via cedar-policy 4.9 |
Dependency chain: mcp-code-mode -> js-runtime -> openapi-code-mode
Security Design
See SECURITY.md for the full threat model.
Token security:
- HMAC-SHA256 binds: code hash + user ID + session ID + server ID + context hash + risk level + expiry
- Token TTL default: 5 minutes
- Code canonicalization prevents whitespace-based bypass
- Any code modification after validation invalidates the token
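Two of these properties, canonicalization and expiry, are easy to picture in isolation. The sketch below is not the crate's canonicalizer: `canonicalize` and `is_expired` are hypothetical helpers, and collapsing whitespace blindly is an assumption that a syntax-aware implementation would refine (for example, to leave string literals untouched).

```rust
use std::time::{Duration, SystemTime};

/// Collapse every run of whitespace to a single space and trim the ends, so
/// that reformatting alone does not change the hashed text.
/// ASSUMPTION: this naive version also rewrites whitespace inside string
/// literals, which a real canonicalizer must not do.
pub fn canonicalize(code: &str) -> String {
    code.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// A token is expired once `now` reaches `issued_at + ttl`.
pub fn is_expired(issued_at: SystemTime, ttl: Duration, now: SystemTime) -> bool {
    now >= issued_at + ttl
}
```

With the documented 5-minute default, `ttl` would be `Duration::from_secs(300)`.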
Secret handling:
- TokenSecret backed by secrecy::SecretBox<[u8]>, zeroed on drop
- Explicitly does not implement: Debug, Display, Clone, Serialize, Deserialize, PartialEq
- Minimum 16-byte secret enforced at construction: HmacTokenGenerator::new returns Result<Self, TokenError> (no panic)
- Access only via expose_secret(), which is framework-internal and never needed by server code
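The length check at construction might look roughly like this. `SketchGenerator` and `SketchTokenError` are hypothetical stand-ins; the crate's HmacTokenGenerator and TokenError may carry different fields and variants, and the real secret is held in a SecretBox rather than a plain Vec.

```rust
/// Documented minimum secret length in bytes.
pub const MIN_SECRET_LEN: usize = 16;

/// Hypothetical error type; the crate's `TokenError` may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchTokenError {
    SecretTooShort { min: usize, actual: usize },
}

/// Hypothetical generator that only exists if its secret is long enough.
pub struct SketchGenerator {
    // ASSUMPTION: a plain Vec keeps the sketch dependency-free; it is not
    // zeroed on drop the way a SecretBox would be.
    secret: Vec<u8>,
}

impl SketchGenerator {
    /// Return an error instead of panicking when the secret is too short.
    pub fn new(secret: Vec<u8>) -> Result<Self, SketchTokenError> {
        if secret.len() < MIN_SECRET_LEN {
            return Err(SketchTokenError::SecretTooShort {
                min: MIN_SECRET_LEN,
                actual: secret.len(),
            });
        }
        Ok(Self { secret })
    }

    /// Length of the stored secret, exposed only so the sketch can be checked.
    pub fn secret_len(&self) -> usize {
        self.secret.len()
    }
}
```

Returning Result lets a server propagate the failure with `?` during startup instead of crashing inside a constructor.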
Policy evaluation:
- Default-deny: without a configured PolicyEvaluator, only basic config checks run
- Policy evaluator stored as Arc<dyn PolicyEvaluator>, shared safely across async handlers
- Cedar support via cedar feature flag (local evaluation, no network)
- NoopPolicyEvaluator for tests only, prominently documented with warnings
Schema Exposure Architecture
The three-layer schema model controls what the LLM sees:
Full Schema  ->  Exposure Policy  ->  Derived Schema  ->  LLM
                 (filter/redact)      (what the LLM sees)
- ExposureMode::Full: expose everything
- ExposureMode::ReadOnly: expose reads, hide mutations
- ExposureMode::Allowlist: only specified operations
- ExposureMode::Custom: per-operation overrides via ToolOverride
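The filtering step from full schema to derived schema can be sketched as below. `Operation`, `SketchExposure`, and `derive_schema` are hypothetical and cover only the first three modes; the Custom mode with ToolOverride is left out because its shape is not described here.

```rust
use std::collections::HashSet;

/// Hypothetical description of one schema operation.
#[derive(Debug, Clone, PartialEq)]
pub struct Operation {
    pub name: String,
    pub is_mutation: bool,
}

/// Hypothetical mirror of three of the documented exposure modes.
pub enum SketchExposure {
    Full,
    ReadOnly,
    Allowlist(HashSet<String>),
}

/// Apply the exposure policy to the full schema and return the names of the
/// operations the LLM would be shown, in their original order.
pub fn derive_schema(full: &[Operation], mode: &SketchExposure) -> Vec<String> {
    full.iter()
        .filter(|op| match mode {
            SketchExposure::Full => true,
            SketchExposure::ReadOnly => !op.is_mutation,
            SketchExposure::Allowlist(names) => names.contains(&op.name),
        })
        .map(|op| op.name.clone())
        .collect()
}
```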
Breaking Changes in v0.1.0
Constructors now return Result
All ValidationPipeline constructors and HmacTokenGenerator::new return Result instead of panicking on invalid input. This catches misconfiguration at startup.
// Before (v0.0.x):
let pipeline = new;
// After (v0.1.0):
let pipeline = new?;
Policy evaluator uses Arc (not Box)
with_policy_evaluator and set_policy_evaluator now accept Arc<dyn PolicyEvaluator> instead of Box<dyn PolicyEvaluator>. This enables shared ownership needed by the derive macro's generated handlers.
// Before:
pipeline.set_policy_evaluator;
// After:
pipeline.set_policy_evaluator;
language attribute selects validation path
#[code_mode(language = "...")] now dispatches to the correct language-specific validation method at compile time, not just tool metadata. Servers using JavaScript, SQL, or MCP can now use #[derive(CodeMode)] instead of manual handler structs.
See CHANGELOG.md for the full list of changes.
Known Limitations (v0.1.0)
- TokenSecret::new does not zeroize the source Vec. The bytes are copied into SecretBox but the original Vec is not zeroed. Use TokenSecret::from_env() in production for maximum security.
- GraphQL only in default features. JavaScript/OpenAPI validation requires the openapi-code-mode feature flag and pulls in SWC (~25MB compile artifact).
- No server-side token revocation. Tokens are stateless (verified by HMAC). Once issued, a token is valid until it expires. Short TTL (5 min default) mitigates this.
- JavaScript validation is sync only. validate_javascript_code is synchronous (no async variant). The derive macro handles this transparently: the generated async handler calls the sync method without .await.
- SQL and MCP validators are stubs. The validate_sql_query and validate_mcp_composition methods require their respective feature flags. These validators are being implemented; the derive macro dispatch is ready.
Crate Dependencies
Minimal in the default feature set:
- graphql-parser 0.4: GraphQL parsing (pure Rust, no proc macros)
- hmac 0.13 + sha2 0.11: HMAC-SHA256 token signing
- secrecy 0.10: Secret memory management
- zeroize 1.8: Memory zeroing on drop
- chrono 0.4: Token timestamps
- hex 0.4: Hash encoding
- base64 0.22: Token encoding
- serde + serde_json: Serialization
- thiserror: Error types
- async-trait: Async trait support
The cedar feature adds cedar-policy 4.9 (~3MB). The openapi-code-mode feature adds SWC.
Running the Example
This demonstrates the full validate -> approve -> execute round trip, including a rejection path for blocked mutations.
Feedback Welcome
This is a pre-1.0 API. Key areas where we'd like team input:
- Standard adapters: does JsCodeExecutor/SdkCodeExecutor/McpCodeExecutor cover your execution pattern, or do you need a different adapter shape?
- Variables forwarding: the adapters pass variables as the args variable in JS plans. Does your server need a different variable binding strategy?
- Derive macro ergonomics: are the fixed field names (code_mode_config, token_secret, etc.) workable, or do you need attribute-based field mapping?
- context_from pattern: does returning ValidationContext from a sync method work for your auth integration, or do you need an async version?
- SQL validation: what SQL dialects do you need? Parameterized queries, prepared statements, or raw SQL only?
- MCP composition: what should validate_mcp_composition check? Schema compatibility, tool existence, or structural validation?
- Policy evaluation: any use cases beyond Cedar and AVP?
File issues or discuss in the #pmcp-sdk channel.
License
MIT