pmcp-code-mode
Code Mode validation and execution framework for MCP servers built on the PMCP SDK.
Enables LLM-generated code (GraphQL, JavaScript, SQL, MCP compositions) to be validated, explained, and executed with HMAC-signed approval tokens that cryptographically bind code to its validation result.
Status: v0.3.0 — multi-language validation with policy enforcement, standard adapters, deploy-time config. The public API is stabilizing; feedback is welcome before the 1.0 contract is locked.
How It Works
┌──────────────┐
│ LLM Client │
└──────┬───────┘
│
1. describe_schema() <- schema exposed per exposure policy
│
2. LLM generates code (GraphQL, JS, SQL, MCP composition)
│
3. validate_code(code) ──────────────────────┐
│ │
┌─────────▼──────────┐ │
│ ValidationPipeline │ │
│ ┌───────────────┐ │ ┌────────────▼────────────┐
│ │ Parse │ │ │ PolicyEvaluator (Cedar, │
│ │ Security scan │ │────>│ AVP, or custom) │
│ │ Explain │ │ └─────────────────────────┘
│ │ HMAC sign │ │
│ └───────────────┘ │
└─────────┬──────────┘
│
approval_token (HMAC-SHA256 signed)
│
4. User reviews explanation, approves
│
5. execute_code(code, token) ────────────────┐
│ │
┌─────────▼──────────┐ ┌────────────▼──────┐
│ Token verification │ │ CodeExecutor impl │
│ (hash, expiry, sig)│────>│ (your backend) │
└────────────────────┘ └───────────────────┘
│
execution result (JSON)
The token ensures that the exact code the user approved is what gets executed — any modification after validation invalidates the token.
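The binding described above can be sketched with the standard library alone. This is a toy illustration, not the crate's implementation: the ToyToken type and the issue, verify, and toy_hash functions are hypothetical names, and std's DefaultHasher stands in for SHA-256 and HMAC-SHA256 (it is not cryptographically secure). It shows the three checks named in the diagram: expiry, code hash, and signature.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for an approval token (hypothetical fields, not the crate's ApprovalToken).
#[derive(Debug, Clone, PartialEq)]
pub struct ToyToken {
    pub code_hash: u64,
    pub expires_at: u64, // seconds on whatever clock the caller uses
    pub signature: u64,
}

/// NOT cryptographic: DefaultHasher is a placeholder for SHA-256 / HMAC-SHA256.
fn toy_hash(parts: &[&[u8]]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        part.hash(&mut hasher);
    }
    hasher.finish()
}

/// Keyed digest over the fields the token binds together.
fn sign(secret: &[u8], code_hash: u64, expires_at: u64) -> u64 {
    let ch = code_hash.to_le_bytes();
    let ex = expires_at.to_le_bytes();
    toy_hash(&[secret, &ch[..], &ex[..]])
}

/// Issue a token for `code` that expires `ttl_secs` after `now`.
pub fn issue(secret: &[u8], code: &str, now: u64, ttl_secs: u64) -> ToyToken {
    let code_hash = toy_hash(&[code.as_bytes()]);
    let expires_at = now + ttl_secs;
    let signature = sign(secret, code_hash, expires_at);
    ToyToken { code_hash, expires_at, signature }
}

/// Check expiry, then that the code is unchanged, then the signature.
pub fn verify(secret: &[u8], code: &str, token: &ToyToken, now: u64) -> Result<(), &'static str> {
    if now >= token.expires_at {
        return Err("expired");
    }
    if toy_hash(&[code.as_bytes()]) != token.code_hash {
        return Err("code mismatch");
    }
    if sign(secret, token.code_hash, token.expires_at) != token.signature {
        return Err("bad signature");
    }
    Ok(())
}
```

Calling verify with the original code before expiry returns Ok(()). Changing a single character of the code, or presenting the token after its expiry time, returns an error instead.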
Supported Languages
The language attribute on #[derive(CodeMode)] selects the validation path at compile time. Each language maps to a feature-gated validation method on ValidationPipeline:
| Language | Derive Attribute | Validation Method | Feature Required |
|---|---|---|---|
| GraphQL | "graphql" (default) | validate_graphql_query_async | (none) |
| JavaScript | "javascript" or "js" | validate_javascript_code | openapi-code-mode |
| SQL | "sql" | validate_sql_query | sql-code-mode |
| MCP | "mcp" | validate_mcp_composition | mcp-code-mode |
The CodeLanguage enum in pmcp_code_mode::types is the runtime representation of these values. Unknown language strings produce a compile error at macro expansion time.
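As a rough runtime mirror of the table above, the mapping from attribute string to language, validation method, and feature flag might look like the following. The Language enum and its methods are illustrative stand-ins written for this sketch; the crate's own CodeLanguage type and the macro's compile-time check may be shaped differently.

```rust
use std::str::FromStr;

/// Illustrative stand-in for the crate's CodeLanguage enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    GraphQL,
    JavaScript,
    Sql,
    Mcp,
}

impl FromStr for Language {
    type Err = String;

    /// Accepts the same strings as the derive attribute, including the "js" alias.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "graphql" => Ok(Language::GraphQL),
            "javascript" | "js" => Ok(Language::JavaScript),
            "sql" => Ok(Language::Sql),
            "mcp" => Ok(Language::Mcp),
            other => Err(format!("unknown language: {}", other)),
        }
    }
}

impl Language {
    /// Validation method name from the table above.
    pub fn validation_method(self) -> &'static str {
        match self {
            Language::GraphQL => "validate_graphql_query_async",
            Language::JavaScript => "validate_javascript_code",
            Language::Sql => "validate_sql_query",
            Language::Mcp => "validate_mcp_composition",
        }
    }

    /// Cargo feature that gates the validator, if any.
    pub fn required_feature(self) -> Option<&'static str> {
        match self {
            Language::GraphQL => None,
            Language::JavaScript => Some("openapi-code-mode"),
            Language::Sql => Some("sql-code-mode"),
            Language::Mcp => Some("mcp-code-mode"),
        }
    }
}
```

Here an unknown string yields an Err at runtime; in the derive macro the equivalent failure surfaces as a compile error.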
Quick Start
Minimal: Direct Pipeline Usage
All pipeline constructors return Result, so invalid configuration (such as an HMAC secret shorter than 16 bytes) is caught when the pipeline is built at startup rather than later while handling a request.
use ;
let config = enabled;
let secret = new;
let pipeline = from_token_secret?;
let ctx = new;
let result = pipeline.validate_graphql_query?;
assert!;
assert!; // HMAC-signed token
With Policy Evaluator
Wire a policy evaluator (Cedar, AWS Verified Permissions, or custom) into the pipeline for authorization checks between parsing and token signing:
use ;
use Arc;
let config = enabled;
let secret = new;
let evaluator = new; // Use a real evaluator in production
let pipeline = with_policy_evaluator?;
The policy evaluator is stored as Arc<dyn PolicyEvaluator>, enabling shared ownership across handlers and async tasks.
With #[derive(CodeMode)] (Recommended)
The derive macro eliminates ~80 lines of boilerplate per server and supports all four languages. See the pmcp-code-mode-derive README for the full derive guide.
GraphQL server (default):
use ;
use CodeMode;
use Arc;
JavaScript/OpenAPI server (Cost Coach, etc.):
SQL server:
All derive-generated servers share the same pattern: the language attribute selects the parser, the context_from method binds tokens to real user identity, and CodeExecutor handles your backend-specific execution.
Field name convention: The derive macro identifies required fields by fixed names. Missing any field produces a compile error listing all absent fields.
| Field Name | Type | Purpose |
|---|---|---|
| code_mode_config | CodeModeConfig | Validation pipeline config |
| token_secret | TokenSecret | HMAC signing secret |
| policy_evaluator | Arc<impl PolicyEvaluator> | Authorization backend |
| code_executor | Arc<impl CodeExecutor> | Your execution backend |
Implementing CodeExecutor
This is the only trait you need to implement. The executor holds its own configuration (timeouts, limits, etc.) — CodeExecutor::execute() is intentionally kept simple:
use ;
use Value;
For GraphQL and SQL servers, you implement CodeExecutor directly — your executor calls your database or GraphQL backend.
For JavaScript/OpenAPI, SDK, and MCP servers, use the standard adapters instead of implementing CodeExecutor manually.
Standard Adapters (JS/SDK/MCP)
These adapters bridge the low-level execution traits to CodeExecutor, eliminating ~75 lines of manual handler boilerplate per server. Each compiles JavaScript code via PlanCompiler, executes via PlanExecutor, and logs execution metadata automatically.
JsCodeExecutor<H> — JavaScript + HTTP calls (Pattern B). Requires js-runtime feature.
use ;
// Your HttpExecutor implementation (e.g., CostExplorerHttpExecutor)
let http = new;
let config = default
.with_blocked_fields;
let code_executor = new;
// Pass as code_executor field in your #[derive(CodeMode)] struct
SdkCodeExecutor<S> — JavaScript + SDK operations (Pattern C). Requires js-runtime feature.
use ;
let sdk = new;
let config = default;
let code_executor = new;
McpCodeExecutor<M> — JavaScript + MCP tool composition (Pattern D). Requires mcp-code-mode feature.
use ;
let mcp = new;
let config = default;
let code_executor = new;
All three adapters:
- Create a fresh PlanCompiler + PlanExecutor per call (cheap: your HttpExecutor/SdkExecutor/McpExecutor holds Arc'd state)
- Forward variables into the execution plan as args (available in JS code as the args variable)
- Log api_calls count and execution_time_ms via tracing::debug!
End-to-End: Cost Coach with Derive Macro
Before (manual handlers, ~75 lines):
// ... implement ToolHandler for both, wire manually ...
After (derive macro + adapter, 8 lines):
let http = new;
let code_executor = new;
let server = new;
let builder = server.register_code_mode_tools?;
Key Types
| Type | What It Does |
|---|---|
| ValidationPipeline | Orchestrates: parse -> policy check -> security analysis -> explanation -> token |
| CodeModeConfig | Controls what's allowed: mutations, introspection, blocked fields, max depth, TTL |
| CodeLanguage | Enum of supported languages: GraphQL, JavaScript, Sql, Mcp |
| PolicyEvaluator | Trait for pluggable authorization (Cedar, AWS Verified Permissions, custom) |
| CodeExecutor | Trait for executing validated code against your backend |
| JsCodeExecutor<H> | Standard adapter: HttpExecutor -> CodeExecutor (JS+HTTP, js-runtime feature) |
| SdkCodeExecutor<S> | Standard adapter: SdkExecutor -> CodeExecutor (JS+SDK, js-runtime feature) |
| McpCodeExecutor<M> | Standard adapter: McpExecutor -> CodeExecutor (JS+MCP, mcp-code-mode feature) |
| ExecutionConfig | JS execution limits: max_api_calls, timeout_seconds, max_loop_iterations, blocked fields |
| TokenSecret | Zeroizing HMAC secret, backed by secrecy::SecretBox<[u8]>; no Debug/Clone/Serialize |
| HmacTokenGenerator | Creates HMAC-SHA256 tokens binding code hash + context to approval |
| TokenError | Error type for constructor failures (e.g. HMAC secret too short) |
| ApprovalToken | Signed token: code hash, user ID, session ID, expiry, risk level, context hash |
| NoopPolicyEvaluator | Test-only evaluator that allows everything; NOT for production |
| ValidationResponse | Handler-level response wrapping ValidationResult + auto-approval, action, code hash |
| CodeModeHandler | Server-side handler trait with tool builder, pre-handle hooks, soft-disable |
Configuration
CodeModeConfig controls the validation pipeline behavior:
let config = CodeModeConfig ;
Query and Mutation Authorization
The pipeline enforces config-level authorization checks before policy evaluation:
- Mutation control: allow_mutations (global toggle), blocked_mutations (blocklist), allowed_mutations (allowlist). If allowed_mutations is non-empty, only listed mutations pass.
- Query control: blocked_queries (blocklist), allowed_queries (allowlist). Same allowlist-takes-precedence semantics as mutations.
- Policy evaluation: After config checks pass, PolicyEvaluator::evaluate_operation() runs (if configured) for fine-grained authorization.
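The mutation rules above can be read as a small decision function. The sketch below is one interpretation, using a hypothetical MutationRules struct in place of CodeModeConfig. It assumes that a non-empty allowlist is the only list consulted (so the blocklist is ignored in that case); the crate may combine the two lists differently.

```rust
use std::collections::HashSet;

/// Hypothetical subset of the config fields named above.
#[derive(Default)]
pub struct MutationRules {
    pub allow_mutations: bool,
    pub blocked_mutations: HashSet<String>,
    pub allowed_mutations: HashSet<String>,
}

/// Returns true if the named mutation passes the config-level checks.
pub fn mutation_permitted(rules: &MutationRules, name: &str) -> bool {
    // Global toggle: nothing passes when mutations are disabled.
    if !rules.allow_mutations {
        return false;
    }
    // Assumption: a non-empty allowlist takes precedence over the blocklist.
    if !rules.allowed_mutations.is_empty() {
        return rules.allowed_mutations.contains(name);
    }
    // Otherwise anything not explicitly blocked passes.
    !rules.blocked_mutations.contains(name)
}
```

The query lists would follow the same shape, minus the global toggle. Policy evaluation runs only after a check like this one has passed.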
Deployment Configuration (config.toml)
When deploying with cargo pmcp deploy, the server's config.toml is automatically included in the deploy ZIP. The pmcp.run platform extracts operation metadata from this file to populate the Code Mode policy page in the admin UI — administrators can then enable/disable individual operations by category.
config.toml Schema
The [[code_mode.operations]] section declares available operations. When present, it takes priority over [[tools]] for policy categorization.
OpenAPI server:
[]
= "cost-coach"
= "openapi-api"
[]
= false
= false
[[]]
= "getCostAndUsage"
= "Retrieve AWS cost and usage data"
= "/ce/GetCostAndUsage"
= "POST"
[[]]
= "getRecommendations"
= "Get cost optimization recommendations"
= "/ce/GetRightsizingRecommendation"
= "POST"
[[]]
= "deleteBudget"
= "Delete a budget"
= "/budgets/DeleteBudget"
= "POST"
= true
GraphQL server:
[]
= "open-images"
= "graphql-api"
[]
= false
[[]]
= "searchImages"
= "query"
= "Search the image catalog"
[[]]
= "createCollection"
= "mutation"
= "Create a new image collection"
[[]]
= "deleteImage"
= "mutation"
= "Permanently delete an image"
= true
SQL server:
[]
= "analytics"
= "sql"
[]
= true
= false
= ["audit_log", "credentials"]
[]
[[]]
= "orders"
= "Customer order history"
[[]]
= "products"
= "Product catalog"
MCP composition server:
[]
= "orchestrator"
= "mcp-api"
[[]]
= "analyze_costs"
= "Multi-step cost analysis workflow"
= "read"
[[]]
= "provision_resources"
= "Provision cloud resources"
= "admin"
Categorization Rules
Operations are automatically categorized based on server type:
| Server Type | Read | Write | Delete | Admin |
|---|---|---|---|---|
| OpenAPI | GET, HEAD, OPTIONS | POST, PUT, PATCH | DELETE | operation_category = "admin" |
| GraphQL | query | mutation | mutation with destructive_hint or delete/remove/destroy prefix | operation_category = "admin" |
| SQL | SELECT on non-blocked tables | INSERT, UPDATE (if allow_writes) | DELETE (if allow_deletes) | n/a |
| MCP-API | read_only_hint / default | create/add/update/set name patterns | delete/remove/destroy patterns | operation_category = "admin" |
The operation_category field overrides automatic categorization when set explicitly.
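The OpenAPI and GraphQL rows of the table, together with the explicit override, could be expressed roughly as follows. The Category enum and both functions are hypothetical names for illustration. Treating an unrecognized operation_category value as "not set" is an assumption of this sketch, not documented behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Read,
    Write,
    Delete,
    Admin,
}

const DESTRUCTIVE_PREFIXES: [&str; 3] = ["delete", "remove", "destroy"];

/// Parse an explicit operation_category value; unknown strings yield None.
fn parse_category(value: &str) -> Option<Category> {
    match value {
        "read" => Some(Category::Read),
        "write" => Some(Category::Write),
        "delete" => Some(Category::Delete),
        "admin" => Some(Category::Admin),
        _ => None,
    }
}

/// OpenAPI: categorize by HTTP method unless an explicit category is set.
pub fn categorize_openapi(method: &str, explicit: Option<&str>) -> Option<Category> {
    if let Some(category) = explicit.and_then(parse_category) {
        return Some(category);
    }
    match method.to_ascii_uppercase().as_str() {
        "GET" | "HEAD" | "OPTIONS" => Some(Category::Read),
        "POST" | "PUT" | "PATCH" => Some(Category::Write),
        "DELETE" => Some(Category::Delete),
        _ => None,
    }
}

/// GraphQL: queries are reads; mutations are writes unless hinted or named destructive.
pub fn categorize_graphql(
    kind: &str,
    name: &str,
    destructive_hint: bool,
    explicit: Option<&str>,
) -> Option<Category> {
    if let Some(category) = explicit.and_then(parse_category) {
        return Some(category);
    }
    match kind {
        "query" => Some(Category::Read),
        "mutation" => {
            let destructive_name = DESTRUCTIVE_PREFIXES.iter().any(|p| name.starts_with(*p));
            if destructive_hint || destructive_name {
                Some(Category::Delete)
            } else {
                Some(Category::Write)
            }
        }
        _ => None,
    }
}
```

With these rules, a mutation named deleteImage lands in Delete by its prefix alone, and any operation with operation_category = "admin" lands in Admin regardless of method or kind.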
Config File Resolution
cargo pmcp deploy finds the config file using this resolution order:
1. config.toml in the server crate root
2. A single .toml file in the instances/ directory

This is the same file the server embeds via include_str!() in main.rs.
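The resolution order can be sketched with std::fs. The resolve_config function below is a hypothetical helper, not part of cargo pmcp. It assumes that when instances/ holds zero or several .toml files, nothing is selected; the actual tool may instead report an error or prompt.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Find the deploy config: config.toml in the crate root first,
/// otherwise the only .toml file inside instances/.
pub fn resolve_config(crate_root: &Path) -> Option<PathBuf> {
    let primary = crate_root.join("config.toml");
    if primary.is_file() {
        return Some(primary);
    }

    // Collect every regular file in instances/ with a .toml extension.
    let entries = fs::read_dir(crate_root.join("instances")).ok()?;
    let mut candidates: Vec<PathBuf> = entries
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .filter(|path| {
            path.is_file() && path.extension().and_then(|ext| ext.to_str()) == Some("toml")
        })
        .collect();

    // Assumption: the fallback applies only when the choice is unambiguous.
    if candidates.len() == 1 {
        candidates.pop()
    } else {
        None
    }
}
```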
Feature Flags
| Feature | Default | What It Adds |
|---|---|---|
| (none) | yes | GraphQL validation via graphql-parser |
| openapi-code-mode | no | JavaScript/OpenAPI validation via SWC parser |
| js-runtime | no | JavaScript AST-based execution in pure Rust (implies openapi-code-mode) |
| sql-code-mode | no | SQL query validation and parameterization |
| mcp-code-mode | no | MCP-to-MCP tool composition (implies js-runtime) |
| cedar | no | Local Cedar policy evaluation via cedar-policy 4.9 |
Dependency chain: mcp-code-mode -> js-runtime -> openapi-code-mode
Security Design
See SECURITY.md for the full threat model.
Token security:
- HMAC-SHA256 binds: code hash + user ID + session ID + server ID + context hash + risk level + expiry
- Token TTL default: 5 minutes
- Code canonicalization prevents whitespace-based bypass
- Any code modification after validation invalidates the token
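One reading of the canonicalization bullet is that formatting-only differences should not change the code hash. A deliberately naive sketch of that idea follows. The canonicalize function is hypothetical, and it also collapses whitespace inside string literals, which a language-aware canonicalizer would need to preserve.

```rust
/// Collapse every run of whitespace to a single space and trim the ends.
/// Naive on purpose: a real canonicalizer would respect string literals and comments.
pub fn canonicalize(code: &str) -> String {
    code.split_whitespace().collect::<Vec<_>>().join(" ")
}
```

Hashing the canonical form rather than the raw text means re-indenting a query yields the same hash, while changing any token yields a different one.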
Secret handling:
- TokenSecret backed by secrecy::SecretBox<[u8]>, zeroed on drop
- Explicitly does not implement: Debug, Display, Clone, Serialize, Deserialize, PartialEq
- Minimum 16-byte secret enforced at construction: HmacTokenGenerator::new returns Result<Self, TokenError> (no panic)
- Access only via expose_secret(), which is framework-internal and never needed by server code
Policy evaluation:
- Default-deny: without a configured PolicyEvaluator, only basic config checks run
- Policy evaluator stored as Arc<dyn PolicyEvaluator>, shared safely across async handlers
- Both GraphQL and JavaScript validation call their respective policy evaluation methods (evaluate_operation / evaluate_script) and fail closed on policy errors
- Cedar support via cedar feature flag (local evaluation, no network)
- AVP (AWS Verified Permissions) support via external evaluator; policies configured in pmcp.run admin UI
- NoopPolicyEvaluator for tests only, prominently documented with warnings
Schema Exposure Architecture
The three-layer schema model controls what the LLM sees:
Full Schema -> Exposure Policy -> Derived Schema -> LLM
(filter/redact) (what the LLM sees)
- ExposureMode::Full: expose everything
- ExposureMode::ReadOnly: expose reads, hide mutations
- ExposureMode::Allowlist: only specified operations
- ExposureMode::Custom: per-operation overrides via ToolOverride
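To make the filtering step concrete, here is a small sketch of deriving the LLM-visible operation list from a full schema. Exposure, Operation, OpKind, and derive_schema are hypothetical names for this illustration. Modelling Custom as a name-to-visibility map where unlisted operations stay visible is an assumption; the crate's ToolOverride is likely richer.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpKind {
    Read,
    Mutation,
}

pub struct Operation {
    pub name: String,
    pub kind: OpKind,
}

/// Simplified stand-in for the four exposure modes listed above.
pub enum Exposure {
    Full,
    ReadOnly,
    Allowlist(HashSet<String>),
    /// Assumption: maps operation name to visibility; unlisted operations stay visible.
    Custom(HashMap<String, bool>),
}

/// Apply the exposure policy to the full schema and return the visible operation names.
pub fn derive_schema<'a>(full: &'a [Operation], policy: &Exposure) -> Vec<&'a str> {
    full.iter()
        .filter(|op| match policy {
            Exposure::Full => true,
            Exposure::ReadOnly => op.kind == OpKind::Read,
            Exposure::Allowlist(names) => names.contains(&op.name),
            Exposure::Custom(overrides) => overrides.get(&op.name).copied().unwrap_or(true),
        })
        .map(|op| op.name.as_str())
        .collect()
}
```

The returned list corresponds to the derived schema in the diagram: the only operations the LLM is told about.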
Breaking Changes in v0.1.0
Constructors now return Result
All ValidationPipeline constructors and HmacTokenGenerator::new return Result instead of panicking on invalid input. This catches misconfiguration at startup.
// Before (v0.0.x):
let pipeline = new;
// After (v0.1.0):
let pipeline = new?;
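The fallible-constructor pattern behind this change can be illustrated in isolation. SketchGenerator and SketchTokenError below are hypothetical stand-ins, not the crate's HmacTokenGenerator or TokenError; only the 16-byte minimum comes from this README.

```rust
/// Minimum secret length stated in this README.
pub const MIN_SECRET_LEN: usize = 16;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchTokenError {
    SecretTooShort { min: usize, actual: usize },
}

/// Hypothetical generator that refuses to exist with a weak secret.
pub struct SketchGenerator {
    secret: Vec<u8>,
}

impl SketchGenerator {
    /// Returns Err instead of panicking, so callers can surface the problem at startup.
    pub fn new(secret: Vec<u8>) -> Result<Self, SketchTokenError> {
        if secret.len() < MIN_SECRET_LEN {
            return Err(SketchTokenError::SecretTooShort {
                min: MIN_SECRET_LEN,
                actual: secret.len(),
            });
        }
        Ok(Self { secret })
    }

    /// Length of the stored secret (the bytes themselves are never exposed here).
    pub fn secret_len(&self) -> usize {
        self.secret.len()
    }
}
```

A caller would typically propagate the error with ? from its startup routine, so a short secret stops the server before it accepts any request.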
Policy evaluator uses Arc (not Box)
with_policy_evaluator and set_policy_evaluator now accept Arc<dyn PolicyEvaluator> instead of Box<dyn PolicyEvaluator>. This enables shared ownership needed by the derive macro's generated handlers.
// Before:
pipeline.set_policy_evaluator(Box::new(evaluator));
// After:
pipeline.set_policy_evaluator(Arc::new(evaluator));
language attribute selects validation path
#[code_mode(language = "...")] now dispatches to the correct language-specific validation method at compile time, not just tool metadata. Servers using JavaScript, SQL, or MCP can now use #[derive(CodeMode)] instead of manual handler structs.
Breaking Changes in v0.3.0
JavaScript derive macro now calls async validation with policy enforcement
#[derive(CodeMode)] with language = "javascript" now calls validate_javascript_code_async instead of the sync validate_javascript_code. This means:
- Cedar policies are now enforced for JavaScript servers using the derive macro
- AVP policies are now enforced when deployed with POLICY_STORE_ID on pmcp.run
- Policy evaluation failures are fail-closed (same as GraphQL): a policy backend outage blocks requests rather than silently allowing them
If your JavaScript server was relying on the absence of policy enforcement (e.g., using a custom PolicyEvaluator that only implemented evaluate_operation but not evaluate_script), the default evaluate_script implementation denies all scripts. Override evaluate_script in your evaluator to allow scripts, or use NoopPolicyEvaluator for testing.
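The migration hazard described above comes from a trait method with a deny-by-default body. A simplified, synchronous sketch of that shape follows. SketchPolicyEvaluator, Decision, and both evaluator structs are hypothetical; the real PolicyEvaluator trait is async and its signatures differ.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny(String),
}

/// Simplified stand-in for a policy trait with a deny-by-default script hook.
pub trait SketchPolicyEvaluator {
    fn evaluate_operation(&self, operation: &str) -> Decision;

    /// Default body denies every script, so evaluators written before
    /// script support existed fail closed instead of silently allowing.
    fn evaluate_script(&self, _script: &str) -> Decision {
        Decision::Deny("evaluate_script is not implemented".to_string())
    }
}

/// An evaluator that only knows about operations and inherits the default deny.
pub struct OperationOnly;

impl SketchPolicyEvaluator for OperationOnly {
    fn evaluate_operation(&self, _operation: &str) -> Decision {
        Decision::Allow
    }
}

/// An evaluator that overrides the script hook with its own (toy) rule.
pub struct ScriptAware;

impl SketchPolicyEvaluator for ScriptAware {
    fn evaluate_operation(&self, _operation: &str) -> Decision {
        Decision::Allow
    }

    fn evaluate_script(&self, script: &str) -> Decision {
        if script.contains("deleteBudget") {
            Decision::Deny("destructive call in script".to_string())
        } else {
            Decision::Allow
        }
    }
}
```

OperationOnly mirrors a pre-0.3.0 custom evaluator: it allows operations but now denies every script until evaluate_script is overridden, as ScriptAware does.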
Standard adapters added
JsCodeExecutor, SdkCodeExecutor, and McpCodeExecutor are new. They don't break existing code, but if you were manually implementing CodeExecutor for JS plan execution, you can now replace ~75 lines of boilerplate with:
let code_executor = new;
See CHANGELOG.md for the full list of changes.
Known Limitations (v0.1.0)
- TokenSecret::new does not zeroize the source Vec. The bytes are copied into SecretBox but the original Vec is not zeroed. Use TokenSecret::from_env() in production for maximum security.
- GraphQL only in default features. JavaScript/OpenAPI validation requires the openapi-code-mode feature flag and pulls in SWC (~25MB compile artifact).
- No server-side token revocation. Tokens are stateless (verified by HMAC). Once issued, a token is valid until it expires. Short TTL (5 min default) mitigates this.
- SQL and MCP validators are stubs. The validate_sql_query and validate_mcp_composition methods require their respective feature flags. These validators are still being implemented; the derive macro dispatch is ready.
Crate Dependencies
Minimal in the default feature set:
graphql-parser 0.4 — GraphQL parsing (pure Rust, no proc macros)
hmac 0.13 + sha2 0.11 — HMAC-SHA256 token signing
secrecy 0.10 — Secret memory management
zeroize 1.8 — Memory zeroing on drop
chrono 0.4 — Token timestamps
hex 0.4 — Hash encoding
base64 0.22 — Token encoding
serde + serde_json — Serialization
thiserror — Error types
async-trait — Async trait support
The cedar feature adds cedar-policy 4.9 (~3MB). The openapi-code-mode feature adds SWC.
Running the Example
This demonstrates the full validate -> approve -> execute round trip, including a rejection path for blocked mutations.
Feedback Welcome
This is a pre-1.0 API. Key areas where we'd like team input:
- Standard adapters: does JsCodeExecutor/SdkCodeExecutor/McpCodeExecutor cover your execution pattern, or do you need a different adapter shape?
- Variables forwarding: the adapters pass variables as the args variable in JS plans. Does your server need a different variable binding strategy?
- Derive macro ergonomics: are the fixed field names (code_mode_config, token_secret, etc.) workable, or do you need attribute-based field mapping?
- context_from pattern: does returning ValidationContext from a sync method work for your auth integration, or do you need an async version?
- SQL validation: what SQL dialects do you need? Parameterized queries, prepared statements, or raw SQL only?
- MCP composition: what should validate_mcp_composition check? Schema compatibility, tool existence, or structural validation?
- Policy evaluation: any use cases beyond Cedar and AVP?
File issues or discuss in the #pmcp-sdk channel.
License
MIT