execkit
Stateful, structured, safe command execution for AI agents - over local shells, SSH, and Docker.
Early
0.xrelease - not production-ready. See Limitations.
execkit gives an AI agent a persistent session on a machine - a local shell, an SSH host, or a Docker container - and returns a structured result for every command. Crucially, it treats the agent itself as untrusted: every command passes a policy fence, output is scrubbed of secrets, and flooding output is bounded. Use it as an embeddable Rust library or as an MCP server any agent can drive.
Why
Letting an autonomous agent run shell commands is useful but risky: built-in agent shells are local-only with no guardrails, managed sandboxes lock you in, and raw SSH is stateless-per-command with no notion of "is this command allowed?"
The agent is the adversary. The LLM driving execkit can be prompt-injected by anything it reads, so execkit contains its own caller: a command passes the policy fence before it runs, secrets are redacted before output returns, and a changed SSH host key fails loudly instead of reconnecting into a MITM.
flowchart LR
A([AI agent]) -->|command| F{policy fence}
F -->|blocked| X([rejected, never runs])
F -->|allowed| T[transport: local / SSH / Docker]
T --> O[raw output]
O --> R[redact secrets, bound output]
R --> E([structured ExecResult])
E -.-> A
Use it from an agent (MCP)
Install the server - no Rust toolchain needed:
# pip (the server binary ships as a wheel):
# ...or a prebuilt binary (Linux/macOS, x86_64 + arm64):
|
# ...or with cargo:
Point your MCP client at it (claude mcp add execkit -- execkit-mcp, or a config block):
The agent gets session_create (local, ssh, or docker) -> session_exec ->
session_destroy, plus session_checkpoint/session_restore for remote undo.
session_exec returns a structured ExecResult (split stdout/stderr, exit code,
cwd), already secret-redacted and bounded.
State persists across calls, and every result is parsed - not scraped from a terminal:
// session_exec {"command": "cd /app && npm ci"} -> { "exit_code": 0, "cwd": "/app" }
// session_exec {"command": "npm run build"} // cwd is still /app
// -> { "stderr": "Error: Cannot find module 'webpack'",
// "exit_code": 1, "duration_ms": 3420, "cwd": "/app", "truncated": false }
See crates/execkit-mcp/README.md for the operator
security settings (host-key verification, key dir, audit, session limits).
Use it as a library
[]
= "0.6" # local + SSH + Docker
# execkit = { version = "0.6", default-features = false } # local + Docker only (no SSH; no russh/tokio)
use ;
Runnable examples: cargo run --example local,
EXECKIT_SSH="user:password@host:22" cargo run --example ssh, and
EXECKIT_DOCKER=<container> cargo run --example docker.
Python
The same sessions from Python - pip install execkit (native bindings, no Rust
toolchain needed):
=
See crates/execkit-py/README.md.
What you get
- Persistent, stateful sessions -
cd/env/state persist across commands, over local PTY, SSH, or Docker. - Structured
ExecResult- split stdout/stderr, exit code, duration, cwd. - Safe by construction - advisory command policy, secret redaction, bounded (anti-flood) output, SSH host-key verification.
- One small API, every transport - the same
ExecResultregardless of transport. - Embeddable, never a service -
cargo add, in your process; no daemon, no vendor. - Undo for agent actions - on remote sessions, snapshot the workspace and
restore files if a command goes wrong (requires
giton the remote and an explicit workspace; files only, not side effects). - Output budgets - shape any command's output so huge logs do not blow the
agent's context:
tail/head/head+tailby line, agrepfilter with context, and a char cap. Per-call or a session default; the result reports what was kept.
Limitations
An early library - today:
- Not a sandbox. The command policy is an advisory tripwire (string-matching, bypassable). The load-bearing control is a least-privilege environment - run the agent and SSH user with minimal rights.
- A timed-out command poisons the session - you get a clear error and should create a new session.
- Unix-only. Local sessions need a POSIX shell (
bash); Windows is later. - Synchronous core - fine for typical agent use; not tuned for thousands of concurrent sessions.
- SSH
AcceptAnyhost-key mode exists for testing, behind an explicit insecure opt-in - never use it in production.
Found something rough? Open an issue.
Contributing & security
- Contributions: see
CONTRIBUTING.md. - Found a vulnerability? Follow
SECURITY.md- please don't open a public issue for security reports.
License
Apache-2.0 - embed it freely, including commercially. See LICENSE and
NOTICE.