Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
magikrun
Unified OCI Runtime for Containers, WebAssembly, and MicroVMs
magikrun is a pure Rust OCI-compliant container runtime abstraction layer that provides:
- OCI Runtime Spec compliance for single-container operations (
create→start→kill→delete) - Pod Runtime Interface (PRI) for atomic pod deployment across heterogeneous isolation backends
- Three isolation backends: Native containers (Linux), MicroVMs (KVM/HVF), WebAssembly (cross-platform)
- Security-first design with comprehensive input validation, size limits, and timeout enforcement
From CRI to PRI: Evolution of Container Interfaces
The CRI Problem
Kubernetes introduced the Container Runtime Interface (CRI) to decouple from Docker. CRI treats pods as composite structures built step-by-step:
CRI Workflow:
─────────────────────────────────────────────────────────────────────
RunPodSandbox() → Creates pause container + namespaces
CreateContainer(A) → Prepares container A (not running)
StartContainer(A) → Runs container A
CreateContainer(B) → Prepares container B
StartContainer(B) → Runs container B
...wait for user/health checks...
StopContainer(B) → Graceful stop
StopContainer(A) → Graceful stop
RemoveContainer(B) → Cleanup container B
RemoveContainer(A) → Cleanup container A
StopPodSandbox() → Stop pause container
RemovePodSandbox() → Delete namespace holder
─────────────────────────────────────────────────────────────────────
11+ API calls, intermediate failure states possible
Problems with CRI:
- Partial failure states: What if
StartContainer(B)fails afterAis running? - Complex rollback logic: Kubelet must track and undo partial progress
- Race conditions: Namespace sharing during step-by-step creation
- Overhead: Multiple RPC round-trips per pod
- Impedance mismatch: Pods are the scheduling unit, but CRI operates on containers
The PRI Solution
Pod Runtime Interface (PRI) treats pods as the atomic unit of deployment:
PRI Workflow:
─────────────────────────────────────────────────────────────────────
run_pod(spec) → RUNNING (all containers) or ERROR (nothing)
...pod lifecycle...
stop_pod(id, grace) → Graceful shutdown
delete_pod(id) → Complete cleanup
─────────────────────────────────────────────────────────────────────
3 operations, no intermediate states
PRI Properties:
- Atomic: Either all containers start or none do (automatic rollback)
- Stateless: No orphaned resources on failure
- Immutable: Replace pods, don't repair them
- Self-healing friendly: Failed pod = delete + reschedule (simple!)
Why Now?
Several factors make PRI the right approach today:
- MicroVMs: Hardware VM isolation (KVM, Hypervisor.framework) provides natural atomicity—VM boot is all-or-nothing
- WASM: WebAssembly runtimes create isolated stores atomically
- GitOps/Immutable Infrastructure: Pods are cattle, not pets—replace don't repair
- Edge/IoT: Simpler semantics reduce resource overhead
- Security: Fewer intermediate states = smaller attack surface
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ magikrun │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Pod Runtime Interface (PRI) │ │
│ │ run_pod(spec) → stop_pod(id) → delete_pod(id) │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ NativePod │ │ MicroVmPod │ │ WasmPod │ │ │
│ │ │ Runtime │ │ Runtime │ │ Runtime │ │ │
│ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ │
│ │ │ │ │ │ │
│ └──────────┼────────────────┼────────────────┼──────────────────┘ │
│ │ uses │ uses │ uses │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ OCI Runtime (Single Container) │ │
│ │ create(id, bundle) → start → kill → delete │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌───────────────────────┐ │ │
│ │ │NativeRuntime │ │ KrunRuntime │ │ WasmtimeRuntime │ │ │
│ │ │ (youki) │ │ (libkrun) │ │ (wasmtime) │ │ │
│ │ └──────────────┘ └──────────────┘ └───────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────────┐ │
│ │ Image & Bundle Building │ │
│ │ OCI Registry → Layers → Rootfs + config.json │ │
│ │ Path traversal │ Size limits │ TOCTOU-safe whiteouts │ │
│ │ Symlink depth │ Hardlink validation │ Null-byte rejection │ │
│ └───────────────────────────┼───────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────────────┐ │
│ │ Content-Addressed Storage │ │
│ │ SHA-256 verification │ Deduplication │ Atomic writes │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Infra Framework │ │
│ │ Infra + InfraExtension (workplane, mesh, service discovery) │ │
│ │ Runs inside infra-container (same code: native & MicroVM) │ │
│ └───────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Layer Relationships
| Layer | Uses | Purpose |
|---|---|---|
| PRI (pod module) | OCI Runtime | Atomic pod lifecycle with rollback |
| OCI Runtime (runtime module) | Bundle Builder, Storage | Single container lifecycle |
| Image & Bundle (image module) | Storage, Registry | Pull images, build OCI bundles |
| Storage | Filesystem | Content-addressed blob storage with SHA-256 verification |
| Infra (infra module) | — | Framework for infra-container binaries with extension model |
Module Structure
| Module | Description |
|---|---|
image |
CRI ImageService pattern: pull images, build bundles (OciRuntime, Wasm, MicroVm) |
runtime |
OCI RuntimeService: single-container lifecycle (create/start/kill/delete/state) |
pod |
PRI: Atomic pod lifecycle with PodRuntime trait and exec/logs support |
infra |
Infra-container framework with extension model for workplane, mesh, service discovery |
OCI Runtimes (Single Container)
| Runtime | Platform | Backend | Isolation Level | Use Case |
|---|---|---|---|---|
NativeRuntime |
Linux | youki (libcontainer/libcgroups) | Namespaces + cgroups v2 | Standard Linux containers |
KrunRuntime |
Linux/macOS | libkrun (krun-sys) | Hardware VM (KVM/HVF) | Untrusted workloads |
WasmtimeRuntime |
All | wasmtime + wasmtime-wasi | WASM sandbox + fuel limits | Portable plugins, edge |
These are the building blocks that PRI Pod Runtimes compose for atomic pod operations.
OCI Runtime Operations
| Operation | Input | Effect |
|---|---|---|
create |
container ID, bundle path | Sets up container without starting |
start |
container ID | Executes the container process |
state |
container ID | Returns current container state |
kill |
container ID, signal | Sends signal to container process |
delete |
container ID | Removes container resources |
Pod Runtimes (PRI)
| Runtime | Platform | Backend | Isolation | Atomicity |
|---|---|---|---|---|
NativePodRuntime |
Linux | youki + pasta | Namespaces + cgroups v2 | Emulated (rollback) |
MicroVmPodRuntime |
Linux/macOS | libkrun | Hardware VM (KVM/HVF) | Natural (VM boot) |
WasmPodRuntime |
All | wasmtime | WASM sandbox | Natural (store) |
Atomicity Explained
Native Pods (emulated atomicity):
1. Spawn pasta (creates network namespace, port forwarding)
2. pasta spawns infra-binary (holds namespaces, runs extensions)
3. Infra-binary writes PID file when ready
4. Start app containers (join infra namespaces)
5. If any step fails → rollback all previous steps
6. Return success only when ALL running
MicroVM Pods (natural atomicity):
1. Build composite rootfs with all container images
2. Bake pod spec into VM filesystem
3. Boot VM (vminit spawns containers)
4. VM either boots completely or fails → nothing to clean up
WASM Pods (natural atomicity):
1. Create wasmtime Store (isolated memory)
2. Load all modules
3. Link and instantiate
4. If any fails → store is dropped, no leaks
Bundle Formats
Different runtime backends require different bundle formats:
| Format | Contents | Runtime Backend | Use Case |
|---|---|---|---|
Bundle::OciRuntime |
rootfs/ + config.json |
NativeRuntime |
Standard Linux containers |
Bundle::Wasm |
module.wasm + WASI config |
WasmtimeRuntime |
WebAssembly modules |
Bundle::MicroVm |
rootfs/ + command/env |
KrunRuntime |
MicroVM guests |
The BundleBuilder automatically selects the appropriate format based on the target runtime.
MicroVM Networking
MicroVMs use passt internally for TCP/UDP/ICMP networking. Port mappings are configured via PodSpec or OciContainerConfig:
┌─────────────────────────────────────────────────────────────────┐
│ Host │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ MicroVmPodRuntime │───▶│ passt │ │
│ │ - Control channel │ │ - TCP forwarding │ │
│ │ - exec/logs │ │ - UDP forwarding │ │
│ └──────────┬──────────┘ │ - ICMP (ping) │ │
│ │ └──────────┬──────────┘ │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ MicroVM (libkrun) │ │
│ │ ┌────────────────────────────────────────────────────┐ │ │
│ │ │ vminit (PID 1) - Listens on TCP :1024 │ │ │
│ │ │ - Spawns containers from /containers/*/ │ │ │
│ │ │ - Control protocol: exec, logs, ping │ │ │
│ │ └────────────────────────────────────────────────────┘ │ │
│ │ virtio-net (eth0) - Transparent TCP/UDP/ICMP │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
The Infra-Container
Every native pod has an infra-container that:
- Holds Linux namespaces (network, IPC, UTS) for other containers to join
- Runs workplane extensions (service discovery, Raft consensus, mesh)
- Is spawned by
pastawhich also handles port forwarding
2-Process Architecture
Native pods use a simplified 2-process model:
┌─────────────────────────────────────────────────────────────────┐
│ Pod │
│ │
│ pasta (creates netns, port forwarding) │
│ └─► infra-binary (external, e.g., workplane) │
│ ├─ Infra instance (from magikrun) │
│ ├─ Extensions (from external crate) │
│ └─ holds namespaces for app containers │
│ ▲ ▲ ▲ │
│ │ (join) │ (join) │ (join) │
│ ┌────┴───┐ ┌────┴───┐ ┌────┴───┐ │
│ │ App A │ │ App B │ │ App C │ │
│ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────────────────────────────┘
Benefits:
- Correct dependency semantics (pasta parent → infra dies if pasta dies)
- Simpler process tree than traditional pause container approach
- Extensions run inside infra binary
Extension Model
Extensions implement InfraExtension and hook into container lifecycle:
use ;
Symmetric Design
The same infra-container code runs in both native and MicroVM modes:
Native Mode: MicroVM Mode:
───────────────────────── ─────────────────────────────────────
┌─────────────────────────┐ ┌─────────────────────────────────────┐
│ Pod │ │ VM │
│ ┌───────────────────┐ │ │ vminit (PID 1) │
│ │ infra-container │ │ │ │ spawns containers │
│ │ (workplane) │ │ │ │ reaps zombies │
│ └───────────────────┘ │ │ │ forwards signals │
│ ┌───────────────────┐ │ │ │ │
│ │ app-container │ │ │ ├─► infra-container (SAME CODE) │
│ └───────────────────┘ │ │ └─► app-container │
└─────────────────────────┘ └─────────────────────────────────────┘
Content-Addressed Storage
OCI layers are stored by cryptographic digest for deduplication and integrity:
~/.magikrun/blobs/
└── sha256/
├── ab/
│ ├── abcd1234... (blob content)
│ └── ab9f8e7d... (blob content)
└── cd/
└── cdef5678... (blob content)
Security properties:
- Digest verification: Content hash verified before storage (prevents cache poisoning)
- Atomic writes: Temp file + rename pattern prevents partial/corrupted blobs
- Path validation: Only hexadecimal characters allowed in digests
- Deduplication: Same layer shared across images = one copy on disk
Security Model
Isolation Hierarchy (Defense-in-Depth)
┌───────────────────────────────────────────────────┐
│ MicroVM │ ← Hardware VM boundary
│ ┌─────────────────────────────────────────────┐ │
│ │ Native Container │ │ ← Kernel namespace boundary
│ │ ┌───────────────────────────────────────┐ │ │
│ │ │ WASM │ │ │ ← Language sandbox boundary
│ │ └───────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────┘
| Runtime | Isolation Level | Attack Surface | Use Case |
|---|---|---|---|
MicroVmPodRuntime |
Hardware VM | Minimal (VMM only) | Untrusted workloads |
NativePodRuntime |
Kernel namespaces | Kernel syscalls | Multi-tenant pods |
WasmPodRuntime |
WASM sandbox | WASI only | Portable plugins |
Runtime Resource Limits
The NativeRuntime enforces resource limits at container creation time:
- Container limit pre-check: Rejects creation if
MAX_CONTAINERS(1,024) already exist - Duplicate ID rejection: Fails fast if container ID already exists
- Container ID validation: Validates ID length and character set before any state mutation
This ensures limit enforcement occurs before any resources are allocated, preventing race conditions and resource leaks.
Security Constants
| Constant | Value | Purpose |
|---|---|---|
MAX_IMAGE_REF_LEN |
512 bytes | Image reference length limit |
MAX_LAYER_SIZE |
512 MiB | Per-layer compressed size limit |
MAX_ROOTFS_SIZE |
4 GiB | Total extracted rootfs limit |
MAX_LAYERS |
128 | Maximum layers per image |
MAX_FILES_PER_LAYER |
100,000 | Inode exhaustion prevention |
MAX_MANIFEST_SIZE |
1 MiB | Manifest parsing limit |
MAX_CONFIG_SIZE |
1 MiB | Config blob parsing limit |
MAX_CONTAINERS |
1,024 | Runtime container limit |
MAX_CONTAINER_ID_LEN |
128 bytes | Container ID length limit |
MAX_WASM_MODULE_SIZE |
256 MiB | WASM module size limit |
MAX_WASM_MEMORY_PAGES |
65,536 | WASM memory limit (4 GiB) |
DEFAULT_WASM_FUEL |
1B ops | WASM instruction limit |
MAX_VM_MEMORY_MIB |
4,096 | MicroVM memory limit |
MAX_VCPUS |
8 | MicroVM vCPU limit |
DEFAULT_MEMORY_BYTES |
256 MiB | Default container memory |
MAX_MEMORY_BYTES |
8 GiB | Maximum container memory |
IMAGE_PULL_TIMEOUT |
300s | Network operation timeout |
CONTAINER_START_TIMEOUT |
60s | Container start timeout |
DEFAULT_GRACE_PERIOD |
30s | Shutdown grace period |
EXEC_TIMEOUT |
300s | Exec command timeout |
MAX_INFLIGHT_BLOBS |
256 | Concurrent download limit |
MAX_PODS |
1,024 | Maximum pods per runtime |
MAX_POD_ID_LEN |
64 bytes | Pod ID length limit |
MAX_CONTAINERS_PER_POD |
16 | Maximum containers per pod |
MAX_VOLUMES_PER_POD |
64 | Maximum volumes per pod |
MAX_ENV_VARS_PER_CONTAINER |
256 | Environment variable limit |
MAX_NAME_LEN |
253 | Pod/container name length limit |
MAX_LABELS_PER_POD |
64 | Pod label count limit |
MAX_ANNOTATIONS_PER_POD |
64 | Pod annotation count limit |
Layer Extraction Security
OCI layer extraction enforces 8 security checks:
| Check | Protection |
|---|---|
| Path traversal | Rejects .. components and absolute paths |
| Symlink depth | Max 40 levels with visited-path cycle detection |
| Hardlink validation | Same depth-tracking and path validation as symlinks |
| Null-byte rejection | Rejects paths containing \0 |
| Size limits | MAX_LAYER_SIZE, MAX_ROOTFS_SIZE, MAX_FILES_PER_LAYER |
| Whiteout handling | TOCTOU-safe using symlink_metadata() |
| File count limits | MAX_FILES_PER_LAYER per layer |
| Atomic extraction | Failures don't leave partial state |
Usage
PRI (Recommended)
use ;
use Duration;
async
OCI Runtime (Low-Level)
use ;
use ;
async
Binaries
| Binary | Location | Purpose |
|---|---|---|
magikrun |
Host | OCI runtime CLI, image management |
vminit |
VM /init |
Minimal init for MicroVMs (PID 1, Linux only) |
vminit
The vminit binary runs as PID 1 inside MicroVMs. It:
- Reads baked pod spec from
/pod/spec.json - Spawns containers using libcontainer
- Reaps zombie processes
- Forwards signals (SIGTERM, SIGINT) to containers
- Listens on TCP :1024 for control protocol (exec/logs/ping)
# Build static binary for VMs (requires vminit feature and Linux target)
Platform Support
| Capability | Linux | macOS | Windows | Detection Method |
|---|---|---|---|---|
| Native containers | ✅ | ❌ | ❌ | /proc/self/ns/pid exists |
| Cgroups v2 | ✅ | ❌ | ❌ | /sys/fs/cgroup/cgroup.controllers |
| Seccomp | ✅ | ❌ | ❌ | /proc/self/seccomp or sysctl |
| MicroVMs (KVM) | ✅ | ❌ | ❌ | /dev/kvm accessible |
| MicroVMs (HVF) | ❌ | ✅ | ❌ | kern.hv_support sysctl |
| WASM | ✅ | ✅ | ✅ | Always available (pure Rust) |
Platform Detection
Use Platform::detect() to discover available capabilities:
use ;
let platform = detect;
println!;
println!; // e.g., "linux/amd64"
if platform.supports_native_containers
if platform.has_hypervisor
// WASM is always available
assert!;
Dependencies
Core
tokio(1.x) - Async runtime with rt-multi-thread, fs, process, sync, timeasync-trait(0.1) - Async trait definitionsserde/serde_json/serde_yaml(0.9) - Serialization (YAML for K8s Pod manifests)thiserror(2.x) - Structured error handlingtracing(0.1) - Structured logging and observability
OCI
oci-spec(0.8) - OCI image manifest typesoci-distribution(0.11) - Registry client with digest verification
Runtimes
libcontainer/libcgroups(0.5) - youki container runtime, Linux only (RUSTSEC-2024-0437 fix)wasmtime/wasmtime-wasi(40) - WASM execution (RUSTSEC-2025-0118, RUSTSEC-2025-0046 fixes)krun-sys(1.10) - libkrun FFI bindings, Linux/macOS only
Security
sha2(0.10) /hex(0.4) - Content-addressed hashing with SHA-256 verificationflate2(1.x) /tar(0.4) - Bounded layer extraction with size limits
Utilities
uuid(1.x, v7) - UUIDv7 generation for container IDschrono(0.4) - Timestampsdirs(6.x) - Platform-appropriate path resolutionfutures(0.3) - Async utilitieslibc(0.2) - System interfaces
Feature Flags
| Feature | Description | Default |
|---|---|---|
testing |
Exposes internal APIs for security testing (layer extraction) | Off |
vminit |
Builds the vminit binary (Linux only, requires additional deps) | Off |
Testing
# Library tests only (fast)
# All tests including integration
# Bundle security tests (path traversal, symlinks, hardlinks, etc.)
# With verbose output
OCI Spec Compliance
| Spec | Version | Notes |
|---|---|---|
| OCI Runtime Spec | 1.0.2 | Full lifecycle support (create/start/kill/delete/state) |
| OCI Image Spec | 1.0.2 | Multi-platform manifests, gzip/zstd layers |
| OCI Distribution Spec | — | Via oci-distribution crate |
Error Handling
All operations return magikrun::Result<T> with structured error types:
| Category | Error Variants | Recovery Strategy |
|---|---|---|
| Container Lifecycle | ContainerNotFound, CreateFailed, InvalidState |
Retry or cleanup |
| Image/Registry | ImagePullFailed, PathTraversal, ImageTooLarge |
Check image ref, network |
| Bundle | BundleBuildFailed, InvalidBundle |
Rebuild from image |
| Runtime | RuntimeUnavailable, NotSupported |
Select different runtime |
| Storage | BlobNotFound, DigestMismatch |
Check disk space, re-pull |
| Timeout | Timeout |
Retry with backoff |
License
Apache-2.0. See LICENSE for details.