unified-agent-api-codex 0.3.5

Async wrapper around the Codex CLI for programmatic prompting

Codex Rust Wrapper

Async wrapper around the Codex CLI focused on the headless codex exec flow. The client shells out to the bundled or system Codex binary, mirrors stdout/stderr when asked, and keeps the parent process environment untouched.

  • crates.io package: unified-agent-api-codex
  • Rust library crate: codex

Capability + versioning release notes (Workstream F)

  • Capability probes now capture codex --version, codex features list (--json when available), and --help hints, storing results as CodexCapabilities snapshots with collected_at timestamps and BinaryFingerprint metadata keyed by canonical binary path.
  • Guard helpers (guard_output_schema, guard_add_dir, guard_mcp_login, guard_features_list) keep optional flags off when support is unknown; surface CapabilityGuard.notes to operators instead of passing flags blindly.
  • Cache controls: configure CapabilityCachePolicy::{PreferCache, Refresh, Bypass} via capability_cache_policy or bypass_capability_cache. Use Refresh for TTL/backoff windows or hot-swaps that reuse the same path; use Bypass when metadata is missing (FUSE/overlay filesystems) or when you need an isolated probe that skips cache reads/writes.
  • TTL/backoff helper: capability_cache_ttl_decision inspects collected_at and fingerprint presence to recommend Refresh vs Bypass for hot-swaps or metadata-missing paths (FUSE/overlay); start with a ~5 minute TTL and back off toward 10-15 minutes when metadata keeps failing.
  • Overrides + persistence: capability_snapshot / capability_overrides accept manual snapshots and feature/version hints; write_capabilities_snapshot, read_capabilities_snapshot, and capability_snapshot_matches_binary let hosts reuse snapshots across processes while avoiding stale data when fingerprints diverge.
  • Update advisories stay offline: supply CodexLatestReleases and call update_advisory_from_capabilities to prompt upgrades without this crate performing network I/O.
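A rough sketch of the guard flow, built from the names above (the `supported` field on CapabilityGuard and the exact signatures are assumptions, not verified API):

```rust
use codex::{guard_output_schema, CapabilityCachePolicy, CodexClient};

async fn exec_with_optional_schema(client: &CodexClient) {
    // Probe (or reuse) a CodexCapabilities snapshot.
    let caps = client
        .probe_capabilities_with_policy(CapabilityCachePolicy::PreferCache)
        .await;
    // Keep --output-schema off unless the guard confirms support.
    let guard = guard_output_schema(&caps); // assumed signature
    if guard.supported {
        // safe to pass --output-schema on this exec invocation
    } else {
        // surface the notes to operators instead of passing the flag blindly
        for note in &guard.notes {
            eprintln!("capability note: {note}");
        }
    }
}
```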

Snapshot reuse + cache policy quickstart

Run the snapshot example to see disk reuse with fingerprint checks plus cache policy guidance:

cargo run -p unified-agent-api-codex --example capability_snapshot -- ./codex ./codex-capabilities.json auto
  • The example loads a prior snapshot when the fingerprint matches, falls back to CapabilityCachePolicy::Refresh after a TTL or hot-swap, and drops to CapabilityCachePolicy::Bypass when metadata is missing (typical on some FUSE/overlay mounts) to avoid persisting snapshots that cannot be validated.
  • Refresh vs. Bypass: use Refresh to re-probe while still writing back to the cache (good for TTL/backoff windows or deployments that reuse the same path); use Bypass for one-off probes that should not read or write cache entries when metadata is unreliable.

See crates/codex/examples/capability_snapshot.rs for the full flow, including fingerprint validation and snapshot persistence helpers.
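The persistence helpers can be sketched like this; the exact signatures (path arguments, Result shapes) are assumptions based on the helper names above:

```rust
use codex::{
    capability_snapshot_matches_binary, read_capabilities_snapshot,
    write_capabilities_snapshot, CodexCapabilities,
};
use std::path::Path;

// Reuse a persisted snapshot only while its fingerprint still matches the
// binary on disk; otherwise the caller should re-probe.
fn load_if_fresh(cache: &Path, binary: &Path) -> Option<CodexCapabilities> {
    let snapshot = read_capabilities_snapshot(cache).ok()?;
    capability_snapshot_matches_binary(&snapshot, binary).then_some(snapshot)
}

// Best-effort write-back; skip this on mounts where metadata cannot be
// validated (the Bypass guidance above).
fn persist(cache: &Path, snapshot: &CodexCapabilities) {
    let _ = write_capabilities_snapshot(cache, snapshot);
}
```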

TTL/backoff helper

Use capability_cache_ttl_decision to decide whether to reuse a cached snapshot or force a probe with the right cache policy:

use codex::{
    capability_cache_entry, capability_cache_ttl_decision, CapabilityCachePolicy, CodexClient,
};
use std::{path::Path, time::{Duration, SystemTime}};

async fn refresh_capabilities(client: &CodexClient, binary: &Path) {
    let cached = capability_cache_entry(binary);
    let ttl = Duration::from_secs(300); // start with ~5 minutes for binaries with fingerprints
    let decision = capability_cache_ttl_decision(cached.as_ref(), ttl, SystemTime::now());

    let capabilities = if let Some(snapshot) = cached.filter(|_| !decision.should_probe) {
        snapshot
    } else {
        client.probe_capabilities_with_policy(decision.policy).await
    };

    if decision.policy == CapabilityCachePolicy::Bypass {
        // FUSE/overlay path; back off toward 10-15 minutes to avoid hammering probes.
    }

    let _ = capabilities; // reuse, refresh, or bypass based on the helper decision
}
  • Refresh covers hot-swaps that reuse the same binary path even when fingerprints look unchanged.
  • Bypass is returned when metadata is missing; avoid cache writes and increase the TTL/backoff window to reduce probe churn.

Binary and CODEX_HOME isolation

  • Point the wrapper at a bundled Codex binary via [CodexClientBuilder::binary]; if unset, it honors CODEX_BINARY or falls back to codex on PATH.
  • Apply an app-scoped home with [CodexClientBuilder::codex_home]. The resolved binary is mirrored into CODEX_BINARY, and the provided home is exported as CODEX_HOME for every spawn site (exec/login/status/logout). The parent environment is never mutated.
  • Use [CodexClientBuilder::create_home_dirs] to control whether CODEX_HOME, conversations/, and logs/ are created up front (defaults to true when a home is set). RUST_LOG defaults to error if you have not set it.
use codex::{CodexClient, CodexHomeLayout};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let binary = "/opt/myapp/bin/codex";
    let codex_home = "/var/lib/myapp/codex";

    // Discover (and optionally create) the CODEX_HOME layout.
    let layout = CodexHomeLayout::new(codex_home);
    layout.materialize(true)?;
    println!("Logs live at {}", layout.logs_dir().display());

    let client = CodexClient::builder()
        .binary(binary)
        .codex_home(codex_home)
        .create_home_dirs(true)
        .mirror_stdout(false)
        .quiet(true)
        .build();

    let reply = client.send_prompt("Health check").await?;
    println!("{reply}");
    Ok(())
}

CODEX_HOME layout helper

CodexHomeLayout documents where Codex stores state under an app-scoped home:

  • config.toml
  • auth.json
  • .credentials.json
  • history.jsonl
  • conversations/ for transcript JSONL files
  • logs/ for codex-*.log files

Call [CodexHomeLayout::materialize] to create the root, conversations/, and logs/ directories before spawning Codex.

Stream JSONL events

Use the streaming surface to consume codex exec --json output as it arrives. Disable stdout mirroring so you control the console, and set an idle timeout to fail fast if the CLI stalls.

use codex::{CodexClient, ExecStreamRequest, ThreadEvent};
use futures_util::StreamExt;
use std::{path::PathBuf, time::Duration};

# async fn demo() -> Result<(), Box<dyn std::error::Error>> {
let client = CodexClient::builder()
    .json(true)
    .quiet(true)
    .mirror_stdout(false)
    .json_event_log("logs/codex_events.log")
    .build();

let mut stream = client
    .stream_exec(ExecStreamRequest {
        prompt: "List repo files".into(),
        idle_timeout: Some(Duration::from_secs(30)),
        output_last_message: Some(PathBuf::from("last_message.txt")),
        output_schema: None,
        json_event_log: None, // override per request if desired
    })
    .await?;

while let Some(event) = stream.events.next().await {
    match event {
        Ok(ThreadEvent::ItemDelta(delta)) => println!("delta: {:?}", delta.delta),
        Ok(other) => println!("event: {other:?}"),
        Err(err) => {
            eprintln!("stream error: {err}");
            break;
        }
    }
}

let completion = stream.completion.await?;
println!("codex exited with {}", completion.status);
if let Some(path) = completion.last_message_path {
    println!("last message saved to {}", path.display());
}
# Ok(()) }

Log the raw JSON stream

Set json_event_log on the builder or per request to tee every raw JSONL line to disk before parsing:

  • The log is appended to (existing files are preserved) and flushed per line.
  • Parent directories are created automatically.
  • An empty string is ignored; set a real path or leave None to disable.
  • The per-request json_event_log overrides the builder default for that run.

Events still flow to your events stream even when teeing is enabled.

Apply or inspect task diffs

CodexClient::apply_task wraps codex apply <TASK_ID>, and CodexClient::cloud_diff_task wraps codex cloud diff <TASK_ID> when the binary supports it. CodexClient::apply and CodexClient::diff are convenience helpers that fill in <TASK_ID> from CODEX_TASK_ID when that variable is set.

All of these capture stdout/stderr and return the exit status via ApplyDiffArtifacts. They honor the builder flags you already use for streaming:

  • mirror_stdout controls whether stdout is echoed while still being captured.
  • quiet suppresses stderr mirroring (stderr is always returned in the artifacts).
  • RUST_LOG defaults to error for these subcommands when the environment is unset; set RUST_LOG=info (or higher) to inspect codex internals.
use codex::CodexClient;

# async fn demo() -> Result<(), Box<dyn std::error::Error>> {
let client = CodexClient::builder()
    .mirror_stdout(false) // silence stdout while capturing
    .quiet(true)          // silence stderr while capturing
    .build();

let apply = client.apply_task("t-123").await?;
println!("exit: {}", apply.status);
println!("stdout: {}", apply.stdout);
println!("stderr: {}", apply.stderr);
# Ok(()) }

RUST_LOG defaults

If RUST_LOG is unset, the wrapper injects RUST_LOG=error for spawned commands to silence verbose upstream tracing. Any existing RUST_LOG value is respected.

MCP + app-server helpers

  • codex::mcp offers typed clients for codex mcp-server --stdio and codex app-server --stdio, along with config managers for [mcp_servers] and [app_runtimes] plus launcher helpers when you want to spawn from saved config.
  • Use CodexClient::spawn_mcp_login_process (capability-guarded) when you need an interactive bearer token for HTTP transports before persisting it via McpConfigManager::login.
  • Examples: mcp_codex_flow (typed tools/call for codex + codex-reply with optional cancellation), mcp_codex_tool/mcp_codex_reply (raw tool calls with --sample payloads; use the session_id from session_configured as the conversationId, and note that codex-reply requires the session to remain active inside the same mcp-server process on 0.61.0), and app_server_turns/app_server_thread_turn (thread start/resume + optional interrupt). Pair these with feature_detection if the binary may be missing server endpoints.
  • MCP codex-reply does not rehydrate conversations from disk on 0.61.0; follow-up calls only work while the original mcp-server process is still running. For cross-process resumes, use codex exec resume (CLI) or the app-server thread/resume path instead.

Runtime definitions and env prep

  • [mcp_servers] and [app_runtimes] live in config.toml; McpConfigManager reads/writes them.
  • StdioServerConfig should be built with the Workstream A env prep (binary path, CODEX_HOME, base env, timeouts). Runtime entries layer env/timeout overrides on top of those defaults, and CODEX_HOME is injected when codex_home is set.
  • Resolution through the runtime/app APIs is read-only: stored config and metadata are not mutated.

MCP runtime API (read-only)

  • McpRuntimeApi::from_config(&manager, &defaults) loads launch-ready stdio configs or HTTP connectors from stored runtimes.
  • available returns McpRuntimeSummary entries (description/tags/tool hints + transport kind).
  • launcher, stdio_launcher, and http_connector hand back launchers/connectors without side effects; HTTP connectors resolve bearer tokens from env without overwriting existing Authorization headers.
  • prepare spawns stdio runtimes or hands back HTTP connectors with tool hints preserved; use ManagedStdioRuntime::stop to shut down processes (drop is best-effort kill).
  • Use McpRuntimeManager directly when you already have launchers and only need spawn/connector plumbing.
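A read-only listing plus one managed spawn might look like the following sketch; the module path, field shapes, and async/Result details are assumptions wrapped around the method names above:

```rust
use codex::mcp::{McpConfigManager, McpRuntimeApi, StdioServerConfig};

async fn run_named(manager: &McpConfigManager, defaults: &StdioServerConfig, name: &str) {
    // Loads launch-ready configs from stored runtimes without mutating them.
    let api = McpRuntimeApi::from_config(manager, defaults);
    for summary in api.available() {
        println!("runtime available: {summary:?}");
    }
    // prepare spawns a stdio runtime (or hands back an HTTP connector).
    if let Ok(runtime) = api.prepare(name).await {
        // ... exchange messages, then stop explicitly; drop is best-effort kill
        let _ = runtime.stop().await;
    }
}
```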

App runtime API (read-only)

  • AppRuntimeApi::from_config(&manager, &defaults) merges stored [app_runtimes] entries with defaults (binary/path/env/timeout) while keeping metadata/resume hints intact.
  • available lists stored runtimes and metadata; prepare/stdio_config return merged stdio configs without launching.
  • start launches an app-server and returns ManagedAppRuntime (metadata + merged env + CodexAppServer handle). Calls leave stored definitions untouched and preserve metadata for future starts.

Pooled app runtimes

  • AppRuntimePoolApi::from_config(&manager, &defaults) (or AppRuntimeApi::pool_api) wraps the pool that reuses running runtimes by name.
  • available lists stored entries; running lists active runtimes; start reuses an existing process if one is already running; stop/stop_all clean up without altering stored definitions or metadata/resume hints.
  • Pool handles still expose stdio configs via launcher/prepare so callers can inspect launch parameters without starting a process.
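The reuse semantics of the pool can be sketched as follows (module path and error handling are assumptions):

```rust
use codex::mcp::AppRuntimePoolApi;

// Two starts for the same name share one running app-server process;
// stop_all shuts them down without touching stored [app_runtimes] entries.
async fn warm_and_reuse(pool: &AppRuntimePoolApi, name: &str) {
    let first = pool.start(name).await.expect("launch app-server");
    let again = pool.start(name).await.expect("reuse running app-server");
    let _ = (first, again); // same underlying process
    pool.stop_all().await;
}
```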

Examples and tests

  • examples/mcp_codex_flow.rs: starts codex mcp-server, streams codex/event, supports $/cancelRequest and follow-up codex/codex-reply via tools/call; respects CODEX_BINARY/CODEX_HOME and does not touch stored [mcp_servers].
  • examples/app_server_turns.rs: starts/resumes codex app-server threads, streams items/task_complete, and can issue turn/interrupt after the first item; metadata/thread IDs come from server responses and are not persisted by the wrapper.
  • examples/responses_api_proxy.rs: launches codex responses-api-proxy with an API key piped on stdin; falls back to a stub --sample path when no OPENAI_API_KEY/CODEX_API_KEY is available and polls --server-info for {port,pid}.
  • examples/stdio_to_uds_live.rs: Unix-only live bridge that spins up a temp Unix socket listener, runs codex stdio-to-uds <socket>, sends ping, and prints the echoed pong.
  • cargo test -p unified-agent-api-codex exercises env merging and non-destructive behavior (runtime_api_*, app_runtime_*, app_runtime_pool_* cover listing/prepare/start/stop without writing config or altering metadata).
  • See crates/codex/EXAMPLES.md for one-to-one CLI parity examples, including bundled_binary_home to run Codex from an embedded binary with isolated state.

Integration notes

  • For a practical integration pattern in an async shell/orchestrator (Substrate), see docs/integrations/substrate.md.