unified-agent-api-codex 0.3.5

Async wrapper around the Codex CLI for programmatic prompting

Codex Rust Wrapper

Async wrapper around the Codex CLI focused on the headless codex exec flow. The client shells out to the bundled or system Codex binary, mirrors stdout/stderr when asked, and keeps the parent process environment untouched.

  • crates.io package: unified-agent-api-codex
  • Rust library crate: codex

Capability + versioning release notes (Workstream F)

  • Capability probes now capture codex --version, codex features list (--json when available), and --help hints, storing results as CodexCapabilities snapshots with collected_at timestamps and BinaryFingerprint metadata keyed by canonical binary path.
  • Guard helpers (guard_output_schema, guard_add_dir, guard_mcp_login, guard_features_list) keep optional flags off when support is unknown; surface CapabilityGuard.notes to operators instead of passing flags blindly.
  • Cache controls: configure CapabilityCachePolicy::{PreferCache, Refresh, Bypass} via capability_cache_policy or bypass_capability_cache. Use Refresh for TTL/backoff windows or hot-swaps that reuse the same path; use Bypass when metadata is missing (FUSE/overlay filesystems) or when you need an isolated probe that skips cache reads/writes.
  • TTL/backoff helper: capability_cache_ttl_decision inspects collected_at and fingerprint presence to recommend Refresh vs Bypass for hot-swaps or metadata-missing paths (FUSE/overlay); start with a ~5 minute TTL and back off toward 10-15 minutes when metadata keeps failing.
  • Overrides + persistence: capability_snapshot / capability_overrides accept manual snapshots and feature/version hints; write_capabilities_snapshot, read_capabilities_snapshot, and capability_snapshot_matches_binary let hosts reuse snapshots across processes while avoiding stale data when fingerprints diverge.
  • Update advisories stay offline: supply CodexLatestReleases and call update_advisory_from_capabilities to prompt upgrades without this crate performing network I/O.
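A rough sketch of the guard flow, built from the names above (the `supported` field on CapabilityGuard and the exact signatures are assumptions, not verified API):

```rust
use codex::{guard_output_schema, CapabilityCachePolicy, CodexClient};

async fn exec_with_optional_schema(client: &CodexClient) {
    // Probe (or reuse) a CodexCapabilities snapshot.
    let caps = client
        .probe_capabilities_with_policy(CapabilityCachePolicy::PreferCache)
        .await;
    // Keep --output-schema off unless the guard confirms support.
    let guard = guard_output_schema(&caps); // assumed signature
    if guard.supported {
        // safe to pass --output-schema on this exec invocation
    } else {
        // surface the notes to operators instead of passing the flag blindly
        for note in &guard.notes {
            eprintln!("capability note: {note}");
        }
    }
}
```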

Snapshot reuse + cache policy quickstart

Run the snapshot example to see disk reuse with fingerprint checks plus cache policy guidance:

cargo run -p unified-agent-api-codex --example capability_snapshot -- ./codex ./codex-capabilities.json auto
  • The example loads a prior snapshot when the fingerprint matches, falls back to CapabilityCachePolicy::Refresh after a TTL or hot-swap, and drops to CapabilityCachePolicy::Bypass when metadata is missing (typical on some FUSE/overlay mounts) to avoid persisting snapshots that cannot be validated.
  • Refresh vs. Bypass: use Refresh to re-probe while still writing back to the cache (good for TTL/backoff windows or deployments that reuse the same path); use Bypass for one-off probes that should not read or write cache entries when metadata is unreliable.

See crates/codex/examples/capability_snapshot.rs for the full flow, including fingerprint validation and snapshot persistence helpers.
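The persistence helpers can be sketched like this; the exact signatures (path arguments, Result shapes) are assumptions based on the helper names above:

```rust
use codex::{
    capability_snapshot_matches_binary, read_capabilities_snapshot,
    write_capabilities_snapshot, CodexCapabilities,
};
use std::path::Path;

// Reuse a persisted snapshot only while its fingerprint still matches the
// binary on disk; otherwise the caller should re-probe.
fn load_if_fresh(cache: &Path, binary: &Path) -> Option<CodexCapabilities> {
    let snapshot = read_capabilities_snapshot(cache).ok()?;
    capability_snapshot_matches_binary(&snapshot, binary).then_some(snapshot)
}

// Best-effort write-back; skip this on mounts where metadata cannot be
// validated (the Bypass guidance above).
fn persist(cache: &Path, snapshot: &CodexCapabilities) {
    let _ = write_capabilities_snapshot(cache, snapshot);
}
```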

TTL/backoff helper

Use capability_cache_ttl_decision to decide whether to reuse a cached snapshot or force a probe with the right cache policy:

use codex::{
    capability_cache_entry, capability_cache_ttl_decision, CapabilityCachePolicy, CodexClient,
};
use std::{path::Path, time::{Duration, SystemTime}};

async fn refresh_capabilities(client: &CodexClient, binary: &Path) {
    let cached = capability_cache_entry(binary);
    let ttl = Duration::from_secs(300); // start with ~5 minutes for binaries with fingerprints
    let decision = capability_cache_ttl_decision(cached.as_ref(), ttl, SystemTime::now());

    let capabilities = if let Some(snapshot) = cached.filter(|_| !decision.should_probe) {
        snapshot
    } else {
        client.probe_capabilities_with_policy(decision.policy).await
    };

    if decision.policy == CapabilityCachePolicy::Bypass {
        // FUSE/overlay path; back off toward 10-15 minutes to avoid hammering probes.
    }

    let _ = capabilities; // reuse, refresh, or bypass based on the helper decision
}
  • Refresh covers hot-swaps that reuse the same binary path even when fingerprints look unchanged.
  • Bypass is returned when metadata is missing; avoid cache writes and increase the TTL/backoff window to reduce probe churn.

Binary and CODEX_HOME isolation

  • Point the wrapper at a bundled Codex binary via [CodexClientBuilder::binary]; if unset, it honors CODEX_BINARY or falls back to codex on PATH.
  • Apply an app-scoped home with [CodexClientBuilder::codex_home]. The resolved binary is mirrored into CODEX_BINARY, and the provided home is exported as CODEX_HOME for every spawn site (exec/login/status/logout). The parent environment is never mutated.
  • Use [CodexClientBuilder::create_home_dirs] to control whether CODEX_HOME, conversations/, and logs/ are created up front (defaults to true when a home is set). RUST_LOG defaults to error if you have not set it.
use codex::{CodexClient, CodexHomeLayout};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let binary = "/opt/myapp/bin/codex";
    let codex_home = "/var/lib/myapp/codex";

    // Discover (and optionally create) the CODEX_HOME layout.
    let layout = CodexHomeLayout::new(codex_home);
    layout.materialize(true)?;
    println!("Logs live at {}", layout.logs_dir().display());

    let client = CodexClient::builder()
        .binary(binary)
        .codex_home(codex_home)
        .create_home_dirs(true)
        .mirror_stdout(false)
        .quiet(true)
        .build();

    let reply = client.send_prompt("Health check").await?;
    println!("{reply}");
    Ok(())
}

CODEX_HOME layout helper

CodexHomeLayout documents where Codex stores state under an app-scoped home:

  • config.toml
  • auth.json
  • .credentials.json
  • history.jsonl
  • conversations/ for transcript JSONL files
  • logs/ for codex-*.log files

Call [CodexHomeLayout::materialize] to create the root, conversations/, and logs/ directories before spawning Codex.

Stream JSONL events

Use the streaming surface to consume codex exec --json output as it arrives. Disable stdout mirroring so you control the console, and set an idle timeout to fail fast if the CLI stalls.

use codex::{CodexClient, ExecStreamRequest, ThreadEvent};
use futures_util::StreamExt;
use std::{path::PathBuf, time::Duration};

# async fn demo() -> Result<(), Box<dyn std::error::Error>> {
let client = CodexClient::builder()
    .json(true)
    .quiet(true)
    .mirror_stdout(false)
    .json_event_log("logs/codex_events.log")
    .build();

let mut stream = client
    .stream_exec(ExecStreamRequest {
        prompt: "List repo files".into(),
        idle_timeout: Some(Duration::from_secs(30)),
        output_last_message: Some(PathBuf::from("last_message.txt")),
        output_schema: None,
        json_event_log: None, // override per request if desired
    })
    .await?;

while let Some(event) = stream.events.next().await {
    match event {
        Ok(ThreadEvent::ItemDelta(delta)) => println!("delta: {:?}", delta.delta),
        Ok(other) => println!("event: {other:?}"),
        Err(err) => {
            eprintln!("stream error: {err}");
            break;
        }
    }
}

let completion = stream.completion.await?;
println!("codex exited with {}", completion.status);
if let Some(path) = completion.last_message_path {
    println!("last message saved to {}", path.display());
}
# Ok(()) }

Log the raw JSON stream

Set json_event_log on the builder or per request to tee every raw JSONL line to disk before parsing:

  • The log is appended to (existing files are preserved) and flushed per line.
  • Parent directories are created automatically.
  • An empty string is ignored; set a real path or leave None to disable.
  • The per-request json_event_log overrides the builder default for that run.

Events still flow to your events stream even when teeing is enabled.

Apply or inspect task diffs

CodexClient::apply_task wraps codex apply <TASK_ID>, and CodexClient::cloud_diff_task wraps codex cloud diff <TASK_ID> when the binary supports it. CodexClient::apply and CodexClient::diff are convenience helpers that fill in <TASK_ID> from CODEX_TASK_ID when that variable is set.

All of these capture stdout/stderr and return the exit status via ApplyDiffArtifacts. They honor the builder flags you already use for streaming:

  • mirror_stdout controls whether stdout is echoed while still being captured.
  • quiet suppresses stderr mirroring (stderr is always returned in the artifacts).
  • RUST_LOG defaults to error for these subcommands when the environment is unset; set RUST_LOG=info (or higher) to inspect codex internals.
use codex::CodexClient;

# async fn demo() -> Result<(), Box<dyn std::error::Error>> {
let client = CodexClient::builder()
    .mirror_stdout(false) // silence stdout while capturing
    .quiet(true)          // silence stderr while capturing
    .build();

let apply = client.apply_task("t-123").await?;
println!("exit: {}", apply.status);
println!("stdout: {}", apply.stdout);
println!("stderr: {}", apply.stderr);
# Ok(()) }

RUST_LOG defaults

If RUST_LOG is unset, the wrapper injects RUST_LOG=error for spawned commands to silence verbose upstream tracing. Any existing RUST_LOG value is respected.

MCP + app-server helpers

  • codex::mcp offers typed clients for codex mcp-server --stdio and codex app-server --stdio, along with config managers for [mcp_servers] and [app_runtimes] plus launcher helpers when you want to spawn from saved config.
  • Use CodexClient::spawn_mcp_login_process (capability-guarded) when you need an interactive bearer token for HTTP transports before persisting it via McpConfigManager::login.
  • Examples: mcp_codex_flow (typed tools/call for codex + codex-reply with optional cancellation), mcp_codex_tool/mcp_codex_reply (raw tool calls with --sample payloads; use the session_id from session_configured as the conversationId, and note that codex-reply requires the session to remain active inside the same mcp-server process on 0.61.0), and app_server_turns/app_server_thread_turn (thread start/resume + optional interrupt). Pair these with feature_detection if the binary may be missing server endpoints.
  • MCP codex-reply does not rehydrate conversations from disk on 0.61.0; follow-up calls only work while the original mcp-server process is still running. For cross-process resumes, use codex exec resume (CLI) or the app-server thread/resume path instead.

Runtime definitions and env prep

  • [mcp_servers] and [app_runtimes] live in config.toml; McpConfigManager reads/writes them.
  • StdioServerConfig should be built with the Workstream A env prep (binary path, CODEX_HOME, base env, timeouts). Runtime entries layer env/timeout overrides on top of those defaults, and CODEX_HOME is injected when codex_home is set.
  • Resolution through the runtime/app APIs is read-only: stored config and metadata are not mutated.

MCP runtime API (read-only)

  • McpRuntimeApi::from_config(&manager, &defaults) loads launch-ready stdio configs or HTTP connectors from stored runtimes.
  • available returns McpRuntimeSummary entries (description/tags/tool hints + transport kind).
  • launcher, stdio_launcher, and http_connector hand back launchers/connectors without side effects; HTTP connectors resolve bearer tokens from env without overwriting existing Authorization headers.
  • prepare spawns stdio runtimes or hands back HTTP connectors with tool hints preserved; use ManagedStdioRuntime::stop to shut down processes (drop is best-effort kill).
  • Use McpRuntimeManager directly when you already have launchers and only need spawn/connector plumbing.
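A read-only listing plus one managed spawn might look like the following sketch; the module path, field shapes, and async/Result details are assumptions wrapped around the method names above:

```rust
use codex::mcp::{McpConfigManager, McpRuntimeApi, StdioServerConfig};

async fn run_named(manager: &McpConfigManager, defaults: &StdioServerConfig, name: &str) {
    // Loads launch-ready configs from stored runtimes without mutating them.
    let api = McpRuntimeApi::from_config(manager, defaults);
    for summary in api.available() {
        println!("runtime available: {summary:?}");
    }
    // prepare spawns a stdio runtime (or hands back an HTTP connector).
    if let Ok(runtime) = api.prepare(name).await {
        // ... exchange messages, then stop explicitly; drop is best-effort kill
        let _ = runtime.stop().await;
    }
}
```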

App runtime API (read-only)

  • AppRuntimeApi::from_config(&manager, &defaults) merges stored [app_runtimes] entries with defaults (binary/path/env/timeout) while keeping metadata/resume hints intact.
  • available lists stored runtimes and metadata; prepare/stdio_config return merged stdio configs without launching.
  • start launches an app-server and returns ManagedAppRuntime (metadata + merged env + CodexAppServer handle). Calls leave stored definitions untouched and preserve metadata for future starts.

Pooled app runtimes

  • AppRuntimePoolApi::from_config(&manager, &defaults) (or AppRuntimeApi::pool_api) wraps the pool that reuses running runtimes by name.
  • available lists stored entries; running lists active runtimes; start reuses an existing process if one is already running; stop/stop_all clean up without altering stored definitions or metadata/resume hints.
  • Pool handles still expose stdio configs via launcher/prepare so callers can inspect launch parameters without starting a process.
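The reuse semantics of the pool can be sketched as follows (module path and error handling are assumptions):

```rust
use codex::mcp::AppRuntimePoolApi;

// Two starts for the same name share one running app-server process;
// stop_all shuts them down without touching stored [app_runtimes] entries.
async fn warm_and_reuse(pool: &AppRuntimePoolApi, name: &str) {
    let first = pool.start(name).await.expect("launch app-server");
    let again = pool.start(name).await.expect("reuse running app-server");
    let _ = (first, again); // same underlying process
    pool.stop_all().await;
}
```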

Examples and tests

  • examples/mcp_codex_flow.rs: starts codex mcp-server, streams codex/event, supports $/cancelRequest and follow-up codex/codex-reply via tools/call; respects CODEX_BINARY/CODEX_HOME and does not touch stored [mcp_servers].
  • examples/app_server_turns.rs: starts/resumes codex app-server threads, streams items/task_complete, and can issue turn/interrupt after the first item; metadata/thread IDs come from server responses and are not persisted by the wrapper.
  • examples/responses_api_proxy.rs: launches codex responses-api-proxy with an API key piped on stdin; falls back to a stub --sample path when no OPENAI_API_KEY/CODEX_API_KEY is available and polls --server-info for {port,pid}.
  • examples/stdio_to_uds_live.rs: Unix-only live bridge that spins up a temp Unix socket listener, runs codex stdio-to-uds <socket>, sends ping, and prints the echoed pong.
  • cargo test -p unified-agent-api-codex exercises env merging and non-destructive behavior (runtime_api_*, app_runtime_*, app_runtime_pool_* cover listing/prepare/start/stop without writing config or altering metadata).
  • See crates/codex/EXAMPLES.md for one-to-one CLI parity examples, including bundled_binary_home to run Codex from an embedded binary with isolated state.

Integration notes

  • For a practical integration pattern in an async shell/orchestrator (Substrate), see docs/integrations/substrate.md.