procpilot
Production-grade subprocess runner for Rust — typed errors, retry with backoff, timeout with pipe-draining, binary-safe output.
Built for CLI tools that need to spawn external processes and handle failure modes precisely. Not intended for shell scripting (see xshell for that).
Why not std::process::Command?
Command::output() returns an Output with a status field the caller must remember to check. Command::spawn() gives you a Child but no help with timeout, retry, or deadlock-safe pipe draining. Building a production CLI tool on Command alone means writing the same wrapping layer every time.
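To make the "same wrapping layer" concrete, here is a minimal sketch that uses only std::process::Command. The helper name run_checked and its String error type are hypothetical and chosen for this example; this is not procpilot's API, only the kind of code it is meant to replace.

```rust
use std::process::{Command, Output};

/// Hypothetical helper (not part of procpilot): run a program and turn both
/// spawn failures and non-zero exits into an `Err`.
pub fn run_checked(program: &str, args: &[&str]) -> Result<Output, String> {
    // `output()` returns Err when the process could not be started
    // (or its output could not be collected).
    let output = Command::new(program)
        .args(args)
        .output()
        .map_err(|e| format!("failed to run {program}: {e}"))?;

    // A process that started can still report failure. This is the check
    // that is easy to forget when calling `Command::output()` directly.
    if !output.status.success() {
        return Err(format!(
            "{program} exited with {}: {}",
            output.status,
            String::from_utf8_lossy(&output.stderr).trim()
        ));
    }
    Ok(output)
}
```

Even this small version has no timeout, no retry, and an untyped error, which is where a hand-rolled layer tends to grow.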
procpilot is that layer:
- Typed errors: RunError distinguishes Spawn (couldn't start: binary missing, fork failed), NonZeroExit (command ran and reported failure, with captured stdout/stderr), and Timeout (killed after exceeding its budget). Callers can match to handle each appropriately.
- Retry with exponential backoff: run_with_retry wraps any subprocess call with a user-supplied "is this error transient?" predicate.
- Timeout with pipe-draining: background threads drain stdout/stderr while waiting, so a chatty process cannot stall on a full pipe buffer and the timeout can still be enforced.
- Binary-safe output: stdout is Vec<u8> (faithful for image/binary content); stdout_lossy() gives a Cow<str> for text, zero-copy when the bytes are valid UTF-8.
- Environment variables: run_cmd_in_with_env handles the GIT_INDEX_FILE / SSH_AUTH_SOCK / etc. cases without dropping back to Command.
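The binary-safe output point can be sketched with the standard library alone. The CapturedOutput struct below is a hypothetical stand-in, not procpilot's actual output type; it only shows how a Vec<u8> field and a lossy text view can coexist.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a captured-output type (not procpilot's struct).
pub struct CapturedOutput {
    /// Raw bytes exactly as the child wrote them.
    pub stdout: Vec<u8>,
}

impl CapturedOutput {
    /// Text view of stdout. Borrows when the bytes are valid UTF-8; otherwise
    /// allocates a copy with U+FFFD in place of each invalid sequence.
    pub fn stdout_lossy(&self) -> Cow<'_, str> {
        String::from_utf8_lossy(&self.stdout)
    }
}
```

Keeping the raw bytes as the source of truth means binary content such as an image is never corrupted by a text conversion; the lossy view is computed only when a caller asks for it.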
Usage
[dependencies]
procpilot = "0.1"
Running commands
use procpilot::{run_cmd, run_cmd_in};
// Basic: run a command, get captured output
let output = run_cmd?;
assert_eq!;
// In a specific directory
let output = run_cmd_in?;
// Binary-safe output for image/binary content
let output = run_cmd_in?;
let image_bytes: Vec<u8> = output.stdout;
Handling "command ran and said no"
procpilot returns Result<RunOutput, RunError>. The three-variant enum lets callers distinguish infrastructure failure from command-reported failure from timeouts:
use procpilot::{run_cmd, RunError};
match run_cmd
RunError implements std::error::Error, so ? into anyhow::Result works when you don't care about the distinction.
Inspection methods on RunError:
- err.is_non_zero_exit() / err.is_spawn_failure() / err.is_timeout(): check the variant
- err.stderr(): captured stderr on NonZeroExit / Timeout, None on Spawn
- err.exit_status(): exit status on NonZeroExit, None on others
- err.program(): the program name that failed
RunError is marked #[non_exhaustive], so a match on it from outside the crate must include a wildcard arm. In return, variants added in future releases won't break your code.
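The shape of such an error type, and the wildcard arm a caller needs, can be sketched as follows. SketchError, its field names, and describe are hypothetical and exist only for this illustration; procpilot's real RunError may carry different fields (for example, an exit status rather than a raw code).

```rust
/// Hypothetical error type for illustration; procpilot's real `RunError`
/// may use different fields.
#[derive(Debug)]
#[non_exhaustive]
pub enum SketchError {
    Spawn { program: String },
    NonZeroExit { program: String, code: Option<i32>, stderr: String },
    Timeout { program: String, stderr: String },
}

impl SketchError {
    pub fn is_timeout(&self) -> bool {
        matches!(self, SketchError::Timeout { .. })
    }

    /// Captured stderr, available only when the process actually ran.
    pub fn stderr(&self) -> Option<&str> {
        match self {
            SketchError::NonZeroExit { stderr, .. }
            | SketchError::Timeout { stderr, .. } => Some(stderr.as_str()),
            SketchError::Spawn { .. } => None,
        }
    }
}

/// Caller-side handling: name the variants you care about and let the
/// wildcard arm cover `Spawn` plus any variant added later.
pub fn describe(err: &SketchError) -> String {
    match err {
        SketchError::NonZeroExit { program, code, .. } => {
            format!("{program} failed with code {code:?}")
        }
        SketchError::Timeout { program, .. } => format!("{program} timed out"),
        _ => String::from("could not run the command"),
    }
}
```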
Timeouts
For commands that might hang (network operations, unreachable remotes, user-supplied queries), use the timeout variant:
use std::time::Duration;
use procpilot::{run_cmd_in_with_timeout, RunError};
match run_cmd_in_with_timeout
Output collected before the kill is returned in the Timeout error variant.
Caveat on grandchildren: the kill signal reaches only the direct child. A shell wrapper like sh -c "slow-cmd" forks the target as a grandchild that survives the shell's kill. Use exec in the shell (sh -c "exec slow-cmd") or invoke the target directly.
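For readers curious how a timeout with pipe-draining can work, here is a minimal sketch built on the standard library only. The names run_with_timeout and TimedOutput, and the 10 ms polling interval, are assumptions made for this example; it illustrates the general technique and is not procpilot's implementation.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical result type for this sketch (not procpilot's API).
pub struct TimedOutput {
    pub timed_out: bool,
    pub stdout: Vec<u8>,
    pub stderr: Vec<u8>,
}

pub fn run_with_timeout(
    program: &str,
    args: &[&str],
    budget: Duration,
) -> io::Result<TimedOutput> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Drain both pipes on background threads so the child never blocks
    // writing to a full pipe buffer while we wait on it.
    let mut out_pipe = child.stdout.take().expect("stdout was piped");
    let mut err_pipe = child.stderr.take().expect("stderr was piped");
    let out_reader = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = out_pipe.read_to_end(&mut buf);
        buf
    });
    let err_reader = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = err_pipe.read_to_end(&mut buf);
        buf
    });

    // Poll for exit until the deadline, then kill and reap the direct child.
    let deadline = Instant::now() + budget;
    let timed_out = loop {
        if child.try_wait()?.is_some() {
            break false;
        }
        if Instant::now() >= deadline {
            let _ = child.kill(); // may fail if the child just exited
            child.wait()?;
            break true;
        }
        thread::sleep(Duration::from_millis(10));
    };

    // The readers finish once every write end of the pipes is closed.
    Ok(TimedOutput {
        timed_out,
        stdout: out_reader.join().unwrap_or_default(),
        stderr: err_reader.join().unwrap_or_default(),
    })
}
```

In this sketch, a grandchild that inherits the pipes and survives the kill would keep the reader threads waiting for end-of-file, which is one more reason to heed the caveat above.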
Retry on transient errors
use ;
// Retry when stderr looks like a lock-contention error
let output = run_with_retry?;
Uses exponential backoff (100ms, 200ms, 400ms) with up to 3 retries. The predicate is impl Fn(&RunError) -> bool so it can capture state.
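The backoff schedule and the role of the predicate can be sketched generically. The function names backoff_delay and retry_with_backoff and their parameters are hypothetical; this is an illustration of the technique under those assumptions, not the body of run_with_retry.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ...
pub fn backoff_delay(base: Duration, attempt: u32) -> Duration {
    base * 2u32.pow(attempt)
}

/// Hypothetical generic retry helper (not procpilot's `run_with_retry`).
/// Calls `op` once, then retries up to `max_retries` times while the
/// predicate classifies the error as transient.
pub fn retry_with_backoff<T, E>(
    max_retries: u32,
    base: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    is_transient: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Transient failure with retries left: sleep, then try again.
            Err(err) if attempt < max_retries && is_transient(&err) => {
                thread::sleep(backoff_delay(base, attempt));
                attempt += 1;
            }
            // Permanent failure, or retries exhausted: give the error back.
            Err(err) => return Err(err),
        }
    }
}
```

With a base of 100 ms and three retries, the delays come out as 100 ms, 200 ms, and 400 ms, matching the schedule described above.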
Environment variables
use procpilot::run_cmd_in_with_env;
// Run with a custom GIT_INDEX_FILE (detects unstaged renames)
let output = run_cmd_in_with_env?;
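For comparison, this is roughly what the same setup looks like when dropping back to std::process::Command. The helper name command_in_with_env and its parameter order are hypothetical; the sketch only builds the command so the configuration can be inspected before anything is spawned.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical builder (not procpilot's `run_cmd_in_with_env`): configure a
/// command with a working directory and extra environment variables.
pub fn command_in_with_env(
    dir: &Path,
    program: &str,
    args: &[&str],
    env: &[(&str, &str)],
) -> Command {
    let mut cmd = Command::new(program);
    cmd.current_dir(dir).args(args);
    // Each pair is added on top of the inherited environment.
    for (key, value) in env {
        cmd.env(key, value);
    }
    cmd
}
```

Calling output() on the returned Command runs it with captured stdout and stderr; the status check and error typing are still left to the caller.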
Inherited I/O
When the user should see output directly (e.g., running cargo test and letting test output stream):
use procpilot::run_cmd_inherited;
run_cmd_inherited?;
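In standard-library terms, inherited I/O means the child shares the parent's stdin, stdout, and stderr instead of writing into pipes. A minimal sketch, with the hypothetical name run_inherited (not procpilot's function):

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};

/// Hypothetical helper (not procpilot's `run_cmd_inherited`): run a program
/// with the parent's stdio and return only its exit status.
pub fn run_inherited(program: &str, args: &[&str]) -> io::Result<ExitStatus> {
    Command::new(program)
        .args(args)
        // Inheriting is already the default for `status()`; it is spelled
        // out here to make the intent visible.
        .stdin(Stdio::inherit())
        .stdout(Stdio::inherit())
        .stderr(Stdio::inherit())
        .status()
}
```

Nothing is captured in this mode, so there is no stdout or stderr to attach to an error; the exit status is the only signal.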
Binary availability
use ;
if binary_available
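One common way to check whether a binary is available is to scan the directories on PATH. The sketch below does that with the standard library; the name find_on_path is hypothetical, and procpilot's binary_available may use a different strategy (for example, trying to spawn the program).

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical lookup (not procpilot's `binary_available`): return the first
/// file named `name` found in a directory listed in PATH.
pub fn find_on_path(name: &str) -> Option<PathBuf> {
    let path_var = env::var_os("PATH")?;
    env::split_paths(&path_var)
        .map(|dir| dir.join(name))
        .find(|candidate| candidate.is_file())
}
```

This simplified version does not check the executable permission bit or Windows extensions such as .exe, so treat a hit as a hint rather than a guarantee that the program will start.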
Design decisions
- Vec<u8> stdout, String stderr. Stdout can be binary; stderr is conventionally text. Asymmetric by design.
- NonZeroExit is an error, not a normal return. Forces the caller to opt into handling command-reported failures via match, rather than accidentally ignoring a missed status check.
- Timeouts drain pipes in threads. A simple wait_timeout without draining will hang forever if the child fills the pipe buffer. This is a subtle production-grade correctness concern that a scripting library can skip.
- No trait abstraction. Vcs-style traits belong in consumer code where the specific needs are known. procpilot provides primitives.
- #[non_exhaustive] on RunError. New failure modes can be added without breaking callers.
License
Licensed under either Apache-2.0 or MIT at your option.