# AGENTS.md
This file applies to the entire repository.
## Project Summary
`prodex` is a single-binary Rust CLI that wraps `codex` and manages multiple isolated `CODEX_HOME` profiles.
The project currently lives mostly in one file:
- `src/main.rs`: CLI, profile management, quota logic, runtime proxy, tests
- `README.md`: full user-facing documentation
- `QUICKSTART.md`: shorter installation and usage guide
## Core Principles
When changing `prodex`, keep these invariants intact:
1. The runtime proxy should be as transport-transparent as possible.
- Let `codex` own reconnect, WebSocket fallback, and stream UX.
- Do not invent new stream semantics unless strictly necessary.
2. Auto-rotate must remain built in to the proxy.
- Profile/account selection is a `prodex` responsibility.
- Transport behavior should remain as close as possible to upstream Codex.
- Reliability improvements must not weaken affinity or allow mid-stream rotation.
3. Do not redefine upstream ChatGPT errors unless the proxy itself failed before any upstream response existed.
- Prefer pass-through for upstream HTTP status, body, and stream payloads.
4. Do not print anything to the terminal while the Codex TUI is running.
- Preflight output before launch is fine.
- Runtime notices must go to log files, not stdout/stderr.
5. Repository prose must stay in English.
6. Runtime hot paths must stay non-blocking as much as possible.
- Do not reintroduce disk I/O, broad file reads, or unbounded thread spawning into the request/stream hot path.
- Prefer async transport and bounded background work over ad hoc blocking behavior.
## Runtime Proxy Rules
The runtime proxy is the most sensitive part of the project.
### Required affinity behavior
These bindings must remain reliable:
- `previous_response_id -> profile`
- `x-codex-turn-state -> profile`
- `session_id -> profile` for session-scoped unary routes such as remote compact
If a request continues an existing chain, it should stay on the owning profile whenever possible.
### Rotation boundaries
Safe auto-rotate is allowed only before a request/stream is committed:
- before the first successful unary response is accepted
- before the first streaming response is committed
- before a quota-blocked or overload response is returned to Codex
Do not rotate mid-stream after model output has started.
For fresh requests without hard affinity, a single last-chance attempt on the current profile is acceptable when only local selection heuristics were exhausted.
That fallback must not override:
- `previous_response_id` ownership
- `x-codex-turn-state` ownership
- `session_id` ownership for an existing session-scoped route
- mid-stream no-rotate rules
### Transport transparency
Keep proxy behavior close to upstream Codex:
- WebSocket upstream sessions should be reused where appropriate.
- HTTP/SSE should stream as directly as possible.
- If upstream transport breaks, prefer letting Codex observe a natural transport failure.
### Reliability guardrails
The runtime proxy should remain conservative and durable under poor networks and many terminals:
- Keep long-lived request handling bounded; avoid unbounded `thread::spawn` patterns in acceptor paths.
- Treat transport failures separately from quota failures.
- Treat short-lived profile health as a separate signal from quota backoff and transport backoff.
- Treat short-lived profile health as endpoint-specific where possible, so `responses`, `/responses/compact`, and websocket transport can degrade independently for fresh selection.
- Fresh pre-commit selection may use a short-lived per-profile in-flight load signal to avoid creating hotspots.
- Fresh pre-commit selection may also enforce a short per-profile in-flight cap so new work fails fast instead of piling more pressure onto a busy account.
- Local proxy admission may also enforce short lane-aware caps so `responses`, `compact`, `websocket`, and other unary traffic do not starve each other.
- Lane-aware admission limits are for fresh local admission only; they must not override hard affinity for an existing continuation that already owns a profile.
- Lane-aware admission should prefer protecting the main `responses` lane from starvation by bursty `compact`, websocket, or other unary traffic.
- Temporary connect/read/stream transport failures may place a profile into short transport backoff.
- Temporary overload or repeated transport flakiness may add a short-lived profile health penalty that affects only new candidate selection.
- Endpoint-specific health penalties must not globally poison unrelated fresh routes unless there is a deliberate reason to do so.
- Do not treat a generic upstream `429 Too Many Requests` body as account-specific quota unless the upstream payload explicitly identifies a quota/rate-limit error code such as `insufficient_quota` or `rate_limit_exceeded`.
- If pre-commit selection fails before any upstream response exists, prefer a local `503 service_unavailable` over a synthetic `429 insufficient_quota`.
- Do not let transport backoff override hard affinity for an in-flight continuation that already owns a profile.
- Do not let temporary profile health penalties override hard affinity for an in-flight continuation that already owns a profile.
- Do not let temporary in-flight load heuristics override hard affinity for an in-flight continuation that already owns a profile.
- Do not let the per-profile in-flight hard cap override hard affinity for an in-flight continuation that already owns a profile.
- Keep pre-commit candidate selection bounded in both time and attempts so the proxy fails fast when the whole pool is unhealthy.
- Runtime state saves must not block request/stream commit paths.
- Cross-process state persistence should remain merge-safe for:
- `active_profile`
- `last_run_selected_at`
- `response_profile_bindings`
- `session_profile_bindings`
### Unary compact path
Remote compaction uses the unary endpoint:
- `/responses/compact`
This path should remain eligible for safe retry/rotate on temporary overload or quota exhaustion, while other unary errors should pass through unchanged.
When `session_id` is present and already bound to a profile, compact should prefer that owning profile before fresh unary selection.
For `429` on unary paths:
- only rotate when the upstream payload clearly signals quota exhaustion
- plain-text or generic `429` responses should pass through unchanged
## Headers and Metadata
Preserve upstream request metadata unless it is truly hop-by-hop or auth that must be replaced for the selected profile.
Important headers to preserve when present:
- `session_id`
- `x-openai-subagent`
- `x-codex-turn-state`
- `x-codex-turn-metadata`
- `x-codex-beta-features`
- request `User-Agent`
Headers that are intentionally replaced by the proxy for the selected profile:
- `Authorization`
- `ChatGPT-Account-Id`
Headers that may be skipped as transport-local:
- `Host`
- `Connection`
- `Content-Length`
- `Transfer-Encoding`
- `Upgrade`
- `sec-websocket-*`
## Quota UX
`prodex quota` is a Prodex-owned screen, not a Codex TUI path.
- By default, `prodex quota` should refresh continuously every 5 seconds.
- This default applies to both single-profile quota views and `prodex quota --all`.
- `prodex quota --raw` should remain one-shot.
- `prodex quota --once` is the explicit one-shot escape hatch for human-facing quota views.
- During a live quota refresh, the previous snapshot should stay visible until the next snapshot is fully ready to render.
- The live `prodex quota --all` view may truncate to the current terminal height, but it must preserve the existing sort order, show the top rows that fit, and surface how many profiles are hidden.
- When changing quota behavior, keep integration tests and docs aligned so snapshot-style tests use `--once`.
## Observability
Runtime proxy diagnostics are written to `/tmp`.
Useful files:
- `/tmp/prodex-runtime-latest.path`: pointer to the latest runtime log
- `/tmp/prodex-runtime-*.log`: per-run proxy logs
If a user reports a stall, inspect the latest runtime log before changing behavior blindly.
Look for:
- `runtime_proxy_queue_overloaded`
- `runtime_proxy_active_limit_reached`
- `runtime_proxy_lane_limit_reached`
- `runtime_proxy_overload_backoff`
- `profile_inflight_saturated`
- `upstream_connect_*`
- `first_upstream_chunk`
- `first_local_chunk`
- `stream_read_error`
- `profile_retry_backoff`
- `profile_transport_backoff`
- `profile_inflight`
- `profile_health`
- `precommit_budget_exhausted`
- `state_save_*`
If `profile_health` appears, inspect its `route=` value before changing selection behavior globally.
If `runtime_proxy_lane_limit_reached` appears, inspect its `lane=` value before changing upstream-facing behavior.
Repeated `lane=responses` markers suggest the main model lane is saturated locally; repeated non-`responses` markers suggest a side lane is consuming proxy capacity.
If `runtime_proxy_active_limit_reached` or `profile_inflight_saturated` appears repeatedly without matching transport or quota markers, suspect local concurrency pressure before changing upstream-facing behavior.
## Key Commands
Format:
```bash
cargo fmt
```
Run the focused runtime proxy tests:
```bash
cargo test -q runtime_proxy_ -- --test-threads=1
```
Run the full test suite:
```bash
cargo test -q -- --test-threads=1
```
Summarize the latest runtime log:
```bash
prodex doctor --runtime
```
Show quota as a one-shot snapshot:
```bash
prodex quota --all --once
```
Reinstall the local binary after runtime changes:
```bash
cargo install --path . --force
```
If you changed dependencies or release metadata, refresh the lockfile before publishing:
```bash
cargo update
```
## Editing Guidance
- Prefer narrow, behavior-preserving changes in `src/main.rs`.
- Add regression tests for every runtime proxy bug fix.
- When touching runtime persistence, add or update tests for multi-process-safe merge behavior.
- When touching transport recovery, add or update tests for both quota backoff and transport backoff behavior.
- When touching runtime candidate selection, add or update tests for:
- hard affinity preservation
- transport backoff handling
- temporary profile health handling
- bounded pre-commit retry/selection behavior
- When touching proxy logic, compare behavior against upstream Codex in:
- `codex-rs/core/src/client.rs`
- `codex-rs/core/src/compact_remote.rs`
- `codex-rs/codex-api/src/sse/responses.rs`
- `codex-rs/codex-api/src/endpoint/responses_websocket.rs`
## Release Notes
This project has been released frequently.
If asked to publish:
1. bump `Cargo.toml`
2. update `Cargo.lock`
3. run tests
4. run `cargo publish --dry-run`
5. run `cargo publish`
If asked to commit, use a conventional commit message.