codex-asr
Unofficial Rust CLI/library for Codex Desktop's one-shot dictation ASR endpoint.
It reuses a local Codex ChatGPT login from $CODEX_HOME/auth.json or
~/.codex/auth.json, then uploads an audio file to:
https://chatgpt.com/backend-api/transcribe
This is not an official OpenAI API surface. Treat it as a local automation tool for an already-signed-in Codex Desktop environment.
Safety
codex-asr reuses a personal Codex/ChatGPT login token. Do not expose it, or a
REST server backed by it, to the public internet.
- Bind REST servers to loopback by default (127.0.0.1).
- Use --api-key or CODEX_ASR_SERVER_KEY for the local REST wrapper.
- Treat the REST API key as local access control only; it is not an upstream OpenAI or ChatGPT token.
- Avoid --no-api-key unless the server is reachable only by trusted local processes.
- This endpoint is reverse-engineered from Codex Desktop behavior and may change without notice.
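The loopback rule can be enforced before a listener is ever opened. The following is a minimal sketch using only the Rust standard library; `require_loopback` and its error strings are hypothetical names for illustration, not part of codex-asr.

```rust
use std::net::SocketAddr;

/// Returns the parsed address only if it is on a loopback IP
/// (127.0.0.0/8 or ::1). Hypothetical helper, not a codex-asr API.
pub fn require_loopback(addr: &str) -> Result<SocketAddr, String> {
    // SocketAddr parsing accepts literal IPs only; hostnames such as
    // "localhost" are rejected here rather than resolved.
    let parsed: SocketAddr = addr
        .parse()
        .map_err(|e| format!("invalid bind address {addr:?}: {e}"))?;
    if parsed.ip().is_loopback() {
        Ok(parsed)
    } else {
        Err(format!("refusing to bind non-loopback address {parsed}"))
    }
}
```

Calling `require_loopback("127.0.0.1:8080")` succeeds, while `require_loopback("0.0.0.0:8080")` returns an error, so a wrapper would have to opt in deliberately to a wider bind.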
Install From Source
When published:
GitHub Release installers are generated with cargo-dist:
PowerShell:
powershell -ExecutionPolicy ByPass -c "irm https://github.com/wangnov/codex-asr/releases/download/v0.1.0/codex-asr-installer.ps1 | iex"
Library consumers that do not need the local REST server can avoid the Axum/Tokio server dependencies:
codex-asr = { version = "0.1", default-features = false }
CLI
Auth defaults to local Codex auth. For the smallest explicit input surface, pass only a bearer token:
CODEX_ASR_BEARER=""
ChatGPT-Account-Id is decoded from the bearer token when possible. Override it
only if your token does not contain that claim:
.silk and .slk inputs are decoded with an external rust-silk CLI before
upload. The decoder is resolved in this order:
1. --silk-decoder <path>
2. CODEX_ASR_SILK_DECODER
3. rust-silk on PATH
4. $HOME/rust-silk/target/release/rust-silk
5. $HOME/rust-silk/target/debug/rust-silk
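The lookup order above can be sketched as a pure function. This is an illustration only: `resolve_silk_decoder` and its parameters are hypothetical, and it assumes that a candidate which does not exist is skipped rather than reported as an error, which the real CLI may handle differently. The existence check is passed in as a closure so the order can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Walks the documented lookup order and returns the first candidate for
/// which `exists` returns true. Hypothetical helper, not a codex-asr API.
pub fn resolve_silk_decoder(
    flag: Option<&str>,      // value of --silk-decoder, if given
    env_value: Option<&str>, // value of CODEX_ASR_SILK_DECODER, if set
    path_dirs: &[PathBuf],   // directories from PATH, already split
    home: Option<&Path>,     // value of $HOME, if set
    exists: &dyn Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let mut candidates: Vec<PathBuf> = Vec::new();
    // 1 and 2: explicit flag, then environment variable.
    candidates.extend(flag.map(PathBuf::from));
    candidates.extend(env_value.map(PathBuf::from));
    // 3: a rust-silk binary in any PATH directory, in PATH order.
    candidates.extend(path_dirs.iter().map(|dir| dir.join("rust-silk")));
    // 4 and 5: release build, then debug build, under $HOME/rust-silk.
    if let Some(home) = home {
        candidates.push(home.join("rust-silk/target/release/rust-silk"));
        candidates.push(home.join("rust-silk/target/debug/rust-silk"));
    }
    candidates.into_iter().find(|p| exists(p.as_path()))
}
```

In real use the closure would typically be `&|p| p.is_file()`, and `path_dirs` would come from `std::env::split_paths` applied to the PATH variable.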
The default SILK decode sample rate is 24000 Hz. Override it with:
The backend is sensitive to multipart filenames. If an input path has no useful
audio extension but you pass a known --content-type, codex-asr sends a
synthetic filename such as raw-audio.wav. You can override it explicitly:
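Separately from any explicit override, the synthetic-filename fallback described above could look roughly like the sketch below. The `raw-audio` stem comes from the example in this README; the function name, the exact extension chosen for each content type (for instance `m4a` for audio/mp4), and the handling of parameters such as `; codecs=opus` are assumptions rather than documented codex-asr behavior.

```rust
/// Maps a known audio content type to a synthetic multipart filename.
/// Returns None for content types this sketch does not recognise.
/// Hypothetical helper, not a codex-asr API.
pub fn synthetic_filename(content_type: &str) -> Option<String> {
    // Drop parameters such as "; codecs=opus" and normalise case.
    let essence = content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    // Extensions are assumed; only the content types are from the README.
    let ext = match essence.as_str() {
        "audio/wav" => "wav",
        "audio/mpeg" => "mp3",
        "audio/mp4" => "m4a",
        "audio/flac" => "flac",
        "audio/ogg" => "ogg",
        "audio/webm" => "webm",
        _ => return None,
    };
    Some(format!("raw-audio.{ext}"))
}
```

Returning `None` for unknown types leaves the caller free to keep the original filename instead of guessing an extension the backend might reject.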
REST Server
codex-asr can also serve a small OpenAI Whisper-compatible REST surface:
Then call it with an OpenAI-style multipart request:
Implemented routes:
| Route | Notes |
|---|---|
| GET /healthz | no auth required |
| POST /v1/audio/transcriptions | OpenAI-style route |
| POST /audio/transcriptions | short alias |
Implemented multipart fields:
| Field | Handling |
|---|---|
| file | required |
| model | accepted for SDK compatibility, ignored |
| language | forwarded to Codex /transcribe |
| response_format | supports json, text, verbose_json |
| prompt, temperature, timestamp_granularities | accepted and ignored |
srt and vtt response formats return HTTP 400 because the Codex endpoint does
not provide timestamps. REST auth defaults to CODEX_ASR_SERVER_KEY or
--api-key; use --no-api-key only on trusted loopback.
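The response_format rules above can be summarised as a small validation step. This is a sketch under stated assumptions, not the server's actual handler: the type and function names are hypothetical, defaulting to json when the field is absent mirrors the OpenAI API rather than anything this README states, and rejecting unknown values with HTTP 400 is likewise assumed.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ResponseFormat {
    Json,
    Text,
    VerboseJson,
}

/// Returns the format to use, or an HTTP status code plus message when the
/// request should be rejected. Hypothetical helper, not a codex-asr API.
pub fn parse_response_format(raw: Option<&str>) -> Result<ResponseFormat, (u16, String)> {
    match raw.map(str::trim) {
        // Assumption: a missing or empty field falls back to json.
        None | Some("") | Some("json") => Ok(ResponseFormat::Json),
        Some("text") => Ok(ResponseFormat::Text),
        Some("verbose_json") => Ok(ResponseFormat::VerboseJson),
        // Subtitle formats need timestamps, which the upstream endpoint lacks.
        Some(f @ ("srt" | "vtt")) => Err((
            400,
            format!("response_format {f:?} requires timestamps, which are not available"),
        )),
        // Assumption: anything else is also a client error.
        Some(other) => Err((400, format!("unsupported response_format {other:?}"))),
    }
}
```

Rejecting srt and vtt up front avoids uploading audio upstream only to fail when the response cannot be rendered.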
OpenAI SDK examples live in examples/:
CODEX_ASR_SERVER_KEY=local_dev_key \
CODEX_ASR_SERVER_KEY=local_dev_key \
The Python example disables environment proxy discovery for the local SDK client
because some systems route localhost traffic through a proxy unless trust_env
is disabled.
Library
use ;
let client = from_codex_home?;
let result = client.transcribe_file?;
println!;
# Ok::
Audio Formats
The Codex Desktop endpoint appears to inspect the actual audio container, not only the multipart content type.
These formats were tested successfully when uploaded directly:
| Container / codec | Suggested content type |
|---|---|
| WAV PCM | audio/wav |
| MP3 | audio/mpeg |
| M4A or MP4 AAC | audio/mp4 |
| FLAC | audio/flac |
| Ogg Opus | audio/ogg |
| WebM Opus | audio/webm |
Files with no recognizable audio extension should be uploaded with a known
--content-type; the CLI will add a matching multipart filename extension.
These formats are supported by the codex-asr CLI through local preprocessing:
| Input | Handling |
|---|---|
| SILK v3 (#!SILK_V3) | decoded to temporary WAV with rust-silk |
| WeChat/Tencent SILK (0x02 + #!SILK_V3) | decoded to temporary WAV with rust-silk |
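The two SILK variants in the table differ only by a leading 0x02 byte, so they can be told apart from the first ten bytes of a file. The sketch below shows one way to do that; it is illustrative only, since this README says decoding is triggered for .silk and .slk inputs and does not state that codex-asr sniffs headers. `SilkKind` and `detect_silk` are hypothetical names.

```rust
/// Magic bytes that open a SILK v3 stream.
const SILK_MAGIC: &[u8] = b"#!SILK_V3";

#[derive(Debug, PartialEq, Eq)]
pub enum SilkKind {
    /// Header starts directly with "#!SILK_V3".
    Standard,
    /// WeChat/Tencent variant: a 0x02 byte, then "#!SILK_V3".
    Tencent,
}

/// Classifies the leading bytes of a file as one of the SILK variants.
/// Hypothetical helper, not a codex-asr API.
pub fn detect_silk(header: &[u8]) -> Option<SilkKind> {
    if header.starts_with(SILK_MAGIC) {
        Some(SilkKind::Standard)
    } else if header.first() == Some(&0x02) && header[1..].starts_with(SILK_MAGIC) {
        Some(SilkKind::Tencent)
    } else {
        None
    }
}
```

A caller would read the first ten bytes of the input and route any `Some(_)` result through the rust-silk decoder before upload.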
These were rejected by the endpoint during local testing when uploaded directly:
| Format | Result |
|---|---|
| ADTS AAC (.aac) | HTTP 500, ASR API error |
| AIFF | HTTP 500, ASR API error |
| CAF AAC | HTTP 500, ASR API error |
| Raw PCM stream | HTTP 500, ASR API error |
| SILK v3 (#!SILK_V3) | HTTP 500, ASR API error |
| WeChat/Tencent SILK (0x02 + #!SILK_V3) | HTTP 500, ASR API error |
Endpoint Notes
Local probes against the Codex Desktop endpoint showed these practical edges:
- Empty files return HTTP 500 with "Error in ASR API".
- One second of silence returns HTTP 200 with an empty transcript.
- Very short non-silent clips can return unstable text; avoid treating tiny snippets as reliable.
- A missing or misleading multipart filename extension can make an otherwise valid audio file fail. codex-asr compensates when --content-type is known.
- Parallel batches up to 96 short WAV uploads succeeded locally without 429 or 5xx responses, but this is not a public API contract. Keep any REST wrapper concurrency bounded, especially for longer audio.
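One way to keep concurrency bounded, as the last point recommends, is a fixed pool of workers pulling from a shared queue. The sketch below uses plain standard-library scoped threads as a stand-in for whatever async machinery a real wrapper would use; `run_bounded` is a hypothetical helper, and the `job` closure is where an upload call would go.

```rust
use std::sync::Mutex;
use std::thread;

/// Runs `job` over every item using at most `limit` worker threads and
/// returns the results in input order. Hypothetical helper.
pub fn run_bounded<T, R, F>(items: Vec<T>, limit: usize, job: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let total = items.len();
    // Shared work queue; each entry remembers its original position.
    let queue = Mutex::new(items.into_iter().enumerate());
    let results: Mutex<Vec<Option<R>>> = Mutex::new((0..total).map(|_| None).collect());
    // Never spawn more workers than items, and always at least one.
    let workers = limit.max(1).min(total.max(1));

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // Hold the queue lock only long enough to take one item.
                let next = queue.lock().unwrap().next();
                let Some((index, item)) = next else { break };
                let value = job(item);
                results.lock().unwrap()[index] = Some(value);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every slot is filled once the scope ends"))
        .collect()
}
```

Because at most `limit` jobs run at once, a batch of 96 files with `limit` set to 8 never has more than 8 uploads in flight, whatever the batch size.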
Shape
The default shape is a CLI, a library crate, and a local REST wrapper. The REST wrapper is kept local-first because this tool handles a user's bearer token.