opencode-voice
Voice control for OpenCode AI coding assistant
[Apache 2.0] [Rust >=1.70]
Overview
opencode-voice is a standalone CLI tool that captures microphone audio, transcribes it locally using whisper-rs (bindings to whisper.cpp), and injects the resulting text into OpenCode via its HTTP API.
Everything runs on your machine — no cloud services, no API keys, no audio leaves your device.
Microphone → cpal → PCM → whisper-rs → text → OpenCode HTTP API → prompt
Key features
- Local transcription — whisper.cpp runs entirely offline, no cloud, no API keys
- Push-to-talk — hold a key to record, release to transcribe and inject
- Global hotkeys — optional system-wide hotkey support via rdev (no terminal focus required)
- Approval mode — voice-driven permission and question handling for OpenCode
- Compact terminal UI — live status display with recording level meter and last transcript preview
- Configurable — model size, audio device, toggle key, and more
Prerequisites
Rust toolchain (1.70+)
Install from https://rustup.rs:
|
cmake (required to build whisper-rs)
# macOS
# Linux (Debian/Ubuntu)
# Linux (Fedora)
C compiler
- macOS: Xcode Command Line Tools —
xcode-select --install - Linux:
gccorclang(usually pre-installed)
OpenCode with HTTP server enabled
OpenCode does not expose an HTTP server by default. You must start it with the --port flag:
Build
The binary is produced at target/release/opencode-voice.
Install
This installs opencode-voice to ~/.cargo/bin/ (which should be on your $PATH after installing Rust).
Setup (First Run)
Download a transcription model:
This downloads the GGML model file (default: base.en, ~150 MB) to the platform data directory:
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/opencode-voice/ |
| Linux | ~/.local/share/opencode-voice/ |
Model options
| Model | Size | Speed | Accuracy |
|---|---|---|---|
tiny.en |
~75 MB | Fastest | Basic |
base.en |
~150 MB | Balanced | Good (default) |
small.en |
~500 MB | Slower | Best |
To set up with a specific model:
Usage
Step 1 — Start OpenCode with the HTTP server enabled:
Step 2 — In a separate terminal, start opencode-voice:
Push-to-talk (default)
Hold the toggle key to record, release to transcribe and send:
| Key | Action |
|---|---|
[space] (hold) |
Start recording |
[space] (release) |
Stop recording and transcribe |
q or Ctrl+C |
Quit |
Toggle mode
With --no-push-to-talk, press to start recording, press again to stop:
| Key | Action |
|---|---|
[space] |
Start recording |
[space] |
Stop recording and transcribe |
q or Ctrl+C |
Quit |
Subcommands
opencode-voice --port <PORT> Start voice input mode
opencode-voice setup [--model <MODEL>] Download whisper model
opencode-voice devices List available audio input devices
opencode-voice keys List available key names for --key / --hotkey
opencode-voice --help Show help
opencode-voice --version Show version
Configuration
All options can be set via CLI flags or environment variables. CLI flags take precedence.
| Flag | Env Var | Default | Description |
|---|---|---|---|
--port <n> |
OPENCODE_VOICE_PORT |
(required) | OpenCode server port |
--model <size> |
OPENCODE_VOICE_MODEL |
base.en |
Whisper model size |
--device <name> |
OPENCODE_VOICE_DEVICE |
(system default) | Audio input device name |
--key <name> |
— | space |
Toggle key for start/stop recording |
--hotkey <name> |
— | right_option |
Global hotkey (system-wide, no terminal focus needed) |
--no-global |
— | — | Disable global hotkey support |
--push-to-talk / --no-push-to-talk |
— | --push-to-talk |
Enable/disable push-to-talk mode |
--approval / --no-approval |
— | --approval |
Review transcript before sending |
Environment variables
| Variable | Description |
|---|---|
OPENCODE_VOICE_PORT |
OpenCode server port (alternative to --port) |
OPENCODE_VOICE_MODEL |
Whisper model size (alternative to --model) |
OPENCODE_SERVER_PASSWORD |
Password if OpenCode server has auth enabled |
How It Works
- Audio capture — cpal captures audio from the microphone at 16 kHz mono
- Transcription — whisper-rs (whisper.cpp bindings) transcribes the captured audio entirely on-device
- Global hotkeys — rdev provides system-wide key event listening (no terminal focus required)
- Injection — the transcribed text is sent via
POST /tui/append-promptto OpenCode's HTTP API, which inserts it into the active prompt textarea - Approval — when
--approvalis enabled, voice commands can respond to OpenCode permission and question prompts (e.g. "allow", "reject", "always")
Platform Support
| Platform | Status | Notes |
|---|---|---|
| macOS | Primary — fully tested | Requires Accessibility permission for global hotkeys |
| Linux | Supported | ALSA or PulseAudio via cpal |
| Windows | Not supported | — |
macOS: Accessibility Permission
Global hotkeys (rdev) require Accessibility access on macOS:
System Settings → Privacy & Security → Accessibility → enable for Terminal / iTerm2
Without this permission, global hotkeys will not work. Use --no-global to disable them and rely on terminal keypresses only.
Troubleshooting
Build fails: "cmake not found"
"Microphone permission denied" (macOS)
macOS requires explicit microphone permission for terminal applications:
System Settings → Privacy & Security → Microphone → enable for Terminal / iTerm2
"Cannot connect to OpenCode"
Ensure OpenCode is running with the --port flag:
Without --port, OpenCode does not expose an HTTP server and opencode-voice cannot connect.
"Whisper model not downloaded"
Run the setup command to download the model:
"No speech detected"
- Speak closer to the microphone
- Try a larger model:
--model small.en - Reduce background noise
- Check available audio devices:
opencode-voice devices
Authentication errors (401)
If OpenCode is running with a server password, set it via environment variable:
License
Apache 2.0 — see LICENSE.