SuperVox
Voice-powered productivity TUI. Live call assistant + post-call analysis + agent chat.
Modes
- Live -- real-time subtitles, translation, rolling summary, and audio recording
- Analysis -- post-call summary, action items, follow-up draft
- Agent -- chat with call history, search across past calls
- History -- browse past calls, open any call in Analysis mode
Prerequisites
- Rust 2024 edition
- OPENAI_API_KEY environment variable (for realtime STT)
- macOS for system audio capture (system-audio-tap binary)
Quick Start
Usage
# Use local Ollama instead of cloud LLM
Global keybindings
| Key | Action |
|---|---|
| ? | Show help overlay with all keybindings for current mode |
| Ctrl+C | Quit immediately |
Live mode
| Key | Action |
|---|---|
| r | Start recording |
| s | Stop recording |
| h | Open call history (when idle) |
| q | Quit (when idle) |
Speaker labels are color-coded: You (cyan) and Them (yellow).
Audio is automatically saved as WAV (16-bit PCM mono) alongside the call JSON in ~/.supervox/calls/. Use supervox play <call-id> or press p in Analysis mode to listen back.
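
For readers unfamiliar with the format, here is a minimal sketch of what a 16-bit PCM mono WAV file contains. It is not SuperVox's actual writer: the function name `encode_wav_mono16` is hypothetical, and the sample rate is left as a parameter because the README does not state which rate is used.

```rust
/// Encode mono 16-bit PCM samples as a complete WAV (RIFF) byte stream.
/// Hypothetical helper for illustration; `sample_rate` is an assumption
/// supplied by the caller, since the README does not specify it.
pub fn encode_wav_mono16(samples: &[i16], sample_rate: u32) -> Vec<u8> {
    const CHANNELS: u16 = 1;
    const BITS_PER_SAMPLE: u16 = 16;
    let block_align: u16 = CHANNELS * BITS_PER_SAMPLE / 8; // bytes per frame
    let byte_rate: u32 = sample_rate * u32::from(block_align);
    let data_len: u32 = (samples.len() * 2) as u32;

    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    // RIFF container header
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");
    // "fmt " chunk: 16 bytes describing uncompressed PCM
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes());
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    out.extend_from_slice(&CHANNELS.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&BITS_PER_SAMPLE.to_le_bytes());
    // "data" chunk: little-endian samples
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for s in samples {
        out.extend_from_slice(&s.to_le_bytes());
    }
    out
}
```

The header is always 44 bytes in this layout, so a recording's size on disk is roughly 44 bytes plus two bytes per sample.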
Analysis mode
Opens a call JSON file, runs LLM analysis automatically (summary, action items, mood, themes).
| Key | Action |
|---|---|
| f | Generate follow-up email |
| c | Copy analysis to clipboard |
| C | Copy follow-up to clipboard |
| e | Export call + analysis as markdown to clipboard |
| p | Play audio recording (if available) |
| h | Open call history |
| Arrow keys | Scroll |
| q | Quit |
History mode
Browse and manage past calls.
| Key | Action |
|---|---|
| ↑/↓/j/k | Navigate |
| Enter | Open in Analysis |
| d | Delete call (y/n confirmation) |
| Esc | Back to previous mode |
| q | Quit |
Agent mode
Chat with your call history. The agent loads the last 10 calls as context and streams LLM responses in real time.
| Key | Action |
|---|---|
| Type + Enter | Send question |
| Esc | Quit |
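
To illustrate the "last 10 calls" selection, here is a rough sketch that picks the most recent call files from a directory. It assumes calls are stored as `.json` files (as in ~/.supervox/calls/) and that "last" means most recently modified. The ordering rule and the function name `latest_calls` are assumptions, not SuperVox's confirmed behavior.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Return up to `limit` call JSON files from `dir`, newest first.
/// Assumption: recency is judged by file modification time.
pub fn latest_calls(dir: &Path, limit: usize) -> io::Result<Vec<PathBuf>> {
    let mut entries: Vec<(SystemTime, PathBuf)> = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Skip the WAV recordings and anything else that is not call JSON.
        if path.extension().and_then(|e| e.to_str()) == Some("json") {
            let modified = fs::metadata(&path)?.modified()?;
            entries.push((modified, path));
        }
    }
    // Sort descending by modification time so the newest calls come first.
    entries.sort_by(|a, b| b.0.cmp(&a.0));
    Ok(entries.into_iter().take(limit).map(|(_, p)| p).collect())
}
```

Calling `latest_calls(dir, 10)` would yield the files to load as agent context under these assumptions.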
Tech Stack
| Layer | Technology |
|---|---|
| Voice pipeline | voxkit (STT, VAD, TTS, mic, system audio) |
| LLM agent | sgr-agent (tool calling, sessions, compaction) |
| TUI | ratatui + sgr-agent-tui |
| Real-time STT | OpenAI Realtime WebSocket |
| LLM | Gemini Flash / OpenRouter / Ollama |
Config
Config is loaded from ~/.supervox/config.toml at startup. A default is created if missing.
# ~/.supervox/config.toml
= "ru" # target language for translation/summary
= "realtime" # "realtime" (WebSocket) | "openai" (batch)
= "gemini-2.5-flash" # model for translation + summary
= 5 # rolling summary interval
= "mic+system" # "mic" | "mic+system"
= "auto" # "auto" | "ollama"
= "llama3.2:3b" # model when llm_backend = "ollama"
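
The "default is created if missing" behavior could look roughly like the sketch below. The function name `load_or_create_config` and the idea of passing the default text in as a string are illustrative assumptions. The real loader presumably also parses the TOML, which is omitted here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Return the config text at `path`. If the file does not exist yet,
/// write `default_contents` there first (creating parent directories).
/// Hypothetical helper; not SuperVox's actual loader.
pub fn load_or_create_config(path: &Path, default_contents: &str) -> io::Result<String> {
    if !path.exists() {
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(path, default_contents)?;
    }
    fs::read_to_string(path)
}
```

An existing file is never overwritten in this sketch, so user edits survive restarts.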
System audio setup (macOS)
System audio capture uses ScreenCaptureKit via the system-audio-tap helper binary.
If unavailable, SuperVox falls back to mic-only mode automatically.
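
The fallback can be pictured as a small decision function. This is a sketch under assumptions: the names `CaptureMode` and `resolve_capture_mode` are hypothetical, and "unavailable" is approximated by checking whether the helper binary exists at a caller-supplied path. The real check may also involve permissions or the OS version.

```rust
use std::path::Path;

/// Which audio sources end up being captured.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureMode {
    MicOnly,
    MicAndSystem,
}

/// Pick the capture mode from the configured value ("mic" or "mic+system")
/// and the location of the helper binary. Falls back to mic-only whenever
/// system audio was requested but the helper file is not present.
pub fn resolve_capture_mode(requested: &str, helper: &Path) -> CaptureMode {
    if requested == "mic+system" && helper.is_file() {
        CaptureMode::MicAndSystem
    } else {
        CaptureMode::MicOnly
    }
}
```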
License
MIT