voice-echo
Voice interface for Claude Code over the phone. Call in and talk to Claude, or trigger outbound calls from n8n / automation workflows.
Built in Rust. Uses Twilio for telephony, Groq Whisper for speech-to-text, Inworld for text-to-speech, and the Claude Code CLI for reasoning.
Architecture
Voice Pipeline
┌─────────────────────────────────────┐
│ voice-echo (axum) │
│ │
Phone ◄──► Twilio ◄──►│ WebSocket ◄──► VAD ──► STT (Groq) │
│ │ │
│ Claude CLI │
│ │ │
│ TTS (Inworld) │
│ mulaw 8kHz │
└─────────────────────────────────────┘
AI-Initiated Outbound Calls (n8n Bridge)
┌──────────────┐ trigger ┌──────────────────┐
│ Any n8n │──────────────►│ Orchestrator │
│ workflow │ │ reads registry, │
│ (alerts, │ │ routes to │
│ cron, │ │ target module │
│ events...) │ └────────┬─────────┘
└──────────────┘ │
▼
┌──────────────────┐
│ call-human │
│ builds request, │
│ passes context │
└────────┬─────────┘
│
│ POST /api/call
│ { to, context }
▼
┌──────────────────┐
│ voice-echo │
│ stores context │──► Twilio ──► Phone rings
│ per call_sid │
└────────┬─────────┘
│
│ caller picks up
▼
┌──────────────────┐
│ Claude CLI │
│ first prompt │
│ includes context │
│ "I'm calling │
│ because..." │
└──────────────────┘
Full System
┌──────────┐
┌─────────┐ │ n8n │ ┌───────────────┐
│ Triggers │──────►│ (Docker) │──────►│ voice-echo │──► Claude CLI
│ (cron, │ │ │ API │ (Rust, axum) │
│ webhook,│ │ orchest.│ └───────┬───────┘
│ alerts, │ │ call- │ │
│ events) │ │ human │ ▼
└─────────┘ └──────────┘ ┌───────────────┐
│ Twilio │◄──► Phone
└───────────────┘
Prerequisites
- Rust (1.80+)
- Claude Code CLI installed and authenticated
- Twilio account with a phone number
- Groq API key (free tier works)
- Inworld API key (sign up at platform.inworld.ai)
- A server with a public HTTPS URL (for Twilio webhooks)
- nginx (recommended, for TLS termination and WebSocket proxying)
Installation
1. Clone and build
2. Run the setup wizard
The wizard walks you through the entire setup:
- Checks that
rustc,claude, andopensslare available - Prompts for Twilio, Groq, and Inworld credentials (masked input)
- Asks for your server's external URL
- Generates an API token for the outbound call endpoint
- Writes
~/.voice-echo/config.toml - Optionally copies the binary to
/usr/local/bin/, installs a systemd service, and generates an nginx reverse proxy config
If you skip the optional steps during the wizard, you can always set them up manually using the templates in deploy/.
3. Twilio webhook
In the Twilio Console, set your phone number's voice webhook to:
POST https://your-server.example.com/twilio/voice
4. Start
Or if you installed the systemd service:
Manual configuration
If you prefer to skip the wizard and configure by hand:
Edit .env with your API keys, and config.toml for your Twilio phone number and other settings. Secrets are loaded from .env, so leave them empty in the TOML. See deploy/nginx.conf and deploy/voice-echo.service for server setup templates.
You can override the config directory with ECHO_CONFIG=/path/to/config.toml.
Configuration Reference
config.toml
| Section | Field | Default | Description |
|---|---|---|---|
server |
host |
-- | Bind address (e.g. 0.0.0.0) |
server |
port |
-- | Bind port (e.g. 8443) |
server |
external_url |
-- | Public HTTPS URL (overridden by SERVER_EXTERNAL_URL env var) |
twilio |
account_sid |
-- | Twilio Account SID (overridden by env var) |
twilio |
auth_token |
-- | Twilio Auth Token (overridden by env var) |
twilio |
phone_number |
-- | Your Twilio phone number (E.164) |
groq |
api_key |
-- | Groq API key (overridden by env var) |
groq |
model |
whisper-large-v3-turbo |
Whisper model to use |
inworld |
api_key |
-- | Inworld API key (overridden by env var) |
inworld |
voice_id |
Olivia |
Inworld voice name |
inworld |
model |
inworld-tts-1.5-max |
Inworld TTS model |
claude |
session_timeout_secs |
300 |
Conversation session timeout |
claude |
greeting |
Hello, this is Echo |
Initial TTS greeting when a call connects |
claude |
dangerously_skip_permissions |
false |
Allow Claude CLI to run tools without prompting (see Customizing Claude) |
api |
token |
-- | Bearer token for /api/* (overridden by env var) |
vad |
silence_threshold_ms |
1500 |
Silence duration before utterance ends |
vad |
energy_threshold |
50 |
Minimum RMS energy to detect speech |
hold_music |
file |
-- | Optional path to a WAV file for hold music |
hold_music |
volume |
0.3 |
Playback volume (0.0 to 1.0) |
Environment variables
All secrets can be set via env vars (recommended) instead of config.toml:
| Variable | Overrides |
|---|---|
TWILIO_ACCOUNT_SID |
twilio.account_sid |
TWILIO_AUTH_TOKEN |
twilio.auth_token |
GROQ_API_KEY |
groq.api_key |
INWORLD_API_KEY |
inworld.api_key |
ECHO_API_TOKEN |
api.token |
SERVER_EXTERNAL_URL |
server.external_url |
ECHO_CONFIG |
Config file path |
RUST_LOG |
Log level filter (e.g. voice_echo=debug,tower_http=debug) |
Customizing Claude
voice-echo spawns the claude CLI for each conversation. Claude Code reads a CLAUDE.md file from the working directory to set its behavior — this is how you turn generic Claude into your personalized voice assistant.
Create a CLAUDE.md in the directory where voice-echo runs (typically the project root or the home directory of the service user). This file should contain instructions tailored for a voice context:
- Persona: Define who Claude is on the phone — name, tone, personality.
- Voice-first rules: Tell Claude to never use markdown, bullet points, numbered lists, or any text formatting. Everything it outputs will be spoken aloud via TTS.
- Brevity: Phone calls are not lectures. Two to four sentences per response is usually enough.
- Language: If you want multilingual support, specify which languages and when to switch.
- Capabilities: Define what Claude can and can't do — run commands, access APIs, check services, etc.
- Boundaries: Set security rules, topics to avoid, or information to never disclose.
Without a CLAUDE.md, Claude will behave as its default self — functional but generic.
Permissions
Claude Code normally prompts for permission before running tools (shell commands, file edits, etc.). On a phone call there's no terminal to approve prompts, so you have two options:
dangerously_skip_permissions = truein config.toml — Claude runs all tools without asking. Powerful but risky. Only use this if you trust the instructions in yourCLAUDE.mdand have locked down what Claude can access.- Pre-approve tools via Claude Code's
settings.jsonorallowedToolsconfiguration. This gives you granular control over which tools are auto-approved without blanket permission.
See the Claude Code documentation for details on permission configuration.
Server Setup
TLS certificates
Twilio requires HTTPS for webhooks. If you're using nginx (recommended), get a free certificate with certbot:
Certificates auto-renew via a systemd timer. The nginx template in deploy/nginx.conf is already configured for the Let's Encrypt certificate paths.
systemd
The included service file (deploy/voice-echo.service) runs as root for simplicity. For production, consider creating a dedicated user:
Then update User=voice-echo in the service file and ensure the user has read access to ~/.voice-echo/ and the claude CLI.
Usage
Call in
Just call your Twilio number. You'll hear the configured greeting, then talk normally.
Trigger an outbound call
The recipient picks up and Claude already knows why it called — context is injected into the first prompt.
POST /api/call
Requires Authorization: Bearer <token> header.
| Field | Type | Required | Description |
|---|---|---|---|
to |
string | yes | Phone number in E.164 format (e.g. +34612345678) |
context |
string | no | Injected into Claude's first prompt so it knows why it's calling |
message |
string | no | Twilio <Say> greeting before the stream starts (usually not needed since Claude handles the greeting via TTS) |
n8n Bridge
voice-echo integrates with n8n through a bridge architecture:
- Orchestrator -- central webhook that routes triggers to registered modules
- Modules -- individual workflows managed via a JSON registry
- call-human -- module that triggers outbound calls with context
Trigger a call from any n8n workflow via the orchestrator:
The orchestrator reads the module registry, forwards the payload to the call-human webhook, which calls the voice-echo API with context. When the user picks up, Claude knows exactly what's happening.
Any n8n workflow can trigger calls by routing through the orchestrator. See specs/n8n-bridge-spec.md for the full specification.
Costs
| Service | Free tier | Paid |
|---|---|---|
| Twilio | Trial credit (~$15) | ~$1.15/mo number + per-minute |
| Groq | Free (rate-limited) | Usage-based |
| Inworld TTS | Free tier available | ~$5/1M chars |
| Claude Code | Included with Max plan | Or API usage |
For personal use with a few calls a day, the running cost is minimal beyond the Twilio number.
Troubleshooting
Twilio returns a 502 or "connection refused"
Twilio can't reach your server. Verify nginx is running, your DNS points to the server, and the TLS certificate is valid. Test with curl -I https://your-server.example.com/health.
WebSocket closes immediately
Check that nginx has WebSocket proxying enabled (the Upgrade and Connection headers in deploy/nginx.conf). Also check proxy_read_timeout — Twilio media streams are long-lived.
"Failed to load config" on startup
The config file is missing or malformed. Run voice-echo --setup to generate it, or manually copy config.example.toml to ~/.voice-echo/config.toml.
Claude doesn't respond or times out
Make sure the claude CLI is installed, in PATH, and authenticated. Run claude --version and claude "hello" manually to verify. If running as a systemd service, ensure the service user's PATH includes the Claude binary.
No audio / silence after speaking
The VAD energy threshold may be too high for your microphone or phone quality. Lower vad.energy_threshold (try 30 or 20). Check RUST_LOG=voice_echo=debug for VAD activity logs.
TTS sounds robotic or uses the wrong voice
Verify your inworld.voice_id is valid. Preview voices at the Inworld TTS Playground. You can also create custom voices in Inworld Studio.
Contributing
See CONTRIBUTING.md for branch naming, commit conventions, and workflow.
License
Acknowledgments
Inspired by NetworkChuck's claude-phone. Rewritten from scratch in Rust with a different architecture -- no intermediate Node.js server, direct WebSocket pipeline, energy-based VAD, and an outbound call API for automation.