Active Call

active-call is a standalone Rust crate for building AI Voice Agents. It provides high-performance infrastructure bridging AI models with real-world telephony and web communications.

📖 Documentation → English | 中文 | API Reference

Key Capabilities

1. Multi-Protocol Audio Gateway

SIP (Telephony): UDP, TCP, TLS (SIPS), WebSocket. Register as extension to FreeSWITCH / Asterisk / RustPBX, or handle direct SIP calls. PSTN via Twilio and Telnyx.
WebRTC: Browser-to-agent SRTP. (Requires HTTPS or 127.0.0.1)
Voice over WebSocket: Push raw PCM/encoded audio, receive real-time events.

2. Dual-Engine Dialogue

Traditional Pipeline: VAD → ASR → LLM → TTS. Supports OpenAI, Aliyun, Azure, Tencent and more.
Realtime Streaming: Native OpenAI/Azure Realtime API — full-duplex, ultra-low latency.

3. Playbook — Stateful Voice Agents

Define personas, scenes, and flows in Markdown files:

---
asr:
  provider: "sensevoice"
tts:
  provider: "supertonic"
  speaker: "F1"
llm:
  provider: "openai"
  model: "${OPENAI_MODEL}"
  apiKey: "${OPENAI_API_KEY}"
  features: ["intent_clarification", "emotion_resonance"]
dtmf:
  "0": { action: "hangup" }
posthook:
  url: "https://api.example.com/webhook"
  summary: "detailed"
---

# Scene: greeting
<dtmf digit="1" action="goto" scene="tech_support" />

You are a friendly AI for {{ company_name }}. Greet the caller warmly.

# Scene: tech_support
How can I help with your system? I can transfer you: <refer to="sip:human@domain.com" />

💡 ${VAR} = environment variables (config-time). {{var}} = runtime variables (per-call).

4. Offline AI (Privacy-First)

Run ASR and TTS locally — no cloud API required:

Offline ASR: SenseVoice — zh, en, ja, ko, yue
Offline TTS: Supertonic — en, ko, es, pt, fr

# Download models
docker run --rm -v $(pwd)/data/models:/models \
  ghcr.io/miuda-ai/active-call:latest \
  --download-models all --models-dir /models --exit-after-download

# Run with offline models
docker run -d --net host \
  -v $(pwd)/data/models:/app/models \
  -v $(pwd)/config:/app/config \
  ghcr.io/miuda-ai/active-call:latest

Mainland China: Add -e HF_ENDPOINT=https://hf-mirror.com to use the HuggingFace mirror.

5. High-Performance Media Core

VAD Engine	Time (60s audio)	RTF	Note
TinySilero	~60 ms	0.0010	>2.5× faster ONNX
ONNX Silero	~158 ms	0.0026	Standard baseline
WebRTC VAD	~3 ms	0.00005	Legacy

Codec support: PCM16, G.711 (PCMU/PCMA), G.722, Opus.

Quick Start

# Webhook handler
./active-call --handler https://example.com/webhook

# Playbook handler
./active-call --handler config/playbook/greeting.md

# Outbound SIP call
./active-call --call sip:1001@127.0.0.1:5060 --handler greeting.md

# With external IP and codecs
./active-call --handler default.md --external-ip 1.2.3.4 --codecs pcmu,pcma,opus

Docker

docker run -d --net host \
  --name active-call \
  -v $(pwd)/config.toml:/app/config.toml:ro \
  -v $(pwd)/config:/app/config \
  ghcr.io/miuda-ai/active-call:latest

Playbook Handler Routing

[handler]
type = "playbook"
default = "greeting.md"

[[handler.rules]]
caller = "^\\+1\\d{10}$"
callee = "^sip:support@.*"
playbook = "support.md"

[[handler.rules]]
caller = "^\\+86\\d+"
playbook = "chinese.md"

SIP Carrier Integration

TLS + SRTP (Required by Twilio)

tls_port      = 5061
tls_cert_file = "./certs/cert.pem"
tls_key_file  = "./certs/key.pem"
enable_srtp   = true

Environment Variables

# OpenAI / Azure
OPENAI_API_KEY=sk-...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

# Aliyun DashScope
DASHSCOPE_API_KEY=sk-...

# Tencent Cloud
TENCENT_APPID=...
TENCENT_SECRET_ID=...
TENCENT_SECRET_KEY=...

# Offline models
OFFLINE_MODELS_DIR=/path/to/models

Demo

Playbook demo

SDKs

Go: rustpbxgo — Official Go client

Documentation

Language	Links
English	Docs Hub · API Reference · Config Guide · Playbook Tutorial · Advanced Features
中文	文档中心 · API 文档 · 配置指南 · Playbook 教程 · 高级特性

License

MIT — see LICENSE

active-call 0.3.43