voicepeak-cli
A command-line interface wrapper for VOICEPEAK text-to-speech software with preset management and automatic audio playback.
Features
- Simple command-line interface: vp "読み上げるテキスト" (the quoted Japanese means "text to read aloud")
- Voice presets: Configure and reuse voice settings with emotions and pitch
- Automatic text splitting: Handles texts longer than 140 characters by splitting at natural break points
- Auto-play: Automatically plays generated audio with mpv (when no output file is specified)
- File input support: Read text from files with the -t option
- Comprehensive voice control: Narrator, emotions, speed, and pitch settings
Requirements
- macOS
- VOICEPEAK installed at /Applications/voicepeak.app/
- mpv for audio playback (install via Homebrew: brew install mpv)
- ffmpeg for batch mode and multi-chunk file output (install via Homebrew: brew install ffmpeg)
Installation
From crates.io (Recommended)
From source
- Clone this repository
- Build and install:
Usage
Basic Usage
# Simple text-to-speech (requires preset or --narrator)
# With explicit narrator
# Save to file instead of auto-play
# Read from file
# Pipe input
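For scripting, the documented options can be assembled into an argument list before calling vp. The following is a minimal sketch in Python, not part of voicepeak-cli itself: the helper name build_vp_args and the file names in the usage note are hypothetical, and only options listed under Command-Line Options are used. Piped input is not covered here.

```python
from typing import List, Optional


def build_vp_args(
    text: Optional[str] = None,
    text_file: Optional[str] = None,
    narrator: Optional[str] = None,
    preset: Optional[str] = None,
    out: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for vp (hypothetical helper, documented flags only)."""
    # Exactly one text source is accepted by this sketch.
    if (text is None) == (text_file is None):
        raise ValueError("pass exactly one of text or text_file")

    args = ["vp"]
    if text is not None:
        args.append(text)                 # positional [TEXT]
    else:
        args += ["--text", text_file]     # -t, --text <FILE>
    if narrator is not None:
        args += ["--narrator", narrator]  # -n, --narrator <NAME>
    if preset is not None:
        args += ["--preset", preset]      # -p, --preset <NAME>
    if out is not None:
        args += ["--out", out]            # -o, --out <FILE>
    return args
```

The resulting list can be passed to subprocess.run(args, check=True) on a machine where vp is installed. For example, build_vp_args(text_file="in.txt", preset="karin-happy", out="out.wav") describes a run that reads in.txt and saves the audio instead of auto-playing it.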
Using Presets
# List available presets
# Use a preset
# Override preset settings
Voice Controls
# Control speech parameters
# List available narrators
# List emotions for a specific narrator
Text Length Handling
# Allow automatic text splitting (default)
# Strict mode: reject texts longer than 140 characters
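The splitting behaviour can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the tool's actual algorithm: the set of break characters, the use of Unicode code points as the character count, and the names split_text and BREAK_CHARS are all assumptions made for this example.

```python
MAX_CHARS = 140
# Assumed break characters; the README does not list the real ones.
BREAK_CHARS = "。！？!?.\n、,"


def split_text(text, limit=MAX_CHARS, strict=False):
    """Split text into chunks of at most `limit` characters.

    Each chunk ends at the last break character inside the limit when one
    exists, otherwise it is cut at the limit. With strict=True, over-long
    input is rejected instead of split.
    """
    if len(text) <= limit:
        return [text] if text else []
    if strict:
        raise ValueError(f"text is longer than {limit} characters")

    chunks = []
    rest = text
    while len(rest) > limit:
        window = rest[:limit]
        # rfind returns -1 when a character is absent, so cut becomes 0.
        cut = max(window.rfind(ch) for ch in BREAK_CHARS) + 1
        if cut <= 0:
            cut = limit  # no break point found: hard cut at the limit
        chunks.append(rest[:cut])
        rest = rest[cut:]
    if rest:
        chunks.append(rest)
    return chunks
```

Joining the chunks reproduces the input exactly, since nothing is trimmed at the cut points.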
Playback Modes
# Batch mode: generate all chunks first, merge, then play (default)
# Sequential mode: generate and play chunks one by one
# Long text file output (uses ffmpeg to merge chunks)
# For sequential playback without ffmpeg
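This README does not document how vp invokes ffmpeg to merge chunks. As a rough illustration only, one common way to join audio files that share the same format is ffmpeg's concat demuxer. The Python sketch below builds the list file contents and the argument list without running anything; the function names and file names are hypothetical.

```python
def concat_list(paths):
    """Return the contents of an ffmpeg concat-demuxer list file."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal quote is written as '\''.
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def ffmpeg_concat_args(list_path, out_path):
    """Build an ffmpeg command that joins the listed files without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # use the concat demuxer
        "-safe", "0",     # accept arbitrary paths in the list file
        "-i", list_path,
        "-c", "copy",     # copy streams; assumes all chunks share one format
        out_path,
    ]
```

Stream copy only works when every chunk has the same codec and parameters, which is a reasonable expectation for chunks produced by one synthesis run but remains an assumption here.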
Configuration
Configuration is stored in ~/.config/vp/config.toml. The file is automatically created on first run.
Example Configuration
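default_preset = "karin-custom"

[[presets]]
name = "karin-custom"
narrator = "夏色花梨"
emotions = [
    { name = "hightension", value = 10 },
    { name = "sasayaki", value = 20 },
]
pitch = 30

[[presets]]
name = "karin-normal"
narrator = "夏色花梨"
emotions = []

[[presets]]
name = "karin-happy"
narrator = "夏色花梨"
emotions = [{ name = "hightension", value = 50 }]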
Configuration Fields
- default_preset: Optional. Preset to use when no -p option is specified
- presets: Array of voice presets
Preset Fields
- name: Unique preset identifier
- narrator: Voice narrator name
- emotions: Array of emotion parameters with name and value
- pitch: Optional pitch adjustment (-300 to 300)
Command-Line Options
Usage: vp [OPTIONS] [TEXT]

Arguments:
  [TEXT]  Text to say

Options:
  -t, --text <FILE>              Text file to say
  -o, --out <FILE>               Path of output file (optional - will play with mpv if not specified)
  -n, --narrator <NAME>          Name of voice
  -e, --emotion <EXPR>           Emotion expression (e.g., happy=50,sad=50)
  -p, --preset <NAME>            Use voice preset
      --list-narrator            Print voice list
      --list-emotion <NARRATOR>  Print emotion list for given voice
      --list-presets             Print available presets
      --speed <VALUE>            Speed (50 - 200)
      --pitch <VALUE>            Pitch (-300 - 300)
      --strict-length            Reject input longer than 140 characters (default: false, allows splitting)
  -h, --help                     Print help
  -V, --version                  Print version
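The emotion expression format and the numeric ranges above are easy to check before calling vp. The following Python sketch shows one way to do that. The function names are hypothetical, and treating emotion values as integers is an assumption, since the help text does not state their type or range.

```python
def parse_emotions(expr):
    """Parse an expression such as 'happy=50,sad=50' into a dict."""
    emotions = {}
    for item in expr.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        name, sep, value = item.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected name=value, got {item!r}")
        emotions[name.strip()] = int(value)  # assumes integer values
    return emotions


def check_range(label, value, low, high):
    """Return value if low <= value <= high, otherwise raise ValueError."""
    if not low <= value <= high:
        raise ValueError(f"{label} must be between {low} and {high}, got {value}")
    return value
```

With the documented limits, check_range("speed", 120, 50, 200) passes, while check_range("pitch", 301, -300, 300) raises ValueError.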
Parameter Priority
When multiple sources specify the same parameter, the priority order is:
- Command-line options (highest priority)
- Preset values
- Default values / none (lowest priority)
For example:
- vp "text" -p my-preset --pitch 100 uses pitch=100 (CLI override)
- vp "text" -p my-preset uses the preset's pitch value
- vp "text" --narrator "voice" uses no pitch adjustment
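The priority order amounts to taking the first value that is actually set. A minimal Python sketch of that rule follows. The resolve function is hypothetical and only illustrates the ordering described above, with None standing for "not specified".

```python
def resolve(cli_value, preset_value, default=None):
    """Pick a parameter value: CLI option first, then preset, then default."""
    if cli_value is not None:
        return cli_value
    if preset_value is not None:
        return preset_value
    return default


# Mirrors the three cases above for a preset whose pitch is 30.
pitch_cli_override = resolve(100, 30)   # --pitch 100 with a preset
pitch_from_preset = resolve(None, 30)   # preset only
pitch_unset = resolve(None, None)       # narrator only, no pitch adjustment
```

Comparing against None rather than testing truthiness matters here, because an explicit value of 0 on the command line should still override the preset.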
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines on how to contribute to this project.