voice-g2p
English grapheme-to-phoneme conversion for Kokoro TTS. A Rust port of misaki's English G2P pipeline.
Install
[dependencies]
voice-g2p = "0.1"
Usage
// Convert text to Kokoro-compatible phonemes
let phonemes = english_to_phonemes("Hello world")?;
// => "həlˈO wˈɜɹld"
// For long text, split into chunks that fit the model's 510-token limit
let chunks = text_to_phoneme_chunks(long_text)?;
for chunk in &chunks {
    // run TTS on each chunk
}
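The crate's own chunking logic is not shown here. As a rough illustration of how phoneme output might be split to stay under a limit, the sketch below greedily packs whitespace-separated words into chunks. The function name `chunk_phonemes` and the assumption that one character counts as one token are hypothetical, not part of the voice-g2p API.

```rust
/// Greedily pack whitespace-separated phoneme words into chunks of at most
/// `max_tokens` characters. Hypothetical helper, assuming one token per character.
/// A single word longer than `max_tokens` is kept whole in its own chunk.
fn chunk_phonemes(phonemes: &str, max_tokens: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut current_len = 0; // length of `current` in characters

    for word in phonemes.split_whitespace() {
        let word_len = word.chars().count();
        // The +1 accounts for the joining space when the chunk is non-empty.
        let needed = if current.is_empty() {
            word_len
        } else {
            current_len + 1 + word_len
        };
        // Close the current chunk if this word would push it over the limit.
        if needed > max_tokens && !current.is_empty() {
            chunks.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Splitting only at word boundaries keeps each word's phonemes together, at the cost of chunks that are sometimes shorter than the limit allows.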
Custom configuration
If uv or espeak-ng aren't on your $PATH:
let config = G2PConfig { /* fields locating uv and espeak-ng */ };
let g2p = G2P::with_config(config);
let phonemes = g2p.convert(text)?;
What's inside
- Dictionary lookup: 90k gold + 93k silver pronunciation entries embedded at compile time
- Morphological decomposition: -s, -ed, -ing suffix rules with voicing logic
- Number handling: cardinals, ordinals, years, currency, phone numbers
- POS tagging: optional spaCy subprocess (via uv run) for context-dependent pronunciation
- Fallback: per-word espeak-ng lookup for unknown words
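To make "suffix rules with voicing logic" concrete, here is a minimal sketch of the standard English rules for -s and -ed, applied to an already-phonemized stem. The helper names `add_s_suffix` and `add_ed_suffix` and the exact phoneme sets are assumptions for illustration; the crate's actual rules may differ.

```rust
/// Append the phonemes for an "-s" suffix to a phonemized stem.
/// Hypothetical helper, not part of the voice-g2p API.
fn add_s_suffix(stem: &str) -> String {
    // The choice depends only on the final phoneme of the stem.
    let suffix = match stem.chars().last() {
        // Sibilants take an extra vowel, as in "buses".
        Some('s' | 'z' | 'ʃ' | 'ʒ' | 'ʧ' | 'ʤ') => "ɪz",
        // Other voiceless consonants take voiceless /s/, as in "cats".
        Some('p' | 't' | 'k' | 'f' | 'θ') => "s",
        // Voiced consonants and vowels take /z/, as in "dogs".
        Some(_) => "z",
        None => "",
    };
    format!("{stem}{suffix}")
}

/// Append the phonemes for an "-ed" suffix to a phonemized stem.
/// Hypothetical helper, not part of the voice-g2p API.
fn add_ed_suffix(stem: &str) -> String {
    let suffix = match stem.chars().last() {
        // After /t/ or /d/ the suffix is a full syllable, as in "wanted".
        Some('t' | 'd') => "ɪd",
        // After other voiceless consonants it devoices to /t/, as in "walked".
        Some('p' | 'k' | 'f' | 'θ' | 's' | 'ʃ' | 'ʧ') => "t",
        // Otherwise it is voiced /d/, as in "played".
        Some(_) => "d",
        None => "",
    };
    format!("{stem}{suffix}")
}
```

With rules like these, a word missing from the dictionary can still be pronounced if its stem is present: look up the stem, then attach the suffix that matches the stem's final sound.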
Optional dependencies
- espeak-ng: fallback pronunciation for words not in the dictionary (brew install espeak-ng)
- uv: runs spaCy for POS-based disambiguation (e.g. "read" as past vs. present tense)
Both are optional. Without them, the pipeline still works using dictionary lookup alone.
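One way to picture the optional fallback is as a lookup chain in which the external tool is consulted only when it is available. The sketch below is illustrative only: the function `lookup_word`, the gold-before-silver ordering, and the closure standing in for espeak-ng are all assumptions, not the crate's API.

```rust
use std::collections::HashMap;

/// Resolve a word's phonemes: gold dictionary first, then silver, then an
/// optional fallback (a stand-in for an external tool such as espeak-ng).
/// Hypothetical helper; the lookup order is an assumption.
fn lookup_word(
    word: &str,
    gold: &HashMap<&str, &str>,
    silver: &HashMap<&str, &str>,
    fallback: Option<&dyn Fn(&str) -> Option<String>>,
) -> Option<String> {
    gold.get(word)
        .or_else(|| silver.get(word))
        .map(|phonemes| phonemes.to_string())
        // Only reached when neither dictionary has the word.
        .or_else(|| fallback.and_then(|f| f(word)))
}
```

Passing `None` as the fallback models a machine without espeak-ng installed: dictionary words still resolve, and unknown words simply return `None`.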
License
MIT