shabdakosh
shabdakosh (Sanskrit: dictionary) — Pronunciation dictionary crate for AGNOS.
Maps words to svara Phoneme sequences using a 5,000+ entry English dictionary derived from the CMU Pronouncing Dictionary.
Features
- 5,000+ entry English dictionary generated at compile time from CMUdict (zero runtime parsing)
- ARPABET mapping — bidirectional conversion between ARPABET notation and svara phonemes
- User overlay — application-specific entries that override the base dictionary
- Import/export — CMUdict text format (no_std) and JSON (with the json feature)
- no_std compatible — works with alloc, no standard library required
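As a rough illustration of what "generated at compile time, zero runtime parsing" can mean, the sketch below embeds a sorted table as a `static` and looks words up by binary search. The names `BASE_DICT` and `lookup_static` are hypothetical, and pronunciations are kept as ARPABET strings rather than svara phonemes so the example stays self-contained. It is not the crate's actual generated code.

```rust
/// Hypothetical generated table: (word, ARPABET phones), sorted by word.
/// A build step would emit this from CMUdict; nothing is parsed at runtime.
static BASE_DICT: &[(&str, &[&str])] = &[
    ("hello", &["HH", "AH0", "L", "OW1"]),
    ("the", &["DH", "AH0"]),
    ("world", &["W", "ER1", "L", "D"]),
];

/// Look up a word by binary search over the sorted static table.
pub fn lookup_static(word: &str) -> Option<&'static [&'static str]> {
    BASE_DICT
        .binary_search_by(|(w, _)| w.cmp(&word))
        .ok()
        .map(|i| BASE_DICT[i].1)
}
```

Because the table lives in read-only data and needs no allocation, this style of lookup also fits a no_std build.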
Quick Start
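use shabdakosh::PronunciationDict;

let dict = PronunciationDict::english();
// The method name `lookup` and the example words are assumed here.
assert!(dict.lookup("hello").is_some());
assert!(dict.lookup("world").is_some());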
User Overlay
Override or extend the built-in dictionary with application-specific pronunciations:
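use shabdakosh::PronunciationDict;
use svara::phoneme::Phoneme; // module path assumed

let mut dict = PronunciationDict::english();

// Add a custom word (argument shape assumed: the word and its phonemes)
let phonemes: Vec<Phoneme> = Vec::new(); // fill in the svara phonemes for the word
dict.insert_user("agnos", &phonemes);

// User entries take precedence over base entries (`lookup` is an assumed method name)
assert!(dict.lookup("agnos").is_some());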
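The precedence rule can be pictured as two layers, with the user layer consulted first. The following is a minimal sketch under that assumption. `OverlayDict` and its fields are hypothetical, and pronunciations are plain ARPABET strings instead of svara phonemes so that the example compiles on its own.

```rust
use std::collections::HashMap;

/// Hypothetical two-layer dictionary: user entries shadow base entries.
pub struct OverlayDict {
    base: HashMap<String, Vec<String>>,
    user: HashMap<String, Vec<String>>,
}

impl OverlayDict {
    pub fn new(base: HashMap<String, Vec<String>>) -> Self {
        Self { base, user: HashMap::new() }
    }

    /// Add or replace a user entry; the base layer is left untouched.
    pub fn insert_user(&mut self, word: &str, phones: &[&str]) {
        let phones = phones.iter().map(|p| p.to_string()).collect();
        self.user.insert(word.to_lowercase(), phones);
    }

    /// Check the user layer first, then fall back to the base layer.
    pub fn lookup(&self, word: &str) -> Option<&[String]> {
        let key = word.to_lowercase();
        self.user
            .get(&key)
            .or_else(|| self.base.get(&key))
            .map(|v| v.as_slice())
    }
}
```

Keeping the layers separate means a user entry can later be removed to reveal the base pronunciation again, which a single merged map would not allow.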
Import/Export
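use shabdakosh::dictionary::format;

// Parse CMUdict format
let input = "hello HH AH0 L OW1\nworld W ER1 L D\n";
let dict = format::parse_cmudict(input).unwrap();

// Export back to CMUdict format (argument assumed to be a reference to the dictionary)
let output = format::to_cmudict(&dict);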
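For readers unfamiliar with the CMUdict text format, each line holds a word followed by whitespace-separated ARPABET symbols, where vowels carry a trailing stress digit (0, 1, or 2). The sketch below parses and re-serializes that format. `Entry`, `parse_cmudict_sketch`, and `to_cmudict_sketch` are hypothetical names and do not reflect the crate's own types or error handling.

```rust
/// One parsed entry: lowercase word plus (phone, optional stress digit) pairs.
#[derive(Debug, PartialEq)]
pub struct Entry {
    pub word: String,
    pub phones: Vec<(String, Option<u8>)>,
}

/// Split a trailing stress digit (0, 1, or 2) off an ARPABET symbol.
fn split_stress(sym: &str) -> (String, Option<u8>) {
    match sym.chars().last().and_then(|c| c.to_digit(10)) {
        Some(d) if d <= 2 => (sym[..sym.len() - 1].to_string(), Some(d as u8)),
        _ => (sym.to_string(), None),
    }
}

/// Parse CMUdict-style text, skipping blank lines and `;;;` comment lines.
pub fn parse_cmudict_sketch(input: &str) -> Result<Vec<Entry>, String> {
    let mut entries = Vec::new();
    for (n, line) in input.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with(";;;") {
            continue;
        }
        let mut fields = line.split_whitespace();
        // A non-empty trimmed line always has a first token.
        let word = fields.next().unwrap().to_lowercase();
        let phones: Vec<_> = fields.map(split_stress).collect();
        if phones.is_empty() {
            return Err(format!("line {}: no phonemes", n + 1));
        }
        entries.push(Entry { word, phones });
    }
    Ok(entries)
}

/// Serialize entries back to one `word PHONE PHONE ...` line each.
pub fn to_cmudict_sketch(entries: &[Entry]) -> String {
    let mut out = String::new();
    for e in entries {
        out.push_str(&e.word);
        for (phone, stress) in &e.phones {
            out.push(' ');
            out.push_str(phone);
            if let Some(d) = stress {
                out.push_str(&d.to_string());
            }
        }
        out.push('\n');
    }
    out
}
```

With lowercase input such as the two-line example above, parsing and then exporting reproduces the original text.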
Feature Flags
| Feature | Default | Description |
|---|---|---|
| std | Yes | Standard library support. Disable for no_std + alloc |
| json | No | JSON import/export via serde_json |
Architecture
                  shabdakosh
                 /          \
           arpabet          dictionary
    (ARPABET <-> Phoneme)    /       \
                        mod.rs       format.rs
              (PronunciationDict,    (CMUdict/JSON
               user overlay,          import/export)
               generated 5K dict)
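The arpabet module's bidirectional role can be sketched with a single lookup table used in both directions. The `Phone` enum below is a hypothetical stand-in covering only four symbols. The real svara `Phoneme` type and its variant names are not shown in this document, so none are assumed here.

```rust
/// Hypothetical stand-in for a few phonemes; variant names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phone {
    Hh,
    Ah,
    L,
    Ow,
}

/// Single source of truth for both conversion directions.
const TABLE: &[(&str, Phone)] = &[
    ("HH", Phone::Hh),
    ("AH", Phone::Ah),
    ("L", Phone::L),
    ("OW", Phone::Ow),
];

/// ARPABET -> phoneme. Trailing stress digits are ignored and case is folded.
pub fn from_arpabet(sym: &str) -> Option<Phone> {
    let base = sym.trim_end_matches(|c: char| c.is_ascii_digit());
    TABLE
        .iter()
        .find(|(s, _)| s.eq_ignore_ascii_case(base))
        .map(|&(_, p)| p)
}

/// Phoneme -> ARPABET (without stress marking).
pub fn to_arpabet(phone: Phone) -> &'static str {
    TABLE
        .iter()
        .find(|&&(_, p)| p == phone)
        .map(|&(s, _)| s)
        .expect("every Phone variant appears in TABLE")
}
```

Driving both functions from one table keeps the two directions from drifting apart when symbols are added.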
Consumers
License
GPL-3.0-only