Varna
Varna (Sanskrit: वर्ण — letter, character, sound) — multilingual language engine for AGNOS
Structured, queryable corpus of human language data. Phoneme inventories, writing system metadata, grammar profiles, and lexicon access for 50+ languages.
Used by shabda (G2P conversion), shabdakosh (pronunciation dictionary), svara (vocal synthesis), jnana (knowledge system), and vidya (programming reference).
Modules
| Module | Description |
|---|---|
phoneme |
IPA phoneme inventories per language, articulatory features (manner, place, voicing), stress/tone patterns |
script |
Writing system metadata: alphabet, syllabary, logographic, abjad, abugida. Unicode ranges, directionality |
grammar |
Morphological typology (isolating, agglutinative, fusional, polysynthetic), word order, case systems |
lexicon |
Core vocabulary per language (Swadesh lists, frequency-ranked word lists), cognate detection |
Features
| Feature | Default | Description |
|---|---|---|
std |
yes | Standard library support |
logging |
no | Structured logging via LIPI_LOG env var |
full |
-- | Enables all features |
Quick Start
[]
= "0.1"
use ;
// Get the English phoneme inventory
let en = english;
assert!; // voiceless dental fricative (think)
assert!; // voiced dental fricative (this)
assert!; // no uvular trill in English
assert_eq!;
println!;
// Look up a specific phoneme
let sh = en.find.unwrap; // postalveolar fricative (ship)
Architecture
varna (this) — language structure & phoneme inventories
| provides phoneme sets per language
shabda — G2P conversion (currently English-only, varna makes it multilingual)
| produces phoneme sequences
shabdakosh — pronunciation dictionary (currently CMUdict, varna adds IPA dicts)
| lookup fallback
svara — vocal synthesis (consumes phonemes, produces audio)
| voice output
dhvani — audio engine (mixing, DSP, output)
Also feeds:
- jnana — multilingual knowledge access
- vidya — programming concepts explained in native languages
- vansh (planned) — voice assistant with multilingual TTS/STT
- sahifa (planned) — OCR language detection, multilingual document processing
Development
License
GPL-3.0-only. See LICENSE.