§umsc
umsc is a Rust library for converting Uyghur text between multiple writing systems.
It is a port of the original Python UgMultiScriptConverter and supports conversions between:
- UAS - Uyghur Arabic Script
- ULS - Uyghur Latin Script
- UYS - Uyghur Yengi Script
- CTS - Common Turkic Script
- UCS - Uyghur Cyrillic Script
- XJUS - Xinjiang University English Case Sensitive
- UZLS - Uzbek Latin Script
- IPA - IPA output from CTS
§Installation
Add the crate to your project:
cargo add umsc
Or add it manually to Cargo.toml:
[dependencies]
umsc = "1.0.0"
§Usage
§Convert with the top-level function
use umsc::{convert, Script};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let arabic = "ئاپ";
    let latin = convert(arabic, Script::Uas, Script::Uls)?;
    assert_eq!(latin, "ap");
    Ok(())
}
§Reuse a converter instance
use umsc::{Script, UgMultiScriptConverter};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let converter = UgMultiScriptConverter::new(Script::Cts, Script::Uas);
    let text = converter.convert("ap")?;
    assert_eq!(text, "ئاپ");
    Ok(())
}
§Parse scripts from strings
use std::str::FromStr;
use umsc::{convert, Script};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let source = Script::from_str("uas")?;
    let target = Script::from_str("ucs")?;
    let text = convert("ئاپ", source, target)?;
    assert_eq!(text, "ап");
    Ok(())
}
§Supported conversions
The library only exposes the conversion pairs implemented by the original Python logic. For example:
- UAS -> CTS, ULS, UCS, UYS, UZLS
- ULS -> CTS, UAS, UCS, UYS
- UYS -> CTS, UAS, ULS, UCS
- UCS -> CTS, UAS, ULS, UYS
- XJUS -> CTS, UAS
- UZLS -> CTS
- CTS -> UAS, ULS, UYS, UCS, UZLS, XJUS, IPA
Unsupported pairs return an error.
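
If you want to check a pair before calling the converter, the pairs listed above can be mirrored in a small lookup. The sketch below is self-contained and does not use the crate: `ScriptCode`, `targets`, and `is_listed` are hypothetical names invented for illustration, and the table only copies the examples given in this section, which may not be exhaustive.

```rust
/// Local stand-in for the script identifiers (hypothetical; the real
/// `umsc::Script` enum may have different variants or names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ScriptCode {
    Uas,
    Uls,
    Uys,
    Cts,
    Ucs,
    Xjus,
    Uzls,
    Ipa,
}

/// Targets listed in this section for each source script.
fn targets(source: ScriptCode) -> &'static [ScriptCode] {
    use ScriptCode::*;
    match source {
        Uas => &[Cts, Uls, Ucs, Uys, Uzls],
        Uls => &[Cts, Uas, Ucs, Uys],
        Uys => &[Cts, Uas, Uls, Ucs],
        Ucs => &[Cts, Uas, Uls, Uys],
        Xjus => &[Cts, Uas],
        Uzls => &[Cts],
        Cts => &[Uas, Uls, Uys, Ucs, Uzls, Xjus, Ipa],
        // No row with IPA as a source appears in the list.
        Ipa => &[],
    }
}

/// True when the pair appears in the list above.
fn is_listed(source: ScriptCode, target: ScriptCode) -> bool {
    targets(source).contains(&target)
}

fn main() {
    assert!(is_listed(ScriptCode::Uas, ScriptCode::Uls));
    assert!(is_listed(ScriptCode::Cts, ScriptCode::Ipa));
    // UZLS is only listed with CTS as a target.
    assert!(!is_listed(ScriptCode::Uzls, ScriptCode::Uas));
    assert!(!is_listed(ScriptCode::Ipa, ScriptCode::Cts));
}
```

The crate itself remains the source of truth: a pair that this sketch rejects may still be accepted by `convert`, and vice versa, if the list here is incomplete.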
§Notes
- Conversion behavior follows the original Python script as closely as practical.
- Some transliteration paths are table-driven and order-sensitive by design.
- This crate is a library only; it does not currently provide a CLI binary.
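
To illustrate what "order-sensitive" means for a table-driven path, here is a minimal sketch that applies replacement rules strictly in table order. It is not the crate's implementation: `apply_rules` is a hypothetical helper, and the four Latin-to-Cyrillic rules are a toy table chosen only to show how a digraph such as `sh` must be handled before its component letters.

```rust
/// Apply each (from, to) rule in table order over the whole string.
/// Hypothetical helper for illustration; not part of umsc.
fn apply_rules(input: &str, rules: &[(&str, &str)]) -> String {
    let mut out = input.to_string();
    for (from, to) in rules {
        out = out.replace(from, to);
    }
    out
}

fn main() {
    // Toy table with the digraph listed before its component letters.
    let digraphs_first = [("sh", "ш"), ("s", "с"), ("h", "һ"), ("a", "а")];
    // The same rules with the single letters listed first.
    let singles_first = [("s", "с"), ("h", "һ"), ("sh", "ш"), ("a", "а")];

    // The digraph is consumed as one unit.
    assert_eq!(apply_rules("sha", &digraphs_first), "ша");
    // "s" and "h" are rewritten separately, so the "sh" rule never matches.
    assert_eq!(apply_rules("sha", &singles_first), "сһа");
}
```

Both tables contain the same rules, yet they produce different output, which is why reordering entries in a transliteration table can change results.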