umsc 1.0.0

Uyghur multi-script converter for Arabic, Latin, Yengi, Cyrillic, XJUS, and Uzbek Latin scripts
Owner: Alimjoo

umsc

umsc is a Rust library for converting Uyghur text between multiple writing systems.

It is a port of the original Python UgMultiScriptConverter and supports conversions between:

  • UAS - Uyghur Arabic Script
  • ULS - Uyghur Latin Script
  • UYS - Uyghur Yengi Script
  • CTS - Common Turkic Script
  • UCS - Uyghur Cyrillic Script
  • XJUS - Xinjiang University English Case Sensitive
  • UZLS - Uzbek Latin Script
  • IPA - International Phonetic Alphabet (output only, generated from CTS)

Installation

Add the crate to your project:

cargo add umsc

Or add it manually to Cargo.toml:

[dependencies]
umsc = "1.0.0"

Usage

Convert with the top-level function

use umsc::{convert, Script};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let arabic = "ئاپ";
    let latin = convert(arabic, Script::Uas, Script::Uls)?;

    assert_eq!(latin, "ap");
    Ok(())
}

Reuse a converter instance

use umsc::{Script, UgMultiScriptConverter};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build the converter once for a fixed source/target pair, then reuse it.
    let converter = UgMultiScriptConverter::new(Script::Cts, Script::Uas);
    let text = converter.convert("ap")?;

    assert_eq!(text, "ئاپ");
    Ok(())
}

Parse scripts from strings

use std::str::FromStr;
use umsc::{convert, Script};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let source = Script::from_str("uas")?;
    let target = Script::from_str("ucs")?;
    let text = convert("ئاپ", source, target)?;

    assert_eq!(text, "ап");
    Ok(())
}

Supported conversions

The library exposes only the conversion pairs implemented in the original Python converter. For example:

  • UAS -> CTS, ULS, UCS, UYS, UZLS
  • ULS -> CTS, UAS, UCS, UYS
  • UYS -> CTS, UAS, ULS, UCS
  • UCS -> CTS, UAS, ULS, UYS
  • XJUS -> CTS, UAS
  • UZLS -> CTS
  • CTS -> UAS, ULS, UYS, UCS, UZLS, XJUS, IPA

Unsupported pairs return an error.
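
The pair list above can be read as a lookup table. The sketch below is self-contained and does not use the crate: the Script enum, targets, is_supported, and route are hypothetical stand-ins defined locally, and the table copies only the pairs listed above (which are given as examples, so the crate's real set may differ). route also assumes that an unlisted pair could be chained through CTS; the source does not confirm that such chaining gives the same result as a direct conversion would.

```rust
/// Local stand-in for a script identifier (hypothetical; not imported from umsc).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Script { Uas, Uls, Uys, Cts, Ucs, Xjus, Uzls, Ipa }

/// Targets listed for each source script in the table above.
fn targets(source: Script) -> &'static [Script] {
    use Script::*;
    match source {
        Uas => &[Cts, Uls, Ucs, Uys, Uzls],
        Uls => &[Cts, Uas, Ucs, Uys],
        Uys => &[Cts, Uas, Uls, Ucs],
        Ucs => &[Cts, Uas, Uls, Uys],
        Xjus => &[Cts, Uas],
        Uzls => &[Cts],
        Cts => &[Uas, Uls, Uys, Ucs, Uzls, Xjus, Ipa],
        Ipa => &[], // IPA appears only as an output in the list
    }
}

/// True when the pair appears in the list above.
fn is_supported(source: Script, target: Script) -> bool {
    targets(source).contains(&target)
}

/// Returns the hops for a conversion: direct when listed, otherwise via CTS
/// when both legs are listed (an assumption), otherwise None.
fn route(source: Script, target: Script) -> Option<Vec<Script>> {
    if is_supported(source, target) {
        Some(vec![source, target])
    } else if is_supported(source, Script::Cts) && is_supported(Script::Cts, target) {
        Some(vec![source, Script::Cts, target])
    } else {
        None
    }
}

fn main() {
    println!("{:?}", route(Script::Uas, Script::Uls));
    println!("{:?}", route(Script::Xjus, Script::Uls));
    println!("{:?}", route(Script::Ipa, Script::Uas));
}
```

With the real crate, a two-hop route would presumably correspond to two convert calls, the first targeting CTS and the second starting from it.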

Notes

  • Conversion behavior follows the original Python script as closely as practical.
  • Some transliteration paths are table-driven and order-sensitive by design.
  • This crate is a library only; it does not currently provide a CLI binary.
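
To illustrate why a table-driven transliteration path can be order-sensitive, here is a minimal, self-contained sketch. It is not the crate's implementation: the transliterate function and the three-entry toy table are assumptions made for illustration, and the mapping of "ng" to "ŋ" is a placeholder rather than one of the crate's actual rules. At each position the first matching table entry wins, so a digraph must be listed before the single letters it contains.

```rust
/// Scan the input left to right; at each position the first table entry
/// whose source string matches is applied. Unmatched characters pass through.
fn transliterate(input: &str, table: &[(&str, &str)]) -> String {
    let mut out = String::new();
    let mut rest = input;
    'outer: while !rest.is_empty() {
        for (from, to) in table {
            if rest.starts_with(*from) {
                out.push_str(to);
                rest = &rest[from.len()..];
                continue 'outer;
            }
        }
        // No entry matched: copy one character through unchanged.
        let ch = rest.chars().next().unwrap();
        out.push(ch);
        rest = &rest[ch.len_utf8()..];
    }
    out
}

fn main() {
    // Digraph listed first: "ng" is consumed as a unit.
    let digraph_first = [("ng", "ŋ"), ("n", "n"), ("g", "g")];
    // Single letters listed first: "n" matches before "ng" is ever tried.
    let singles_first = [("n", "n"), ("g", "g"), ("ng", "ŋ")];

    println!("{}", transliterate("tang", &digraph_first));
    println!("{}", transliterate("tang", &singles_first));
}
```

The same entries in a different order give a different result, which is why reordering such a table changes conversion output.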