pub enum Script {
Arabic,
Hebrew,
Devanagari,
Bengali,
Tamil,
Gurmukhi,
Gujarati,
Telugu,
Kannada,
Malayalam,
Oriya,
Sinhala,
Khmer,
Thai,
Other,
}
Coarse script classification used to decide which feature tag list
the shaper applies to a run. Only the scripts that need contextual
shaping are enumerated; everything else collapses to Other.
Variants
Arabic
Arabic block + supplements + presentation forms (U+0600..U+06FF, U+0750..U+077F, U+08A0..U+08FF, U+FB50..U+FDFF, U+FE70..U+FEFF).
Hebrew
Hebrew block + the Hebrew range of the Alphabetic Presentation Forms block (U+0590..U+05FF, U+FB1D..U+FB4F).
Devanagari
Devanagari block (U+0900..U+097F). Hindi / Marathi / Sanskrit /
Nepali. Round 8 added cluster-based shaping — see
super::indic for the cluster machine and
super::indic::devanagari_feature_tags for the
substitution-feature application order.
Bengali
Bengali block (U+0980..U+09FF). Bengali / Assamese / Manipuri. Round 10 added cluster-based shaping — same broad shape as Devanagari (halant-driven conjuncts, reph rule for RA U+09B0, pre-base matra reorder) but Bengali has THREE pre-base matras (U+09BF / U+09C7 / U+09C8) instead of Devanagari’s one.
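The pre-base matra reorder described above can be sketched on a single cluster as follows. This is a minimal illustration and not the crate's cluster machine: `is_bengali_prebase_matra` and `reorder_prebase` are hypothetical names, the sketch works on codepoints rather than glyphs, and it assumes the input slice is already exactly one cluster.

```rust
/// The three Bengali pre-base matras named in the variant docs:
/// U+09BF (i), U+09C7 (e), U+09C8 (ai).
fn is_bengali_prebase_matra(c: char) -> bool {
    matches!(c, '\u{09BF}' | '\u{09C7}' | '\u{09C8}')
}

/// Hypothetical helper: move the first pre-base matra of a cluster to the
/// front (visual order), keeping every other codepoint in its original
/// relative order. Clusters without a pre-base matra are returned unchanged.
pub fn reorder_prebase(cluster: &[char]) -> Vec<char> {
    match cluster.iter().position(|&c| is_bengali_prebase_matra(c)) {
        Some(i) => {
            let mut out = Vec::with_capacity(cluster.len());
            out.push(cluster[i]);
            out.extend_from_slice(&cluster[..i]);
            out.extend_from_slice(&cluster[i + 1..]);
            out
        }
        None => cluster.to_vec(),
    }
}
```

For a conjunct such as KA + halant + SSA + sign e, the matra ends up before the whole conjunct, not just before the last consonant.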
Tamil
Tamil block (U+0B80..U+0BFF). Tamil. Round 10 added minimal cluster-based shaping: pre-base matra reorder (U+0BC6 / U+0BC7 / U+0BC8) only — no reph (Tamil RA renders in-line), no nukta, no conjunct formation in the modern orthography.
Gurmukhi
Gurmukhi block (U+0A00..U+0A7F). Punjabi. Round 11 added a
halant-driven cluster machine with pre-base matra reorder
(U+0A3F, sign “i”). Reph is rare in modern usage: RA U+0A30 sets
the flag for fonts that ship a rphf lookup, and callers without
one fall back to in-line RA rendering.
Gujarati
Gujarati block (U+0A80..U+0AFF). Gujarati. Round 11 added — closest in shape to Devanagari (halant-driven conjuncts; pre-base matra U+0ABF; reph rule on RA U+0AB0).
Telugu
Telugu block (U+0C00..U+0C7F). Telugu. Round 11 added — reph identification on RA U+0C30 plus pre-base matra reorder for U+0C46 / U+0C47 / U+0C48 (e / ee / ai). The Telugu split vowels (U+0C46 + U+0C56) decompose to a pre-base + post-base pair under NFD; the cluster machine flags the pre-base component for reorder.
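As a rough sketch of the split-vowel handling described above: U+0C48 canonically decomposes to U+0C46 + U+0C56, and the first component is the one flagged for reorder. The helper names below are hypothetical, and the decomposition is hard-coded for this one codepoint instead of calling a normalization library.

```rust
/// Hypothetical helper: expand the Telugu split vowel U+0C48 into its
/// canonical (NFD) decomposition U+0C46 + U+0C56. All other codepoints
/// pass through unchanged.
pub fn decompose_telugu_split_vowel(cluster: &[char]) -> Vec<char> {
    let mut out = Vec::with_capacity(cluster.len() + 1);
    for &c in cluster {
        if c == '\u{0C48}' {
            out.push('\u{0C46}'); // component flagged for reorder
            out.push('\u{0C56}'); // AI length mark, stays after the base
        } else {
            out.push(c);
        }
    }
    out
}

/// Hypothetical helper: one flag per codepoint, true where the codepoint is
/// one of the matras the docs list for reorder (U+0C46 / U+0C47 / U+0C48).
pub fn flag_prebase(cluster: &[char]) -> Vec<bool> {
    cluster
        .iter()
        .map(|&c| matches!(c, '\u{0C46}' | '\u{0C47}' | '\u{0C48}'))
        .collect()
}
```

Running `flag_prebase` on the output of `decompose_telugu_split_vowel` marks only the U+0C46 component, which matches the behaviour the paragraph describes.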
Kannada
Kannada block (U+0C80..U+0CFF). Kannada. Round 11 added — similar shape to Telugu (reph on RA U+0CB0; pre-base matras U+0CC6 / U+0CC7 / U+0CC8) with its own codepoints + halant (U+0CCD).
Malayalam
Malayalam block (U+0D00..U+0D7F). Malayalam. Round 11 added — pre-base matras U+0D46 / U+0D47 / U+0D48 plus the chillu (half-form) characters U+0D7A..U+0D7F treated as consonants (they are NFC-stable independent codepoints in modern Malayalam orthography). No reph in modern Malayalam — chillu replaces the historic reph rendering.
Oriya
Oriya block (U+0B00..U+0B7F). Oriya / Odia. Round 11 added — reph identification on RA U+0B30 plus pre-base matra reorder for U+0B47 / U+0B48 / U+0B4B / U+0B4C (Oriya is unusual in that the precomposed o / au matras are themselves pre-base after canonical decomposition). Halant U+0B4D drives conjuncts.
Sinhala
Sinhala block (U+0D80..U+0DFF). Sinhala. Round 12 (Brahmic non-Indic) — closest to Indic in shape. Halant / al-lakuna U+0DCA drives conjuncts; pre-base matras U+0DD9..U+0DDB (e / ee / ai) plus the precomposed two-part vowels U+0DDC..U+0DDE (o / oo / au) reorder to the front of the cluster. No reph (Sinhala has no superscript reph rendering).
Khmer
Khmer block (U+1780..U+17FF). Khmer / Cambodian. Round 12 added — coeng (U+17D2) plays the role of halant and stacks subjoined consonants underneath the base; subjoined chains are commonly 2-3 deep in Pali borrowings. Pre-base matras U+17BE / U+17BF / U+17C0..U+17C5 reorder to the front of the cluster. No reph.
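To make the "2-3 deep" subjoined chains concrete, the sketch below counts coeng + consonant pairs in one cluster. `subjoined_depth` is a hypothetical helper, and the consonant range U+1780..U+17A2 is an assumption taken from the Unicode Khmer block layout, not from this crate.

```rust
/// Khmer coeng, which the docs describe as playing the role of halant.
const COENG: char = '\u{17D2}';

/// Assumption: Khmer consonants occupy U+1780..=U+17A2.
fn is_khmer_consonant(c: char) -> bool {
    ('\u{1780}'..='\u{17A2}').contains(&c)
}

/// Hypothetical helper: number of subjoined consonants in a cluster,
/// counted as occurrences of COENG immediately followed by a consonant.
pub fn subjoined_depth(cluster: &[char]) -> usize {
    cluster
        .windows(2)
        .filter(|w| w[0] == COENG && is_khmer_consonant(w[1]))
        .count()
}
```

For example, SA + coeng + TA + coeng + RO + sign ii (U+179F U+17D2 U+178F U+17D2 U+179A U+17B8) contains two subjoined consonants under the base.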
Thai
Thai block (U+0E00..U+0E7F). Thai. Added in round 12. No halant and no conjunct formation; pre-base vowels U+0E40..U+0E44 already appear in storage / keyboard order BEFORE their consonant (the one script in this enum where this is the case), so no reorder is needed and the cluster machine simply starts a new cluster at each pre-base vowel. Tone marks U+0E48..U+0E4B + signs U+0E4C..U+0E4E attach to the cluster end.
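The Thai rule above (a pre-base vowel opens a cluster, marks attach to the end) can be sketched as a simple splitter. This is an illustrative simplification, not the crate's cluster machine: `thai_clusters` is a hypothetical name, the consonant range U+0E01..U+0E2E is an assumption, and every non-consonant, non-pre-base codepoint is simply appended to the current cluster.

```rust
/// Pre-base vowels U+0E40..=U+0E44, stored before their consonant.
fn is_thai_prebase_vowel(c: char) -> bool {
    ('\u{0E40}'..='\u{0E44}').contains(&c)
}

/// Assumption: Thai consonants occupy U+0E01..=U+0E2E.
fn is_thai_consonant(c: char) -> bool {
    ('\u{0E01}'..='\u{0E2E}').contains(&c)
}

/// Hypothetical helper: split text into clusters. A new cluster starts at
/// each pre-base vowel, and at each consonant that does not directly follow
/// a pre-base vowel. Tone marks, signs and other vowels attach to the
/// cluster that is currently open.
pub fn thai_clusters(text: &str) -> Vec<String> {
    let mut clusters: Vec<String> = Vec::new();
    let mut prev_was_prebase = false;
    for c in text.chars() {
        let starts_new = clusters.is_empty()
            || is_thai_prebase_vowel(c)
            || (is_thai_consonant(c) && !prev_was_prebase);
        if starts_new {
            clusters.push(String::new());
        }
        // Safe: the vector is non-empty here because an empty vector
        // always forces `starts_new`.
        clusters.last_mut().unwrap().push(c);
        prev_was_prebase = is_thai_prebase_vowel(c);
    }
    clusters
}
```

No codepoint is moved: the pre-base vowel is already first in storage order, so splitting is all that is needed.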
Other
Anything else — Latin, CJK, Cyrillic, Greek, etc.
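The codepoint ranges listed for each variant can be collected into a coarse classifier along the following lines. This is a sketch under assumptions: `classify_char` is a hypothetical function name, the enum is redeclared locally so the block compiles on its own, and how the real shaper resolves a whole run (for example, one mixing scripts or containing combining marks) is not covered here.

```rust
/// Local mirror of the documented enum so the sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Script {
    Arabic, Hebrew, Devanagari, Bengali, Tamil, Gurmukhi, Gujarati, Telugu,
    Kannada, Malayalam, Oriya, Sinhala, Khmer, Thai, Other,
}

/// Hypothetical helper: classify one codepoint using only the block ranges
/// given in the variant docs. Anything outside those ranges is `Other`.
pub fn classify_char(c: char) -> Script {
    match c as u32 {
        0x0600..=0x06FF | 0x0750..=0x077F | 0x08A0..=0x08FF
        | 0xFB50..=0xFDFF | 0xFE70..=0xFEFF => Script::Arabic,
        0x0590..=0x05FF | 0xFB1D..=0xFB4F => Script::Hebrew,
        0x0900..=0x097F => Script::Devanagari,
        0x0980..=0x09FF => Script::Bengali,
        0x0A00..=0x0A7F => Script::Gurmukhi,
        0x0A80..=0x0AFF => Script::Gujarati,
        0x0B00..=0x0B7F => Script::Oriya,
        0x0B80..=0x0BFF => Script::Tamil,
        0x0C00..=0x0C7F => Script::Telugu,
        0x0C80..=0x0CFF => Script::Kannada,
        0x0D00..=0x0D7F => Script::Malayalam,
        0x0D80..=0x0DFF => Script::Sinhala,
        0x0E00..=0x0E7F => Script::Thai,
        0x1780..=0x17FF => Script::Khmer,
        _ => Script::Other,
    }
}
```

Matching on block ranges keeps the lookup branch-only and allocation-free, which fits the "coarse" classification the enum is documented to provide.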