Script segmentation and bidi-safe text run partitioning.
This module provides deterministic text-run segmentation by Unicode script, bidi direction, and style — preparing robust shaping inputs and consistent cache keys for the downstream HarfBuzz shaping pipeline.
§Design
Text shaping engines (HarfBuzz, CoreText, DirectWrite) require input to be split into runs that share the same script, direction, and style. Mixing scripts in a single shaping call produces incorrect glyph selection and positioning.
This module implements a three-phase algorithm:
- Raw classification: assign each character its Unicode script via block-range lookup (char_script).
- Common/Inherited resolution: resolve Common and Inherited characters by propagating adjacent specific scripts (UAX #24-inspired).
- Run grouping: collect contiguous characters sharing the same resolved script into ScriptRun spans.
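The three phases can be sketched in a few dozen lines. The following is a minimal, self-contained illustration and not this module's implementation: `ToyScript`, `ToyRun`, `toy_char_script`, `toy_resolve`, and `toy_partition_by_script` are hypothetical names, the range table is deliberately tiny and only approximates real Unicode script data, and the resolution rule (take the preceding specific script, falling back to the following one for leading characters) is an assumption about what "propagating adjacent specific scripts" might look like.

```rust
use std::ops::Range;

/// Hypothetical, heavily reduced script set (assumption: the real `Script` enum is larger).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyScript {
    Latin,
    Arabic,
    Common,
    Inherited,
}

/// Hypothetical stand-in for `ScriptRun`: a byte range plus its resolved script.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ToyRun {
    range: Range<usize>,
    script: ToyScript,
}

/// Phase 1: raw classification via a tiny range table (illustrative, not accurate).
fn toy_char_script(c: char) -> ToyScript {
    match c as u32 {
        0x0041..=0x005A | 0x0061..=0x007A => ToyScript::Latin,
        0x0300..=0x036F => ToyScript::Inherited, // combining diacritical marks
        0x0600..=0x06FF => ToyScript::Arabic,    // whole block, as an approximation
        _ => ToyScript::Common,                  // spaces, digits, punctuation, the rest
    }
}

/// Phase 2: Common/Inherited take the preceding specific script; leading ones
/// take the following specific script. Text with no specific script is left as-is.
fn toy_resolve(raw: &[ToyScript]) -> Vec<ToyScript> {
    let mut out = raw.to_vec();
    let mut last: Option<ToyScript> = None;
    for s in out.iter_mut() {
        match *s {
            ToyScript::Common | ToyScript::Inherited => {
                if let Some(prev) = last {
                    *s = prev;
                }
            }
            specific => last = Some(specific),
        }
    }
    // Backward pass: only leading characters can still be unresolved here.
    let mut next: Option<ToyScript> = None;
    for s in out.iter_mut().rev() {
        match *s {
            ToyScript::Common | ToyScript::Inherited => {
                if let Some(following) = next {
                    *s = following;
                }
            }
            specific => next = Some(specific),
        }
    }
    out
}

/// Phase 3: group contiguous characters with the same resolved script.
fn toy_partition_by_script(text: &str) -> Vec<ToyRun> {
    let raw: Vec<ToyScript> = text.chars().map(toy_char_script).collect();
    let resolved = toy_resolve(&raw);
    let mut runs: Vec<ToyRun> = Vec::new();
    for ((start, c), script) in text.char_indices().zip(resolved) {
        let end = start + c.len_utf8();
        if let Some(run) = runs.last_mut() {
            if run.script == script {
                run.range.end = end; // extend the current run
                continue;
            }
        }
        runs.push(ToyRun { range: start..end, script });
    }
    runs
}
```

Under these assumptions, a space between a Latin word and an Arabic word is absorbed into the Latin run, because the forward pass reaches it first. Ranges are byte offsets, so they can be used to slice the original `&str` directly.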
The TextRun type further subdivides by direction and style, producing
the atomic units suitable for shaping. RunCacheKey provides a
deterministic, hashable identifier for caching shaped glyph output.
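To illustrate why a hashable key is useful here, the sketch below caches shaped output keyed by everything that could change the result. It is an assumption-laden illustration: the fields of the real RunCacheKey are not documented in this section, so `ToyCacheKey`, `ToyDirection`, `ToyShapeCache`, and the `shape` callback are hypothetical, and glyph output is modeled as a plain `Vec<u32>`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `RunDirection`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ToyDirection {
    Ltr,
    Rtl,
}

/// Hypothetical cache key: the fields are guesses at what affects shaping.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ToyCacheKey {
    text: String,
    script_tag: [u8; 4], // e.g. *b"Arab"; an assumed representation
    direction: ToyDirection,
    style_id: u64, // assumed opaque identifier for font and style
}

/// Minimal cache from key to (fake) glyph ids, counting how often shaping ran.
struct ToyShapeCache {
    entries: HashMap<ToyCacheKey, Vec<u32>>,
    misses: usize,
}

impl ToyShapeCache {
    fn new() -> Self {
        ToyShapeCache { entries: HashMap::new(), misses: 0 }
    }

    /// Return cached glyphs for `key`, calling `shape` only on a miss.
    fn get_or_shape(
        &mut self,
        key: ToyCacheKey,
        shape: impl FnOnce(&ToyCacheKey) -> Vec<u32>,
    ) -> &[u32] {
        if !self.entries.contains_key(&key) {
            self.misses += 1;
            let glyphs = shape(&key);
            self.entries.insert(key.clone(), glyphs);
        }
        &self.entries[&key]
    }
}
```

Two keys that differ only in direction or style compare unequal, so a run shaped right-to-left is never served from a left-to-right entry. Equal keys always hit the same entry, which is the property a deterministic key needs to provide.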
§Example
use ftui_text::script_segmentation::{Script, ScriptRun, partition_by_script};
let runs = partition_by_script("Hello مرحبا World");
assert!(runs.len() >= 2); // At least Latin and Arabic runs
assert_eq!(runs[0].script, Script::Latin);
Structs§
- RunCacheKey
- Deterministic, hashable cache key for shaped glyph output.
- ScriptRun
- A contiguous run of characters sharing the same resolved script.
- TextRun
- A fully partitioned text run suitable for shaping.
Enums§
- RunDirection
- A text direction for run partitioning.
- Script
- Unicode script classification for shaping.
Functions§
- char_script
- Classify a character’s Unicode script via block-range lookup.
- partition_by_script
- Partition text into contiguous runs of the same Unicode script.
- partition_text_runs
- Partition text into fully-resolved text runs by script and direction.