Crate unicode_width


Determine the displayed width of char and str values according to Unicode Standard Annex #11, other portions of the Unicode standard, and common implementations of POSIX wcwidth(). See the Rules for determining width section for the exact rules.

This crate is #![no_std].

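use unicode_width::UnicodeWidthStr;

let teststr = "Hello, world!";
// Width in a non-East-Asian context (Ambiguous characters count as 1 column).
let width = UnicodeWidthStr::width(teststr);
println!("{}", teststr);
println!("The above string is {} columns wide.", width);
// Width in an East Asian context (Ambiguous characters count as 2 columns).
let width = teststr.width_cjk();
println!("The above string is {} columns wide (CJK).", width);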

§Rules for determining width

This crate currently uses the following rules to determine the width of a character or string, in order of decreasing precedence. These may be tweaked in the future.

  1. Emoji presentation sequences have width 2. (The width of a string may therefore differ from the sum of the widths of its characters.)
  2. '\u{00AD}' SOFT HYPHEN has width 1.
  3. '\u{115F}' HANGUL CHOSEONG FILLER has width 2.
  4. The following have width 0:
  5. The control characters have no defined width, and are ignored when determining the width of a string.
  6. Characters with an East_Asian_Width of Fullwidth (F) or Wide (W) have width 2.
  7. Characters with an East_Asian_Width of Ambiguous (A) have width 2 in an East Asian context, and width 1 otherwise.
  8. All other characters have width 1.
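To make the precedence order concrete, here is a minimal, standard-library-only sketch of how rules like these can be applied in sequence. It is not this crate's implementation: the names `sketch_char_width`, `sketch_str_width`, and `EMOJI_BASES` are hypothetical, and the character ranges are a tiny hand-picked subset chosen for illustration (in particular, treating combining diacritical marks and variation selectors as width 0 is an assumption standing in for rule 4).

```rust
// Hypothetical list of bases that form an emoji presentation sequence
// when followed by U+FE0F. The real set is far larger.
const EMOJI_BASES: [char; 2] = ['\u{2764}', '\u{2600}'];

/// Toy per-character width. `None` means "no defined width".
/// Arms are ordered to mirror the precedence of the rules above.
fn sketch_char_width(c: char, cjk: bool) -> Option<usize> {
    match c {
        // Rule 2: SOFT HYPHEN.
        '\u{00AD}' => Some(1),
        // Rule 3: HANGUL CHOSEONG FILLER.
        '\u{115F}' => Some(2),
        // Rule 4 (assumed subset): combining diacritical marks and
        // variation selectors.
        '\u{0300}'..='\u{036F}' | '\u{FE00}'..='\u{FE0F}' => Some(0),
        // Rule 5: control characters have no defined width.
        _ if c.is_control() => None,
        // Rule 6 (subset): CJK Unified Ideographs (Wide) and
        // fullwidth ASCII variants (Fullwidth).
        '\u{4E00}'..='\u{9FFF}' | '\u{FF01}'..='\u{FF60}' => Some(2),
        // Rule 7 (subset): a few Ambiguous characters.
        '\u{00A1}' | '\u{03B1}'..='\u{03C1}' => Some(if cjk { 2 } else { 1 }),
        // Rule 8: everything else.
        _ => Some(1),
    }
}

/// Toy string width: handles rule 1 first, then sums character widths,
/// ignoring characters with no defined width.
fn sketch_str_width(s: &str, cjk: bool) -> usize {
    let mut total = 0;
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        // Rule 1 (toy version): base + U+FE0F counts as one unit of width 2.
        if EMOJI_BASES.contains(&c) && chars.peek() == Some(&'\u{FE0F}') {
            chars.next(); // consume the variation selector
            total += 2;
            continue;
        }
        total += sketch_char_width(c, cjk).unwrap_or(0);
    }
    total
}
```

Under these assumptions, `sketch_str_width("\u{2764}\u{FE0F}", false)` yields 2 even though the two characters individually have widths 1 and 0, which is why a string's width can differ from the sum of its characters' widths.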

§Canonical equivalence

The non-CJK width methods guarantee that canonically equivalent strings are assigned the same width. However, this guarantee does not currently hold for the CJK width variants.
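As a rough illustration of what this guarantee means, the sketch below compares a precomposed character with its canonically equivalent decomposed form. The function `toy_width` is hypothetical and is not the crate's algorithm; it assumes only that combining diacritical marks (U+0300 to U+036F) have width 0 and that every other character has width 1.

```rust
/// Toy width function: combining diacritical marks count as 0 columns,
/// everything else as 1. An illustrative assumption, not the crate's tables.
fn toy_width(s: &str) -> usize {
    s.chars()
        .map(|c| if ('\u{0300}'..='\u{036F}').contains(&c) { 0 } else { 1 })
        .sum()
}

/// Returns the toy widths of "é" in precomposed and decomposed form.
fn equivalent_forms_widths() -> (usize, usize) {
    let precomposed = "\u{00E9}"; // LATIN SMALL LETTER E WITH ACUTE
    let decomposed = "e\u{0301}"; // 'e' followed by COMBINING ACUTE ACCENT
    (toy_width(precomposed), toy_width(decomposed))
}
```

Both forms come out one column wide in this toy model, which is the property the non-CJK methods guarantee for all canonically equivalent strings.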