Function atoms

pub fn atoms(s: &str) -> Vec<&str>

Splits the input string into layout atoms: visual units used for width-aware layout.

This is a runefix-specific segmentation, based on actual display width, not linguistic boundaries. It differs from [graphemes()] (which follows Unicode UAX #29) by focusing purely on units that affect layout:

  • Characters with width = 0 (e.g., combining marks, control codes) are grouped with their leading base
  • Emoji sequences (e.g. ZWJ, variation selectors) are preserved as atomic units
  • Output is suitable for TUI rendering, Markdown table layout, and CLI alignment
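The first rule above can be illustrated with a small standard-library sketch. This is not runefix's implementation: `is_combining_mark` and `group_zero_width` are hypothetical names, and the sketch assumes that only the Combining Diacritical Marks block (U+0300 to U+036F) counts as zero-width. It ignores control codes and emoji sequences entirely.

```rust
/// Hypothetical helper: true for code points in the
/// Combining Diacritical Marks block (U+0300..=U+036F).
fn is_combining_mark(c: char) -> bool {
    ('\u{0300}'..='\u{036f}').contains(&c)
}

/// Hypothetical sketch: attach each combining mark to the base
/// character that precedes it, returning slices of the input.
fn group_zero_width(s: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut start = 0;
    for (i, c) in s.char_indices() {
        // A non-combining character closes the current unit and starts a new one.
        if i > start && !is_combining_mark(c) {
            out.push(&s[start..i]);
            start = i;
        }
    }
    // Flush the final unit, if any.
    if start < s.len() {
        out.push(&s[start..]);
    }
    out
}

fn main() {
    // "e" followed by U+0301 (combining acute accent), then "a".
    let parts = group_zero_width("e\u{0301}a");
    assert_eq!(parts, vec!["e\u{0301}", "a"]);
}
```

Returning `&str` slices of the input mirrors the `Vec<&str>` return type in the signature above and avoids allocating new strings.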

Example

use runefix_core::atoms;
assert_eq!(atoms("👩‍❤️‍💋‍👨"), vec!["👩", "\u{200d}", "❤", "\u{fe0f}", "\u{200d}", "💋", "\u{200d}", "👨"]);

Note

This function does not perform Unicode-compliant segmentation. For that, see [graphemes()].