//! Basic grapheme-aware width processing functions.
//!
//! This module provides the core, always-available APIs for:
//!
//! - Unicode grapheme segmentation
//! - Terminal-style display width measurement
//! - Safe truncation and line wrapping
//!
//! These functions use a default [`terminal`](crate::policy::WidthPolicy::terminal) layout strategy,
//! without requiring any additional features.
//!
//! See [`policy_ext`](crate::grapheme::policy_ext) for configurable width behavior.
use crate::get_display_width;
use unicode_segmentation::UnicodeSegmentation;
/// Returns all Unicode grapheme clusters in the input string, following UAX #29.
///
/// A **grapheme cluster** is the smallest unit of text that a user perceives as a single character.
/// This function implements [Unicode® Standard Annex #29](https://unicode.org/reports/tr29/),
/// including support for extended grapheme clusters such as:
///
/// - Emoji ZWJ sequences (e.g., 👩‍❤️‍💋‍👨)
/// - Hangul syllables
/// - Combining accents (e.g., é)
///
/// This API is Unicode-compliant and suitable for user-facing string segmentation.
///
/// # Arguments
///
/// * `s` - The input string to split.
///
/// # Returns
///
/// A `Vec<&str>` where each item is a Unicode grapheme cluster.
///
/// # Example
///
/// ```rust
/// use runefix_core::graphemes;
///
/// let clusters = graphemes("Love👩‍❤️‍💋‍👨爱");
/// assert_eq!(clusters, vec!["L", "o", "v", "e", "👩‍❤️‍💋‍👨", "爱"]);
/// ```
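/*
Why cluster-level segmentation matters can be seen with the standard library
alone. The sketch below is not part of this crate: it only counts `char`s
(Unicode scalar values) to show how far that count drifts from what a reader
perceives as one character. The names `scalar_count`, `E_ACUTE`, and `KISS`
are hypothetical and exist only for this illustration.

```rust
/// Counts Unicode scalar values (`char`s), not grapheme clusters.
/// `scalar_count` is a hypothetical helper used only for this illustration.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

/// "e" followed by U+0301 COMBINING ACUTE ACCENT: one perceived character.
const E_ACUTE: &str = "e\u{301}";

/// Woman, ZWJ, heart, VS16, ZWJ, kiss mark, ZWJ, man: one perceived emoji.
const KISS: &str = "\u{1F469}\u{200D}\u{2764}\u{FE0F}\u{200D}\u{1F48B}\u{200D}\u{1F468}";
```

`scalar_count(E_ACUTE)` is 2 and `scalar_count(KISS)` is 8, whereas a UAX #29
segmenter reports a single cluster for each of them.
*/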
/// Returns the total display width (in columns) of a string, based on grapheme clusters.
///
/// This function segments the input string into Unicode grapheme clusters and sums
/// the display width of each one using [`get_display_width`]. The result reflects
/// how much horizontal space the entire string occupies in a monospace terminal,
/// accounting for wide characters such as CJK ideographs and emoji.
///
/// # Arguments
///
/// * `s` - The input string to measure
///
/// # Returns
///
/// The total display width of the string in terminal columns.
///
/// # Example
///
/// ```rust
/// use runefix_core::display_width;
///
/// let width = display_width("Hi，世界");
/// assert_eq!(width, 8); // 1 + 1 + 2 + 2 + 2
/// ```
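/*
The summation described above can be sketched without this crate. The code
below is an illustration under stated assumptions, not the crate's
implementation: `toy_width` is a hypothetical, heavily simplified width table,
and the clusters are passed in already segmented because the standard library
has no grapheme segmenter.

```rust
/// Toy width table, assumed for illustration only: 2 columns for a few wide
/// ranges, 1 column otherwise. Real width data covers far more cases.
fn toy_width(cluster: &str) -> usize {
    match cluster.chars().next() {
        Some(c) if matches!(c as u32,
            0x4E00..=0x9FFF        // CJK Unified Ideographs
            | 0xFF01..=0xFF60      // fullwidth ASCII variants and punctuation
            | 0x1F300..=0x1FAFF    // common emoji blocks
        ) => 2,
        Some(_) => 1,
        None => 0,
    }
}

/// Sums the width of every pre-segmented cluster.
fn total_width(clusters: &[&str]) -> usize {
    clusters.iter().map(|&c| toy_width(c)).sum()
}
```

With the clusters `["H", "i", "，", "世", "界"]`, `total_width` adds
1 + 1 + 2 + 2 + 2 and yields 8 columns.
*/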
/// Returns the display width (in columns) of each grapheme cluster in the input string.
///
/// This function segments the input string into Unicode grapheme clusters and computes
/// the display width of each one individually. It is useful for scenarios like monospace
/// text layout, visual alignment, or rendering terminals where East Asian characters
/// and emoji take more than one column.
///
/// # Arguments
///
/// * `s` - The input string to analyze
///
/// # Returns
///
/// A vector of display widths (`usize`) for each grapheme cluster in order.
///
/// # Example
///
/// ```rust
/// use runefix_core::display_widths;
///
/// let widths = display_widths("Hi，世界");
/// assert_eq!(widths, vec![1, 1, 2, 2, 2]);
/// ```
/// Returns each grapheme cluster in the input string paired with its display width.
///
/// This function splits the string into Unicode grapheme clusters and pairs
/// each one with its terminal display width (in columns). This is useful for
/// visually aligned rendering, layout calculation, and Unicode debugging,
/// especially with complex emoji or East Asian characters.
///
/// # Arguments
///
/// * `s` - The input string to analyze
///
/// # Returns
///
/// A vector of tuples, where each item is a grapheme cluster and its
/// corresponding display width: `(&str, usize)`
///
/// # Example
///
/// ```rust
/// use runefix_core::grapheme_widths;
///
/// let result = grapheme_widths("Hi，世界");
/// assert_eq!(
///     result,
///     vec![("H", 1), ("i", 1), ("，", 2), ("世", 2), ("界", 2)]
/// );
/// ```
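/*
One way `(cluster, width)` pairs feed a layout calculation is a running sum
that gives the starting column of each cluster. The sketch below assumes the
pairs have already been produced; `column_offsets` is a hypothetical helper
and not part of this crate.

```rust
/// Returns the starting column of each cluster, given `(cluster, width)` pairs.
/// `column_offsets` is a hypothetical helper, not part of this crate.
fn column_offsets(pairs: &[(&str, usize)]) -> Vec<usize> {
    let mut next_col = 0;
    pairs
        .iter()
        .map(|&(_, width)| {
            // Each cluster starts where the previous one ended.
            let start = next_col;
            next_col += width;
            start
        })
        .collect()
}
```

For the pairs `("H", 1), ("i", 1), ("，", 2), ("世", 2), ("界", 2)` the
starting columns are 0, 1, 2, 4, and 6.
*/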
/// Truncates a string by display width while preserving grapheme cluster boundaries.
///
/// This function ensures that wide characters such as emoji or CJK ideographs are
/// never split in the middle. It safely cuts off the string so that its total
/// display width does not exceed the given `max_width`, making it ideal for
/// terminal or TUI rendering.
///
/// # Arguments
///
/// * `s` - The input string to truncate
/// * `max_width` - Maximum allowed display width in terminal columns
///
/// # Returns
///
/// A string slice that fits within the specified display width without cutting graphemes.
///
/// # Example
///
/// ```rust
/// use runefix_core::truncate_by_width;
///
/// let s = "Hi 👋，世界";
/// let short = truncate_by_width(s, 6);
/// assert_eq!(short, "Hi 👋");
/// ```
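/*
Width-bounded truncation can be sketched as a scan that stops before the first
unit that would overflow the budget. The code below is an illustration, not
the crate's implementation: `truncate_prefix` is a hypothetical name, the
width function is supplied by the caller, and it walks `char`s instead of
grapheme clusters, so multi-scalar clusters such as ZWJ sequences are not kept
together here.

```rust
/// Returns the longest prefix of `s` whose summed width fits in `max_width`.
/// Simplification: iterates `char`s, so multi-scalar clusters are not kept together.
fn truncate_prefix(s: &str, max_width: usize, width_of: impl Fn(char) -> usize) -> &str {
    let mut used = 0;
    for (byte_idx, ch) in s.char_indices() {
        let w = width_of(ch);
        if used + w > max_width {
            // Cut before the first unit that would overflow the budget.
            // `byte_idx` is always a char boundary, so the slice is valid.
            return &s[..byte_idx];
        }
        used += w;
    }
    s // everything fits
}
```

Called with a width function that returns 1 for ASCII and 2 otherwise,
`truncate_prefix("Hi 👋，世界", 6, ..)` stops at 5 columns and returns
`"Hi 👋"`, because the fullwidth comma would bring the total to 7.
*/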
/// Splits a string into lines based on display width, preserving grapheme boundaries.
///
/// This function ensures that wide characters such as emoji, CJK ideographs, or
/// fullwidth punctuation are not split mid-grapheme. It breaks the input string
/// into a sequence of lines, each with a total display width that does not exceed
/// the given `max_width`. Lines are broken by width alone, not at word boundaries,
/// which suits hard wrapping in terminals and monospace layouts.
///
/// # Arguments
///
/// * `s` - The input string to wrap
/// * `max_width` - Maximum display width (in columns) for each line
///
/// # Returns
///
/// A vector of strings, each representing a wrapped line within the given width.
///
/// # Example
///
/// ```rust
/// use runefix_core::split_by_width;
///
/// let lines = split_by_width("Hello 👋 世界！", 5);
/// assert_eq!(lines, vec!["Hello", " 👋 ", "世界", "！"]);
/// ```
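/*
The wrapping behaviour described above matches a greedy packing loop. The
sketch below is an assumption about one reasonable approach, not the crate's
implementation: `wrap_pairs` is a hypothetical helper that takes
`(cluster, width)` pairs that are already segmented and measured. How a
cluster wider than `max_width` should be handled is also an assumption; here
it gets a line of its own rather than being dropped.

```rust
/// Greedily packs `(cluster, width)` pairs into lines of at most `max_width` columns.
/// A cluster wider than `max_width` is placed alone on its own line.
fn wrap_pairs(pairs: &[(&str, usize)], max_width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    let mut used = 0;
    for &(cluster, width) in pairs {
        // Start a new line when the next cluster would overflow the current one.
        if !current.is_empty() && used + width > max_width {
            lines.push(std::mem::take(&mut current));
            used = 0;
        }
        current.push_str(cluster);
        used += width;
    }
    // Flush the last, possibly partial, line.
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

Fed the pairs for `"Hello 👋 世界！"` with `max_width` 5, the loop yields
`"Hello"`, `" 👋 "`, `"世界"`, and `"！"`.
*/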