Module grapheme_clusters

This module provides two interfaces for accessing grapheme clusters in an underlying string. The GraphemeCluster trait extends Peekable iterators over Chars or CharIndices with a next_cluster method, which returns an Option<String> holding the next cluster if one exists. This is the best way to pull individual clusters out of a stream that is otherwise consumed char by char, but it is not recommended if you want to iterate over every cluster.

let mut char_iterator = "A\u{301}✋🏽🇦🇹!".chars().peekable();
assert_eq!(char_iterator.next_cluster(), Some("A\u{301}".to_string()));
assert_eq!(char_iterator.next_cluster(), Some("✋🏽".to_string()));
assert_eq!(char_iterator.next_cluster(), Some("🇦🇹".to_string()));
assert_eq!(char_iterator.next_cluster(), Some("!".to_string()));
assert_eq!(char_iterator.next_cluster(), None);

To iterate over clusters, use the Graphemes struct, which implements Iterator and is constructed from a &str. It yields references to substrings of the original &str, so it is more performant for that case than the extension provided through GraphemeCluster, which allocates a new String for each cluster found.

let graphemes = Graphemes::new("A\u{301}✋🏽🇦🇹!");
assert_eq!(graphemes.collect::<Vec<&str>>(), ["A\u{301}", "✋🏽", "🇦🇹", "!"]);
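
The difference between borrowing and allocating can be illustrated with a minimal, standard-library-only sketch. It is not this module's implementation: SimpleClusters and is_simple_extend are hypothetical names, and the cluster rule is deliberately simplified to "one char plus any following combining diacritical marks (U+0300..=U+036F)", which is far narrower than real grapheme segmentation and would not keep emoji or flag sequences together.

```rust
/// Simplified stand-in for a real cluster rule (an assumption for this
/// sketch): only the Combining Diacritical Marks block extends a cluster.
/// This is NOT the full Unicode segmentation algorithm.
fn is_simple_extend(c: char) -> bool {
    ('\u{300}'..='\u{36F}').contains(&c)
}

/// Hypothetical borrowing iterator: yields `&str` slices of the input
/// instead of allocating a `String` per cluster.
struct SimpleClusters<'a> {
    rest: &'a str,
}

impl<'a> SimpleClusters<'a> {
    fn new(s: &'a str) -> Self {
        SimpleClusters { rest: s }
    }
}

impl<'a> Iterator for SimpleClusters<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let mut indices = self.rest.char_indices();
        // The first char always starts a cluster; `None` means the input is exhausted.
        indices.next()?;
        // The cluster ends at the first following char that is not an extender,
        // or at the end of the input if every remaining char extends it.
        let end = indices
            .find(|&(_, c)| !is_simple_extend(c))
            .map_or(self.rest.len(), |(i, _)| i);
        let (cluster, rest) = self.rest.split_at(end);
        self.rest = rest;
        Some(cluster)
    }
}
```

Because each item is a slice of the original string, collecting into a Vec<&str> copies no character data; the slices remain valid for as long as the input is borrowed.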

Structs

Graphemes
Graphemes provides an iterator over the grapheme clusters of a string.

Traits

GraphemeCluster
Get the next grapheme cluster from a stream of characters or char indices. This trait is implemented for any Peekable iterator over either char or (usize, char), so it works on Peekable<Chars> and Peekable<CharIndices> as well as any other peekable iterator that meets this requirement.
PeekChar
This trait exists primarily to allow a single implementation to be used for both Peekable<Chars> and Peekable<CharIndices>. You could implement it for some other iterator as well, as long as you can implement its two methods.
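
One way such a helper trait can let a single implementation cover both item types is sketched below. This is an illustration under stated assumptions, not the crate's code: AsChar, NextSimpleCluster, next_simple_cluster and extends_cluster are hypothetical names, the helper trait here is placed on the item type rather than on the iterator, and the cluster rule is again reduced to "base char plus combining diacritical marks (U+0300..=U+036F)".

```rust
use std::iter::Peekable;

/// Hypothetical helper trait (not the crate's `PeekChar`): exposes the
/// `char` inside an iterator item, whether it is `char` or `(usize, char)`.
trait AsChar {
    fn as_char(&self) -> char;
}

impl AsChar for char {
    fn as_char(&self) -> char {
        *self
    }
}

impl AsChar for (usize, char) {
    fn as_char(&self) -> char {
        self.1
    }
}

/// Simplified rule assumed for this sketch: only U+0300..=U+036F extend
/// a cluster. Real grapheme segmentation has many more rules.
fn extends_cluster(c: char) -> bool {
    ('\u{300}'..='\u{36F}').contains(&c)
}

/// Hypothetical extension trait adding one method to peekable iterators.
trait NextSimpleCluster {
    fn next_simple_cluster(&mut self) -> Option<String>;
}

/// A single blanket impl covers `Peekable<Chars>`, `Peekable<CharIndices>`,
/// and any other peekable iterator whose items implement `AsChar`.
impl<I> NextSimpleCluster for Peekable<I>
where
    I: Iterator,
    I::Item: AsChar,
{
    fn next_simple_cluster(&mut self) -> Option<String> {
        // Start the cluster with the next char, or stop if the stream is empty.
        let mut cluster = String::from(self.next()?.as_char());
        // Consume following items only while they extend the current cluster;
        // the first non-extending item stays in the iterator.
        while let Some(item) = self.next_if(|item| extends_cluster(item.as_char())) {
            cluster.push(item.as_char());
        }
        Some(cluster)
    }
}
```

With these definitions, "A\u{301}b".chars().peekable() and "A\u{301}b".char_indices().peekable() both gain next_simple_cluster, and a caller can keep using next or peek on the same iterator between calls.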