UTFStringExtensions

Trait UTFStringExtensions 

pub trait UTFStringExtensions {
    // Required methods
    fn count_graphemes(&self) -> usize;
    fn get_grapheme(&self, index: usize) -> &str;
    fn get_graphemes(&self) -> Vec<&str>;
    fn get_grapheme_chunk(&self, offset: usize) -> Vec<&str>;

    // Provided methods
    fn take_grapheme<'a>(
        &self,
        graphemes: &Vec<&'a str>,
        index: usize,
    ) -> RUMString { ... }
    fn get_grapheme_window(
        &self,
        min: usize,
        max: usize,
        offset: usize,
    ) -> RUMString { ... }
    fn get_grapheme_string(&self, end_pattern: &str, offset: usize) -> RUMString { ... }
    fn find_grapheme(&self, pattern: &str, offset: usize) -> &str { ... }
    fn truncate(&self, max_size: usize) -> RUMString { ... }
}

An indexing trait implemented for String and str that uses the UnicodeSegmentation facilities to enable grapheme iteration by default. This may carry some performance penalty, but it allows for native Unicode support to the best extent possible.

We also enable decoding from Encoding Standard encodings to UTF-8.

Required Methods§

fn count_graphemes(&self) -> usize

fn get_grapheme(&self, index: usize) -> &str

Returns the grapheme at the given index. A grapheme may span multiple Unicode codepoints, or “characters”.

§Note
If the requested grapheme does not exist, this method returns an empty string.

Rather than just retrieving a codepoint as a character, I decided to go a step further and support grapheme selection, so that characters in written languages such as Sanskrit can be properly selected and evaluated.
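To make the codepoint versus grapheme distinction concrete, here is a small sketch that uses only the Rust standard library. The helper name sizes is hypothetical and is not part of this trait. It reports how many bytes and how many codepoints a string holds, which shows why indexing by char can split what a reader perceives as one character.

```rust
/// Hypothetical helper (not part of UTFStringExtensions): returns
/// (byte length, number of Unicode scalar values) for a string.
fn sizes(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// "e" followed by U+0301 COMBINING ACUTE ACCENT: displayed as a single
/// accented letter, but stored as two codepoints.
fn decomposed_e_acute() -> &'static str {
    "e\u{301}"
}

/// Devanagari NA (U+0928) followed by VOWEL SIGN I (U+093F): one written
/// syllable, stored as two codepoints of three UTF-8 bytes each.
fn devanagari_ni() -> &'static str {
    "\u{928}\u{93F}"
}
```

For both strings, s.chars().count() is 2, while a grapheme-aware count would be expected to treat each as a single unit. That gap is what grapheme selection is meant to close.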

[!CAUTION] This can be an extremely slow operation over large strings, since every call to this method rescans the input string to look up the requested grapheme. Unfortunately, this is a side effect of convenience. To improve performance, call .get_graphemes() once and then call take_grapheme() over the resulting vector.
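The segment-once pattern recommended in the caution can be sketched with the standard library alone. Everything in this block is a hypothetical stand-in, not this crate's implementation: simple_clusters is a deliberately simplified segmenter that only attaches combining marks in U+0300..=U+036F to the preceding character (it is not full Unicode segmentation), cluster_at mimics a lookup that rescans on every call, and take_cluster mimics indexing into a cached vector. Returning an empty string for an out-of-range index is assumed here to mirror the note above.

```rust
/// Simplified stand-in for a grapheme segmenter (assumption: only handles
/// combining marks in U+0300..=U+036F; real segmentation covers far more).
fn simple_clusters(s: &str) -> Vec<&str> {
    let mut clusters = Vec::new();
    let mut start = 0;
    for (i, c) in s.char_indices() {
        let is_combining = ('\u{300}'..='\u{36F}').contains(&c);
        // A non-combining character starts a new cluster.
        if i > start && !is_combining {
            clusters.push(&s[start..i]);
            start = i;
        }
    }
    if start < s.len() {
        clusters.push(&s[start..]);
    }
    clusters
}

/// Slow path: re-segments the whole string on every lookup.
fn cluster_at(s: &str, index: usize) -> &str {
    simple_clusters(s).get(index).copied().unwrap_or("")
}

/// Fast path: indexes into clusters that were computed once by the caller.
fn take_cluster<'a>(clusters: &[&'a str], index: usize) -> &'a str {
    clusters.get(index).copied().unwrap_or("")
}

/// Visits every cluster while segmenting the input only once.
fn collect_all(s: &str) -> Vec<String> {
    let clusters = simple_clusters(s); // single scan of the input
    (0..clusters.len())
        .map(|i| take_cluster(&clusters, i).to_string())
        .collect()
}
```

Looping over cluster_at for every index scans the string once per lookup, so the total work grows quadratically with the input length. collect_all scans once and then does constant-time lookups.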

fn get_graphemes(&self) -> Vec<&str>

fn get_grapheme_chunk(&self, offset: usize) -> Vec<&str>

Provided Methods§

fn take_grapheme<'a>(&self, graphemes: &Vec<&'a str>, index: usize) -> RUMString

fn get_grapheme_window(&self, min: usize, max: usize, offset: usize) -> RUMString

fn get_grapheme_string(&self, end_pattern: &str, offset: usize) -> RUMString

fn find_grapheme(&self, pattern: &str, offset: usize) -> &str

fn truncate(&self, max_size: usize) -> RUMString

Implementations on Foreign Types§

impl UTFStringExtensions for str

fn count_graphemes(&self) -> usize

fn get_grapheme(&self, index: usize) -> &str

fn get_graphemes(&self) -> Vec<&str>

fn get_grapheme_chunk(&self, offset: usize) -> Vec<&str>

Implementors§