pub enum WordSeparator {
    AsciiSpace,
    UnicodeBreakProperties,
    Custom(fn(line: &str) -> Box<dyn Iterator<Item = Word<'_>>>),
}

Describes where words occur in a line of text.

The simplest approach is to say that words are separated by one or more ASCII spaces (' '). This works for Western languages without emojis. A more complex approach is to use the Unicode line breaking algorithm, which finds break points in non-ASCII text.

Line breaks occur between words. See WordSplitter for options on how to handle hyphenation of individual words.

Examples

use textwrap::core::Word;
use textwrap::WordSeparator::AsciiSpace;

let words = AsciiSpace.find_words("Hello World!").collect::<Vec<_>>();
assert_eq!(words, vec![Word::from("Hello "), Word::from("World!")]);

Variants

AsciiSpace

Find words by splitting on runs of ' ' characters.

Examples

use textwrap::core::Word;
use textwrap::WordSeparator::AsciiSpace;

let words = AsciiSpace.find_words("Hello   World!").collect::<Vec<_>>();
assert_eq!(words, vec![Word::from("Hello   "),
                       Word::from("World!")]);
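
The rule "split on runs of ' ' characters, keeping each run attached to the preceding word" can be sketched with the standard library alone. This is an illustration only, not necessarily how textwrap implements AsciiSpace: the function name split_on_ascii_spaces is hypothetical, and it returns plain string slices rather than Word values.

```rust
/// Hypothetical helper (not part of textwrap): split `line` into words,
/// keeping each run of trailing ' ' characters attached to the word
/// that precedes it.
fn split_on_ascii_spaces(line: &str) -> Vec<&str> {
    let mut words = Vec::new();
    let mut start = 0;
    let mut in_whitespace = false;
    for (idx, ch) in line.char_indices() {
        if in_whitespace && ch != ' ' {
            // A run of spaces just ended: emit the previous word together
            // with its trailing spaces and start a new word here.
            words.push(&line[start..idx]);
            start = idx;
        }
        in_whitespace = ch == ' ';
    }
    // Emit whatever is left after the last break point.
    if start < line.len() {
        words.push(&line[start..]);
    }
    words
}

fn main() {
    let words = split_on_ascii_spaces("Hello   World!");
    assert_eq!(words, vec!["Hello   ", "World!"]);
    println!("{:?}", words);
}
```

Only U+0020 counts as a separator in this sketch, so tabs and non-breaking spaces stay inside a word.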

UnicodeBreakProperties

Split line into words using Unicode break properties.

This word separator uses the Unicode line breaking algorithm described in Unicode Standard Annex #14 to find legal places to break lines. There is a small difference in that the U+002D (Hyphen-Minus) and U+00AD (Soft Hyphen) don’t create a line break: to allow a line break at a hyphen, use WordSplitter::HyphenSplitter. Soft hyphens are not currently supported.

Examples

Unlike WordSeparator::AsciiSpace, the Unicode line breaking algorithm will find line break opportunities between some characters with no intervening whitespace:

#[cfg(feature = "unicode-linebreak")] {
use textwrap::core::Word;
use textwrap::WordSeparator::UnicodeBreakProperties;

assert_eq!(UnicodeBreakProperties.find_words("Emojis: 😂😍").collect::<Vec<_>>(),
           vec![Word::from("Emojis: "),
                Word::from("😂"),
                Word::from("😍")]);

assert_eq!(UnicodeBreakProperties.find_words("CJK: 你好").collect::<Vec<_>>(),
           vec![Word::from("CJK: "),
                Word::from("你"),
                Word::from("好")]);
}

A U+2060 (Word Joiner) character can be inserted if you want to manually override the defaults and keep the characters together:

#[cfg(feature = "unicode-linebreak")] {
use textwrap::core::Word;
use textwrap::WordSeparator::UnicodeBreakProperties;

assert_eq!(UnicodeBreakProperties.find_words("Emojis: 😂\u{2060}😍").collect::<Vec<_>>(),
           vec![Word::from("Emojis: "),
                Word::from("😂\u{2060}😍")]);
}

The Unicode line breaking algorithm will also automatically suppress line breaks around certain punctuation characters:

#[cfg(feature = "unicode-linebreak")] {
use textwrap::core::Word;
use textwrap::WordSeparator::UnicodeBreakProperties;

assert_eq!(UnicodeBreakProperties.find_words("[ foo ] bar !").collect::<Vec<_>>(),
           vec![Word::from("[ foo ] "),
                Word::from("bar !")]);
}

Custom(fn(line: &str) -> Box<dyn Iterator<Item = Word<'_>>>)

Find words using a custom word separator.
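
A function suitable for Custom takes a line and returns a boxed iterator over the words it finds. The sketch below shows the general shape using only the standard library. Everything in it is an assumption made for illustration: Word is a simplified stand-in for textwrap::core::Word, the Separator alias and split_on_pipes are hypothetical names, and the explicit lifetime bound on the boxed iterator is added here so that the iterator may borrow from the line.

```rust
/// Simplified stand-in for `textwrap::core::Word` (assumption: the real
/// type carries more information than a single text slice).
#[derive(Debug, PartialEq)]
struct Word<'a> {
    text: &'a str,
}

/// Function-pointer type with the same general shape as the payload of
/// `Custom`. The `+ 'a` bound is an assumption of this sketch.
type Separator = for<'a> fn(&'a str) -> Box<dyn Iterator<Item = Word<'a>> + 'a>;

/// Hypothetical separator: every '|' ends a word, and the '|' stays
/// attached to the word it terminates.
fn split_on_pipes(line: &str) -> Box<dyn Iterator<Item = Word<'_>> + '_> {
    Box::new(line.split_inclusive('|').map(|text| Word { text }))
}

fn main() {
    // A plain function coerces to the function-pointer type.
    let separator: Separator = split_on_pipes;
    let words: Vec<Word> = separator("a|bc|d").collect();
    assert_eq!(
        words,
        vec![Word { text: "a|" }, Word { text: "bc|" }, Word { text: "d" }]
    );
    println!("{:?}", words);
}
```

Because the variant holds a plain function pointer rather than a closure, the separator cannot capture state from its environment.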

Implementations

Find all words in line.

Trait Implementations

Returns a copy of the value.

Performs copy-assignment from source.

Formats the value using the given formatter.
