Crate rust_icu_ubrk
ICU text boundary analysis support for Rust
This crate provides a Rust implementation of the ICU text boundary analysis APIs in ubrk.h. Character (grapheme cluster), word, line-break, and sentence iterators are available.
Examples
Sample code use is given below.
    use rust_icu_sys as sys;
    use rust_icu_ubrk as ubrk;

    let text = "The lazy dog jumped over the fox.";
    let mut iter =
        ubrk::UBreakIterator::try_new(sys::UBreakIteratorType::UBRK_WORD, "en", text)
            .unwrap();

    assert!(iter.is_boundary(0));
    assert_eq!(0, iter.first());
    assert_eq!(None, iter.previous());
    assert_eq!(0, iter.current());

    let text_len = text.len() as i32;
    assert!(iter.is_boundary(text_len));
    assert_eq!(iter.last_boundary(), text_len);
    assert_eq!(None, iter.next());
    assert_eq!(iter.current(), text_len);

    let word_start = text.find("jumped").unwrap() as i32;
    let word_end = word_start + 6;
    assert!(iter.is_boundary(word_start));
    assert!(iter.is_boundary(word_end));
    assert!(!iter.is_boundary(word_start + 3));
    assert_eq!(word_end, iter.following(word_start + 3));
    assert_eq!(word_end, iter.current());
    assert_eq!(Some(word_start), iter.previous());
    assert_eq!(word_start, iter.current());
    assert_eq!(Some(word_end), iter.next());
    assert_eq!(word_end, iter.current());
    assert_eq!(word_start, iter.preceding(word_start + 3));
    assert_eq!(word_start, iter.current());

    // Reset to first boundary and consume `iter`.
    iter.first();
    let boundaries: Vec<i32> = iter.collect();
    assert_eq!(vec![3, 4, 8, 9, 12, 13, 19, 20, 24, 25, 28, 29, 32, 33], boundaries);
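The sample above ends with a flat list of boundary offsets. As a minimal sketch of what one might do with such a list, the standard-library-only code below pairs consecutive offsets into text segments. The helper names `segments` and `words` are hypothetical and not part of this crate. The sketch assumes ASCII input, so that the offsets coincide with byte indices into the `&str`, and it assumes the caller has prepended the initial boundary `0`, which the collected vector in the sample does not include.

```rust
/// Splits `text` into the segments delimited by consecutive boundary offsets.
///
/// Assumption: `text` is ASCII, so each offset is also a valid byte index.
/// Hypothetical helper, not part of `rust_icu_ubrk`.
fn segments<'a>(text: &'a str, boundaries: &[i32]) -> Vec<&'a str> {
    boundaries
        .windows(2) // each window is one (start, end) pair of offsets
        .map(|pair| &text[pair[0] as usize..pair[1] as usize])
        .collect()
}

/// Keeps only the segments that contain at least one alphanumeric character,
/// dropping the whitespace and punctuation segments between words.
fn words<'a>(text: &'a str, boundaries: &[i32]) -> Vec<&'a str> {
    segments(text, boundaries)
        .into_iter()
        .filter(|s| s.chars().any(char::is_alphanumeric))
        .collect()
}
```

Called with the sample sentence and the offsets `[0, 3, 4, 8, ...]`, `segments` returns every piece between two boundaries, including the single spaces and the final period, while `words` keeps only the pieces that contain a letter or digit.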
See the ICU user guide and the C API documentation in the ubrk.h header for details.
Structs
Locales | Iterator over the locales for which text breaking information is available. |
UBreakIterator | Rust wrapper for the ICU |
Constants
UBRK_DONE | Returned by break iterator to indicate that all text boundaries have been returned. |
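The sample earlier on this page shows `next()` and `previous()` yielding `None` at the ends of the text, whereas the underlying C API signals exhaustion with a sentinel value. The sketch below illustrates that general sentinel-to-`Option` pattern using only the standard library. It is not this crate's actual implementation: `DONE`, `to_option`, and `collect_boundaries` are hypothetical names, and the value `-1` is taken from the definition of `UBRK_DONE` in ICU's C header.

```rust
/// Hypothetical local stand-in for `UBRK_DONE`; ICU's C header defines it as -1.
const DONE: i32 = -1;

/// Maps a raw C-style result to an `Option`, treating the sentinel as `None`.
fn to_option(raw: i32) -> Option<i32> {
    if raw == DONE {
        None
    } else {
        Some(raw)
    }
}

/// Calls `step` repeatedly and collects its results until it returns the sentinel.
/// `step` stands in for any C-style "advance and return the next boundary" call.
fn collect_boundaries(mut step: impl FnMut() -> i32) -> Vec<i32> {
    std::iter::from_fn(|| to_option(step())).collect()
}
```

Wrapping the sentinel this way lets callers use ordinary iterator adaptors instead of comparing each result against a magic number.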