Crate hyphenation
Text hyphenation in a variety of languages.
Usage
A typical import comprises the Hyphenation trait, the Standard hyphenator, and the Language enum. This exposes the crate's core functionality and the set of available languages.
extern crate hyphenation;
use hyphenation::{Hyphenation, Standard, Language};
To begin, we must load the Corpus for our working language.
let english_us = hyphenation::load(Language::English_US).unwrap();
Our English Corpus can now be used by hyphenators: iterators which segment text in accordance with hyphenation practices, as described by the corpus.
The simplest (and, presently, only) hyphenator is Standard:
let h: Standard = "hyphenation".hyphenate(&english_us);
The Standard hyphenator does not allocate new strings, returning slices instead.
let v: Vec<&str> = h.collect();
assert_eq!(v, vec!["hy", "phen", "ation"]);
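To illustrate how a hyphenator can hand back slices without allocating new strings, here is a minimal sketch of a borrowing iterator. It is not the crate's implementation: the `Segments` type and its fields are hypothetical, and it assumes the break indices are sorted and fall on character boundaries.

```rust
/// Hypothetical iterator (not part of the crate): yields sub-slices of
/// `text` delimited by the byte indices in `breaks`.
/// Assumes `breaks` is sorted ascending and every index is a char boundary.
struct Segments<'a> {
    text: &'a str,
    breaks: Vec<usize>,
    next_break: usize, // position of the next unused entry in `breaks`
    start: usize,      // byte offset where the next segment begins
}

impl<'a> Segments<'a> {
    fn new(text: &'a str, breaks: Vec<usize>) -> Self {
        Segments { text, breaks, next_break: 0, start: 0 }
    }
}

impl<'a> Iterator for Segments<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let text = self.text; // copy the shared reference so slices live for 'a
        if let Some(&end) = self.breaks.get(self.next_break) {
            // Segment up to the next break; borrows from the input string.
            let piece = &text[self.start..end];
            self.next_break += 1;
            self.start = end;
            Some(piece)
        } else if self.start < text.len() {
            // Final segment after the last break.
            let piece = &text[self.start..];
            self.start = text.len();
            Some(piece)
        } else {
            None
        }
    }
}
```

Collecting `Segments::new("hyphenation", vec![2, 6])` into a `Vec<&str>` gives the same three pieces as the assertion above. Every piece points into the original string, so no string data is copied.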
While hyphenation is performed on a per-word basis, convenience calls for Hyphenation to work with full text by default.
let h2: Standard = "Word hyphenation by computer.".hyphenate(&english_us);
let v2: Vec<&str> = h2.collect();
assert_eq!(v2, vec!["Word hy", "phen", "ation by com", "puter."]);
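One way to picture per-word hyphenation over full text is to find each word, ask for its break indices, and shift them by the word's offset in the text. The sketch below makes two assumptions that may not match the crate: a word is taken to be a maximal run of alphabetic characters, and `text_breaks` and the `word_breaks` callback are hypothetical names.

```rust
/// Hypothetical helper (not part of the crate): lifts a per-word break
/// finder to whole text. `word_breaks` returns byte indices relative to
/// the word it is given; the result holds indices relative to `text`.
fn text_breaks<F>(text: &str, word_breaks: F) -> Vec<usize>
where
    F: Fn(&str) -> Vec<usize>,
{
    let mut out = Vec::new();
    let mut word_start: Option<usize> = None;

    for (i, c) in text.char_indices() {
        if c.is_alphabetic() {
            // Remember where the current word began.
            if word_start.is_none() {
                word_start = Some(i);
            }
        } else if let Some(s) = word_start.take() {
            // A word just ended at byte `i`: shift its breaks by its offset.
            out.extend(word_breaks(&text[s..i]).into_iter().map(|b| s + b));
        }
    }
    // Handle a word that runs to the end of the text.
    if let Some(s) = word_start {
        out.extend(word_breaks(&text[s..]).into_iter().map(|b| s + b));
    }
    out
}
```

With a toy callback that returns `[2, 6]` for "hyphenation" and `[3]` for "computer", the text "Word hyphenation by computer." yields the offsets 7, 11 and 23, which are the boundaries between the four segments in the assertion above.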
Moreover, hyphenators expose some simple methods to render hyphenated text: punctuate() and punctuate_with(string), which mark hyphenation opportunities respectively with soft hyphens (Unicode U+00AD SOFT HYPHEN) and any given string.
let h3 = "anfractuous".hyphenate(&english_us);
let s3: String = h3.clone().punctuate().collect();
assert_eq!(s3, "an\u{ad}frac\u{ad}tu\u{ad}ous".to_owned());
let s4: String = h3.punctuate_with("-").collect();
assert_eq!(s4, "an-frac-tu-ous".to_owned());
If we would rather manipulate our text in other ways, we may employ opportunities(), which returns the byte indices of hyphenation opportunities within the string. (Internally, opportunities() is the fundamental method required by Hyphenation; other functionality is implemented on top of it.)
let indices = "hyphenation".opportunities(&english_us); assert_eq!(indices, vec![2, 6]);
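The layering described above, with one required method and everything else derived from it, can be sketched with a trait that has provided methods. This is only an illustration of the design: `BreakFinder`, `Lookup` and `toy_finder` are hypothetical names, the signatures differ from the crate's Hyphenation trait, and the lookup table stands in for real pattern data.

```rust
use std::collections::HashMap;

/// Hypothetical trait (not the crate's Hyphenation trait): implementors
/// supply `opportunities`; the other methods are built on top of it.
trait BreakFinder {
    /// Required: byte indices at which `word` may be broken.
    fn opportunities(&self, word: &str) -> Vec<usize>;

    /// Provided: borrowed segments between consecutive opportunities.
    fn segments<'a>(&self, word: &'a str) -> Vec<&'a str> {
        let mut out = Vec::new();
        let mut start = 0;
        for end in self.opportunities(word) {
            out.push(&word[start..end]);
            start = end;
        }
        out.push(&word[start..]);
        out
    }

    /// Provided: the word with `mark` inserted at every opportunity.
    fn punctuate_with(&self, word: &str, mark: &str) -> String {
        self.segments(word).join(mark)
    }
}

/// Toy implementor: a fixed table of known words (an assumption made
/// for illustration, standing in for a loaded corpus).
struct Lookup(HashMap<&'static str, Vec<usize>>);

impl BreakFinder for Lookup {
    fn opportunities(&self, word: &str) -> Vec<usize> {
        // Unknown words get no break opportunities.
        self.0.get(word).cloned().unwrap_or_default()
    }
}

/// Builds a toy finder that knows a single word.
fn toy_finder() -> Lookup {
    let mut table = HashMap::new();
    table.insert("hyphenation", vec![2, 6]);
    Lookup(table)
}
```

Because `segments` and `punctuate_with` only call `opportunities`, a new implementor gets both for free by supplying the index computation alone.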
Reexports
pub use hyphenator::{Hyphenation, Standard};
pub use language::{Language, Corpus};
pub use load::{language as load};
Modules
exception: Data structures and methods for parsing and applying exceptions, which assign predetermined scores to specific words.
hyphenator: Hyphenating iterators.
language: Languages we can hyphenate and their default parameters, as provided by the TeX
load: IO operations for pattern and exception data provided by
pattern: Data structures and methods for parsing and applying Knuth-Liang hyphenation patterns.