Crate hyphenation
Text hyphenation in a variety of languages.
Usage
A typical import comprises the Hyphenation trait, the Standard hyphenator, and the Language enum. This exposes the crate's core functionality and the set of available languages.
extern crate hyphenation;
use hyphenation::{Hyphenation, Standard, Language};
To begin with, we must load the Corpus for our working language.
let english_us = hyphenation::load(Language::English_US).unwrap();
Our English Corpus can now be used by Hyphenation methods.
Core functionality is provided by opportunities(), which returns the byte indices of valid hyphenation points within a word.
let indices = "hyphenation".opportunities(&english_us);
assert_eq!(indices, vec![2, 6]);
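To make the meaning of these byte indices concrete, the following standard-library sketch splits a word at a list of indices such as the one returned above. The helper `split_at_indices` is hypothetical and not part of the crate; it assumes the indices are sorted and fall on character boundaries.

```rust
/// Hypothetical helper (not part of the crate): split `word` at the given
/// byte indices. Assumes the indices are sorted in ascending order and that
/// each one falls on a UTF-8 character boundary.
fn split_at_indices<'a>(word: &'a str, indices: &[usize]) -> Vec<&'a str> {
    let mut segments = Vec::with_capacity(indices.len() + 1);
    let mut start = 0;
    for &i in indices {
        // Each index marks the byte offset where a new segment begins.
        segments.push(&word[start..i]);
        start = i;
    }
    // Whatever follows the last index forms the final segment.
    segments.push(&word[start..]);
    segments
}
```

With the indices 2 and 6 from the example above, this yields the segments "hy", "phen" and "ation", all borrowed from the original word.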
The same Corpus may also be used by hyphenators: iterators which segment words in accordance with hyphenation practices, as described by the corpus.
The simplest (and, presently, only) hyphenator is Standard:
let h: Standard = "hyphenation".hyphenate(&english_us);
The Standard hyphenator does not allocate new strings, returning slices instead.
let v: Vec<&str> = h.collect();
assert_eq!(v, vec!["hy", "phen", "ation"]);
A hyphenator always knows its exact length, which means that we can retrieve the number of remaining word segments with .len().
let mut iter = "hyphenation".hyphenate(&english_us);
assert_eq!(iter.len(), 3);
iter.next();
assert_eq!(iter.len(), 2);
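As an illustration of how a slice-returning iterator can report its exact remaining length, here is a minimal standard-library sketch. The `Segments` type is hypothetical and is not the crate's Standard hyphenator; it takes precomputed break indices rather than consulting a Corpus.

```rust
/// Hypothetical iterator (not the crate's `Standard`): yields slices of
/// `word` delimited by precomputed byte indices, without allocating strings.
struct Segments<'a> {
    word: &'a str,
    indices: Vec<usize>, // sorted break points, each on a char boundary
    yielded: usize,      // number of segments already returned
    start: usize,        // byte offset where the next segment begins
}

impl<'a> Segments<'a> {
    fn new(word: &'a str, indices: Vec<usize>) -> Self {
        Segments { word, indices, yielded: 0, start: 0 }
    }
}

impl<'a> Iterator for Segments<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        // n break points produce n + 1 segments in total.
        if self.yielded > self.indices.len() {
            return None;
        }
        // The last segment runs to the end of the word.
        let end = self
            .indices
            .get(self.yielded)
            .copied()
            .unwrap_or(self.word.len());
        let segment = &self.word[self.start..end];
        self.start = end;
        self.yielded += 1;
        Some(segment)
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // The remaining count is known exactly, so both bounds agree.
        let remaining = (self.indices.len() + 1).saturating_sub(self.yielded);
        (remaining, Some(remaining))
    }
}

// An exact `size_hint` is what makes `.len()` available.
impl<'a> ExactSizeIterator for Segments<'a> {}
```

Because `size_hint` returns matching lower and upper bounds, the default `ExactSizeIterator::len` can be used as is, and it decreases by one after each call to `next()`.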
Full-text Hyphenation
While hyphenation is always performed on a per-word basis, convenience calls for a subtrait to provide methods to work with full text.
use hyphenation::{FullTextHyphenation};
let h2: Standard = "Word hyphenation by computer.".fulltext_hyphenate(&english_us);
let v2: Vec<&str> = h2.collect();
assert_eq!(v2, vec!["Word hy", "phen", "ation by com", "puter."]);
Hyphenators also expose some simple methods to render hyphenated text: punctuate() and punctuate_with(string), which mark hyphenation opportunities with soft hyphens (Unicode U+00AD SOFT HYPHEN) and with any given string, respectively.
let h3 = "anfractuous".hyphenate(&english_us);
let s3: String = h3.clone().punctuate().collect();
assert_eq!(s3, "an\u{ad}frac\u{ad}tu\u{ad}ous".to_owned());
let s4: String = h3.punctuate_with("‧").collect();
assert_eq!(s4, "an‧frac‧tu‧ous".to_owned());
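The effect of marking hyphenation opportunities can be reproduced with the standard library alone. The sketch below assumes the word has already been split into segments; `mark_breaks` is a hypothetical helper, not a crate function, and it builds a new String rather than returning an iterator.

```rust
/// Hypothetical helper (not part of the crate): rebuild a word from its
/// segments, inserting `mark` at every break between two segments.
fn mark_breaks(segments: &[&str], mark: &str) -> String {
    let mut out = String::new();
    for (i, segment) in segments.iter().enumerate() {
        // The mark goes between segments, never before the first one.
        if i > 0 {
            out.push_str(mark);
        }
        out.push_str(segment);
    }
    out
}
```

Passing "\u{ad}" as the mark inserts soft hyphens, which are normally invisible until a renderer breaks the line at that point; a visible mark such as "‧" is convenient for inspecting the result.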
Reexports
pub use hyphenator::Hyphenation;
pub use hyphenator::FullTextHyphenation;
pub use hyphenator::Standard;
pub use language::Language;
pub use language::Corpus;
pub use load::language as load;
Modules
hyphenator
Hyphenating iterators.
language
Languages we can hyphenate and their default parameters, as provided by the TeX
load
IO operations for pattern and exception data provided by
Structs
Exceptions
A specialized hash map of pattern-score pairs.
Patterns
A basic trie, used to associate patterns to their hyphenation scores.
Traits
KLPTrie
Type Definitions
KLPair
A pair representing a Knuth-Liang hyphenation pattern. It comprises alphabetical characters for subword matching and the score of each hyphenation point.
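To illustrate the idea of such a pair, the sketch below parses a pattern written in the usual TeX notation, where digits between letters give the score of each potential break. The function `parse_pattern`, its return type, and the use of `u8` for scores are assumptions made for this example; the crate's actual KLPair definition may differ.

```rust
/// Hypothetical parser (the crate's own representation may differ): turn a
/// TeX-style pattern such as "hy3ph" into its letters and one score per
/// inter-letter position, including the positions before the first letter
/// and after the last one.
fn parse_pattern(raw: &str) -> (String, Vec<u8>) {
    let mut letters = String::new();
    // Start with the score of the position before the first letter.
    let mut scores = vec![0u8];
    for c in raw.chars() {
        if let Some(d) = c.to_digit(10) {
            // A digit sets the score of the current position.
            *scores.last_mut().unwrap() = d as u8;
        } else {
            // A letter (or the '.' word-boundary marker) advances to the
            // next position, which defaults to a score of zero.
            letters.push(c);
            scores.push(0);
        }
    }
    (letters, scores)
}
```

The score vector is always one element longer than the letter string. In Knuth-Liang hyphenation, an odd final score at a position conventionally permits a break there, while an even score forbids it.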