Crate hyphenation

Text hyphenation in a variety of languages.


A typical import comprises the Hyphenation trait, the Standard hyphenator, and the Language enum. This exposes the crate's core functionality, and the set of available languages.

extern crate hyphenation;
use hyphenation::{Hyphenation, Standard, Language};

To begin, we must load the Corpus for our working language.

let english_us = hyphenation::load(Language::English_US).unwrap();

Our English Corpus can now be used by Hyphenation methods. Core functionality is provided by opportunities(), which returns the byte indices of valid hyphenation points within a word.

let indices = "hyphenation".opportunities(&english_us);
assert_eq!(indices, vec![2, 6]);
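To illustrate what these byte indices mean, here is a minimal sketch that uses only the standard library to cut a word at a given list of indices. The helper `split_at_opportunities` is hypothetical and not part of this crate; it assumes the indices are sorted in ascending order and fall on `char` boundaries, as byte offsets into a `&str` must.

```rust
/// Hypothetical helper (not part of the `hyphenation` crate): split `word`
/// at the given byte indices, borrowing every segment from the original string.
/// Assumes `indices` is sorted ascending and each index is a `char` boundary.
fn split_at_opportunities<'a>(word: &'a str, indices: &[usize]) -> Vec<&'a str> {
    let mut segments = Vec::with_capacity(indices.len() + 1);
    let mut start = 0;
    for &end in indices {
        // Each opportunity closes the current segment and opens the next one.
        segments.push(&word[start..end]);
        start = end;
    }
    // Whatever follows the last opportunity is the final segment.
    segments.push(&word[start..]);
    segments
}
```

Called with `"hyphenation"` and the indices `[2, 6]` from the example above, this yields the slices `"hy"`, `"phen"` and `"ation"`.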

The same Corpus may also be used by hyphenators: iterators which segment words in accordance with hyphenation practices, as described by the corpus.

The simplest (and, presently, only) hyphenator is Standard:

let h: Standard = "hyphenation".hyphenate(&english_us);

The Standard hyphenator does not allocate new strings, returning slices instead.

let v: Vec<&str> = h.collect();
assert_eq!(v, vec!["hy", "phen", "ation"]);

A hyphenator always knows its exact length, which means that we can retrieve the number of remaining word segments with .len().

let mut iter = "hyphenation".hyphenate(&english_us);
assert_eq!(iter.len(), 3);
iter.next();
assert_eq!(iter.len(), 2);
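The exact-length behaviour corresponds to the standard `ExactSizeIterator` trait. As a rough sketch of how a segmenting iterator can support `.len()`, the hypothetical `Segments` type below stands in for a hyphenator. It is not the crate's `Standard` implementation, and it takes precomputed break indices as an assumption instead of consulting a Corpus.

```rust
/// Hypothetical stand-in for a hyphenator (not the crate's `Standard`):
/// yields the slices of `word` delimited by `breaks`, which are assumed to be
/// sorted byte indices lying on `char` boundaries.
struct Segments<'a> {
    word: &'a str,
    breaks: &'a [usize],
    start: usize,
    finished: bool,
}

impl<'a> Segments<'a> {
    fn new(word: &'a str, breaks: &'a [usize]) -> Self {
        Segments { word, breaks, start: 0, finished: false }
    }
}

impl<'a> Iterator for Segments<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        if self.finished {
            return None;
        }
        match self.breaks.split_first() {
            // A remaining break closes the current segment.
            Some((&end, rest)) => {
                let segment = &self.word[self.start..end];
                self.start = end;
                self.breaks = rest;
                Some(segment)
            }
            // No breaks left: emit the tail of the word exactly once.
            None => {
                self.finished = true;
                Some(&self.word[self.start..])
            }
        }
    }

    // Reporting identical lower and upper bounds is what lets
    // `ExactSizeIterator::len` return the true number of remaining segments.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = if self.finished { 0 } else { self.breaks.len() + 1 };
        (remaining, Some(remaining))
    }
}

impl<'a> ExactSizeIterator for Segments<'a> {}
```

With the breaks `[2, 6]` over `"hyphenation"`, `len()` reports 3 before the first call to `next()` and 2 after it, mirroring the example above.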

Full-text Hyphenation

While hyphenation is always performed on a per-word basis, convenience calls for a subtrait to provide methods to work with full text.

use hyphenation::{FullTextHyphenation};

let h2: Standard = "Word hyphenation by computer.".fulltext_hyphenate(&english_us);
let v2: Vec<&str> = h2.collect();
assert_eq!(v2, vec!["Word hy", "phen", "ation by com", "puter."]);

Hyphenators also expose some simple methods to render hyphenated text: punctuate() and punctuate_with(string), which mark hyphenation opportunities respectively with soft hyphens (Unicode U+00AD SOFT HYPHEN) and any given string.

let h3 = "anfractuous".hyphenate(&english_us);
let s3: String = h3.clone().punctuate().collect();
assert_eq!(s3, "an\u{ad}frac\u{ad}tu\u{ad}ous".to_owned());

let s4: String = h3.punctuate_with("‧").collect();
assert_eq!(s4, "an‧frac‧tu‧ous".to_owned());
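Conceptually, marking opportunities amounts to joining the word segments with a separator. The sketch below shows that idea with the standard library alone; `mark_breaks` and `mark_breaks_with` are hypothetical free functions named for illustration, and they say nothing about how the crate's own `punctuate()` methods are implemented.

```rust
/// Hypothetical helper: join word segments, placing `mark` between each pair.
fn mark_breaks_with(segments: &[&str], mark: &str) -> String {
    segments.join(mark)
}

/// Hypothetical helper: join word segments with U+00AD SOFT HYPHEN, which
/// renderers display only where a line is actually broken.
fn mark_breaks(segments: &[&str]) -> String {
    mark_breaks_with(segments, "\u{ad}")
}
```

Given the segments `["an", "frac", "tu", "ous"]`, `mark_breaks_with(.., "‧")` produces `"an‧frac‧tu‧ous"`, the same string as the example above.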


Reexports

pub use hyphenator::Hyphenation;
pub use hyphenator::FullTextHyphenation;
pub use hyphenator::Standard;
pub use language::Language;
pub use language::Corpus;
pub use load::language as load;



Hyphenating iterators.


Languages we can hyphenate and their default parameters, as provided by the TeX hyph-utf8 package.


I/O operations for pattern and exception data provided by hyph-utf8 and stored in the patterns folder.



A specialized hash map of pattern-score pairs.


A basic trie, used to associate patterns to their hyphenation scores.



Type Definitions


A pair representing a Knuth-Liang hyphenation pattern. It comprises alphabetical characters for subword matching and the score of each hyphenation point.
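As an illustration of such a pair, Knuth-Liang patterns are conventionally written with digits interleaved among the letters, for example `hy3ph`. The sketch below separates a pattern in that textual form into its letters and one score per inter-letter position. The function `parse_pattern` and its `(String, Vec<u8>)` return type are assumptions made for this example; the crate's actual type definition may be laid out differently.

```rust
/// Hypothetical parser (not the crate's API): split a textual Liang pattern
/// such as "hy3ph" into its letters and the score at each position.
/// There is always one more score than there are letters, since positions
/// sit before, between, and after the letters. Unwritten scores are 0.
fn parse_pattern(pattern: &str) -> (String, Vec<u8>) {
    let mut letters = String::new();
    // Start with the position before the first letter.
    let mut scores = vec![0u8];
    for c in pattern.chars() {
        match c.to_digit(10) {
            // A digit scores the position currently open.
            Some(d) => {
                *scores.last_mut().unwrap() = d as u8;
            }
            // Anything else (including the "." word-boundary marker) counts
            // as a letter and opens a new, unscored position after it.
            None => {
                letters.push(c);
                scores.push(0);
            }
        }
    }
    (letters, scores)
}
```

For `hy3ph` this gives the letters `hyph` and the scores `[0, 0, 3, 0, 0]`. In Liang's algorithm, the highest score from any matching pattern wins at each position, and an odd value marks a permitted break.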