// Forsaken docs justly quibble the vexed programmer's waning zeal
//! Text hyphenation in a variety of languages.
//!
//!
//! ## Usage
//!
//! A typical import comprises the `Hyphenation` trait, the `Standard`
//! hyphenator, and the `Language` enum. This exposes the crate's core
//! functionality, and the set of available languages.
//!
//! ```ignore
//! extern crate hyphenation;
//!
//! use hyphenation::{Hyphenation, Standard, Language};
//! ```
//!
//! To begin with, we must load the `Corpus` for our working language.
//!
//! ```ignore
//! let english_us = hyphenation::load(Language::English_US).unwrap();
//! ```
//!
//! Our English `Corpus` can now be used by `Hyphenation` methods.
//! Core functionality is provided by `opportunities()`, which returns the
//! byte indices of valid hyphenation points within a word.
//!
//! ```ignore
//! let indices = "hyphenation".opportunities(&english_us);
//! assert_eq!(indices, vec![2, 6]);
//! ```
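//!
//! Because the indices are byte offsets into the original word, they can be
//! used directly to slice it. As a rough illustration that relies only on
//! the standard library, the sketch below splits a word at a list of such
//! indices; `split_at_indices` is a hypothetical helper, not part of this
//! crate's API.
//!
//! ```rust
//! // Hypothetical helper (an assumption for illustration, not crate API):
//! // splits `word` at each byte index, returning borrowed slices.
//! // Assumes the indices are ascending and fall on `char` boundaries.
//! fn split_at_indices<'a>(word: &'a str, indices: &[usize]) -> Vec<&'a str> {
//!     let mut segments = Vec::with_capacity(indices.len() + 1);
//!     let mut start = 0;
//!     for &i in indices {
//!         segments.push(&word[start..i]);
//!         start = i;
//!     }
//!     // The final segment runs from the last index to the end of the word.
//!     segments.push(&word[start..]);
//!     segments
//! }
//! ```
//!
//! With the indices above, `split_at_indices("hyphenation", &[2, 6])`
//! would give `["hy", "phen", "ation"]`.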
//!
//! The same `Corpus` may also be used by *hyphenators*: iterators which
//! segment words in accordance with hyphenation practices, as described
//! by the corpus.
//!
//! The simplest (and, presently, only) hyphenator is `Standard`:
//!
//! ```ignore
//! let h: Standard = "hyphenation".hyphenate(&english_us);
//! ```
//!
//! The `Standard` hyphenator does not allocate new strings, returning
//! slices instead.
//!
//! ```ignore
//! let v: Vec<&str> = h.collect();
//! assert_eq!(v, vec!["hy", "phen", "ation"]);
//! ```
//!
//! A hyphenator always knows its exact length, which means that we can
//! retrieve the number of remaining word segments with `.len()`.
//!
//! ```ignore
//! let mut iter = "hyphenation".hyphenate(&english_us);
//! assert_eq!(iter.len(), 3);
//! iter.next();
//! assert_eq!(iter.len(), 2);
//! ```
//!
//!
//! ## Full-text Hyphenation
//!
//! While hyphenation is always performed on a per-word basis, the
//! `FullTextHyphenation` subtrait provides convenience methods that work
//! with full text.
//!
//! ```ignore
//! use hyphenation::FullTextHyphenation;
//!
//! let h2: Standard = "Word hyphenation by computer.".fulltext_hyphenate(&english_us);
//! let v2: Vec<&str> = h2.collect();
//! assert_eq!(v2, vec!["Word hy", "phen", "ation by com", "puter."]);
//! ```
//!
//! Hyphenators also expose some simple methods to render hyphenated text:
//! `punctuate()` and `punctuate_with(string)`, which mark hyphenation
//! opportunities with soft hyphens (Unicode `U+00AD SOFT HYPHEN`) and with
//! any given `string`, respectively.
//!
//! ```ignore
//! let h3 = "anfractuous".hyphenate(&english_us);
//! let s3: String = h3.clone().punctuate().collect();
//! assert_eq!(s3, "an\u{ad}frac\u{ad}tu\u{ad}ous".to_owned());
//!
//! let s4: String = h3.punctuate_with("‧").collect();
//! assert_eq!(s4, "an‧frac‧tu‧ous".to_owned());
//! ```
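//!
//! The effect of punctuation can be approximated with the standard library
//! alone: it amounts to joining the word segments with the chosen mark. The
//! sketch below is only an illustration of that outcome, not of how the
//! hyphenator produces it; `join_segments` is a hypothetical helper that
//! does not exist in this crate.
//!
//! ```rust
//! // Hypothetical helper (an assumption for illustration, not crate API):
//! // joins already-split segments, placing `mark` between each pair.
//! fn join_segments(segments: &[&str], mark: &str) -> String {
//!     segments.join(mark)
//! }
//! ```
//!
//! For instance, `join_segments(&["an", "frac", "tu", "ous"], "\u{ad}")`
//! would give `"an\u{ad}frac\u{ad}tu\u{ad}ous"`.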

extern crate bincode;
extern crate fnv;
extern crate hyphenation_commons;
extern crate unicode_segmentation;

mod resources;
mod utilia;
pub mod hyphenator;
pub mod language;
pub mod load;

pub use hyphenation_commons::{KLPair, KLPTrie, Exceptions, Patterns};
pub use hyphenator::{Hyphenation, FullTextHyphenation, Standard};
pub use language::{Language, Corpus};

// Note: the name "load" is misleading, as we are merely accessing embedded tries.
// However, future versions of `hyphenation` should support both embedding
// and runtime loading of pattern data, with loading being the default;
// anticipating such changes, we keep `load` as a public export.
pub use load::{language as load};