tengwar/
lib.rs

//! Library for conversion of Latin UTF-8 text into Tengwar, using the Unicode
//!     codepoints of the [Free Tengwar Font Project]. Specifically, but not
//!     exclusively, designed with [Tengwar Telcontar] in mind, for the purpose
//!     of use within LaTeX macros.
//!
//! [Free Tengwar Font Project]: http://freetengwar.sourceforge.net/mapping.html
//! [Tengwar Telcontar]: http://freetengwar.sourceforge.net/tengtelc.html
//!
//! # Overview
//!
//! The library is split into two main modules. The [`characters`] module is
//!     primarily concerned with defining the data and data structures needed
//!     to represent Tengwar. The [`mode`] module, on the other hand, is mainly
//!     concerned with transcription, defining the [`TengwarMode`] trait for the
//!     rules and the [`Tokenizer`](mode::Tokenizer) type for applying them.
//!
//! However, this first level of transcription is usually not enough; therefore,
//!     the top level of the crate defines the [`TokenIter`] type to perform
//!     additional transformations. This higher-level iterator can be configured
//!     at runtime, and is capable of looking ahead and behind to determine the
//!     context, enabling critical situational behaviors.
//!
//! Three modes are currently provided by default: [`Quenya`] ("Classical"),
//!     [`Beleriand`], and [`Gondor`]. Each mode implements the [`TengwarMode`]
//!     trait.
//!
//! # Examples
//!
//! [`collect`]: Iterator::collect
//!
//! ## `TengwarMode` trait
//!
//! The most direct way to convert text is [`TengwarMode::transcribe`]. This
//!     function accepts any input type that implements `AsRef<str>`, and can
//!     return any type that implements `FromIterator<Token>`; this includes
//!     `Vec<Token>` and [`String`].
//! ```
//! use tengwar::{Quenya, TengwarMode};
//!
//! let text: String = Quenya::transcribe("namárië !");
//! assert_eq!(text, " ");
//! ```
//!
//! ## `ToTengwar` trait
//!
//! With the use of the [`ToTengwar`] helper trait (automatically implemented
//!     for any type implementing `AsRef<str>`), three methods are provided on
//!     the input type directly. The first is [`ToTengwar::transcriber`], which
//!     constructs a [`Transcriber`] for the text, allowing iteration over
//!     [`Token`]s.
//!
//! The `Transcriber` also has [`TranscriberSettings`], holding several public
//!     fields, which can be changed to adjust various aspects of its behavior.
//! ```
//! use tengwar::{Quenya, ToTengwar};
//!
//! let mut transcriber = "namárië !".transcriber::<Quenya>();
//! transcriber.settings.alt_a = true; // Use the alternate form of the A-tehta.
//!
//! let text: String = transcriber.collect();
//! assert_eq!(text, " ");
//! ```
//!
//! The second method is [`ToTengwar::to_tengwar`]. This is mostly a convenience
//!     method, which simply calls [`ToTengwar::transcriber`] and immediately
//!     [`collect`]s the Iterator into a [`String`].
//! ```
//! use tengwar::{Quenya, ToTengwar};
//!
//! let text: String = "namárië !".to_tengwar::<Quenya>();
//! assert_eq!(text, " ");
//! ```
//!
//! The third method is [`ToTengwar::to_tengwar_with`], which does the same, but
//!     takes [`TranscriberSettings`] to modify the [`Transcriber`] before it is
//!     collected. This allows settings to be specified once and reused.
//! ```
//! use tengwar::{Quenya, ToTengwar, TranscriberSettings};
//!
//! let mut settings = TranscriberSettings::new();
//! settings.alt_a = true;
//! settings.nuquerna = true;
//!
//! let text: String = "namárië !".to_tengwar_with::<Quenya>(settings);
//! assert_eq!(text, " ");
//!
//! let text: String = "lotsë súva".to_tengwar_with::<Quenya>(settings);
//! assert_eq!(text, " ");
//! ```
//!
//! ## Crate-level function
//!
//! Also available, and likely the easiest to discover via code completion, is
//!     the top-level [`transcribe`] function, which takes an implementor of
//!     [`TengwarMode`] as a generic parameter. This function accepts any input
//!     type that implements [`ToTengwar`], and is a passthrough to the
//!     [`ToTengwar::to_tengwar`] method.
//! ```
//! use tengwar::{Quenya, transcribe};
//!
//! let text: String = transcribe::<Quenya>("namárië !");
//! assert_eq!(text, " ");
//! ```
//!
//! ---
//! # In Detail
//!
//! The core of this library is the [`Token`] enum. A `Token` may hold a simple
//!     [`char`], a [`Glyph`], or a [`Numeral`]. An iterator of `Token`s can be
//!     [`collect`]ed into a [`String`]; this is where the rendering of Tengwar
//!     text truly takes place.
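//!
//! As a rough illustration of how collecting can double as rendering, a type
//!     can implement `FromIterator` for [`String`]. The `SketchToken` enum and
//!     its variants below are hypothetical simplifications made up for this
//!     sketch; they are not the real [`Token`] definition.
//! ```rust
//! use std::iter::FromIterator;
//!
//! // Hypothetical, simplified token: a plain `char`, or a glyph made of a
//! // base character followed by an optional combining mark.
//! enum SketchToken {
//!     Char(char),
//!     Glyph { base: char, mark: Option<char> },
//! }
//!
//! // Rendering happens here: each token decides which `char`s it writes.
//! impl FromIterator<SketchToken> for String {
//!     fn from_iter<I: IntoIterator<Item = SketchToken>>(iter: I) -> Self {
//!         let mut out = String::new();
//!
//!         for token in iter {
//!             match token {
//!                 SketchToken::Char(c) => out.push(c),
//!                 SketchToken::Glyph { base, mark } => {
//!                     out.push(base);
//!                     // `Option<char>` iterates over zero or one `char`.
//!                     out.extend(mark);
//!                 }
//!             }
//!         }
//!
//!         out
//!     }
//! }
//! ```
//! With this in place, collecting an iterator of `SketchToken`s into a
//!     `String` writes each base character, followed by its mark when one is
//!     present.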
//!
//! The rest of the library is geared around the creation of `Token`s, usually
//!     by iteration, and modifying them before the final call to `collect`.
//!
//! ## Mode
//!
//! A "Mode" of the Tengwar is essentially an orthography mapping; it correlates
//!     conventions of writing in a primary world alphabet to the conventions of
//!     writing in the Tengwar.
//!
//! For this purpose, the [`TengwarMode`] trait is provided. A type implementing
//!     this trait is expected to perform essentially as a state machine, taking
//!     input in the form of slices of `char`s, and using them to progressively
//!     construct `Token`s.
//!
//! ## Tokenizer
//!
//! The first level of iteration is the [`Tokenizer`](mode::Tokenizer). This
//!     iterator takes UTF-8 text, breaks it down into a [`Vec`] of normalized
//!     Unicode codepoints, and assembles [`Token`]s according to the rules
//!     specified by an implementation of [`TengwarMode`].
//!
//! Short slices of `char`s are passed to the Mode type, which determines
//!     whether to accept them as part of a `Token`. If the `char`s are not
//!     accepted, the slice is narrowed and tried again, until the width reaches
//!     zero; at this point, the Mode type is shown the full remaining data and
//!     asked whether it can get anything at all from it. If it cannot, a `char`
//!     is returned unchanged as a `Token`.
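//!
//! The narrowing loop described above can be sketched in isolation. This is a
//!     minimal illustration and not the code of the real `Tokenizer`: the
//!     names `SketchToken`, `MAX_WIDTH`, `accept`, and `tokenize_sketch` are
//!     all hypothetical, the rule table is invented, and the final step of
//!     showing the Mode the full remaining data is left out for brevity.
//! ```rust
//! // Hypothetical stand-in for a token: a recognized unit, or a `char`
//! // passed through unchanged.
//! #[derive(Debug, PartialEq)]
//! enum SketchToken {
//!     Unit(&'static str),
//!     Raw(char),
//! }
//!
//! // Hypothetical widest slice that is offered to the rules.
//! const MAX_WIDTH: usize = 3;
//!
//! // Hypothetical rule table deciding whether a slice is accepted.
//! fn accept(chunk: &[char]) -> Option<SketchToken> {
//!     match chunk {
//!         ['t', 'h'] => Some(SketchToken::Unit("th")),
//!         ['t'] => Some(SketchToken::Unit("t")),
//!         ['a'] => Some(SketchToken::Unit("a")),
//!         _ => None,
//!     }
//! }
//!
//! // Offer the widest slice first and narrow it until the rules accept one.
//! // If no width is accepted, pass a single `char` through unchanged.
//! fn tokenize_sketch(text: &str) -> Vec<SketchToken> {
//!     let chars: Vec<char> = text.chars().collect();
//!     let mut tokens = Vec::new();
//!     let mut head = 0;
//!
//!     while head < chars.len() {
//!         let widest = MAX_WIDTH.min(chars.len() - head);
//!         let mut matched = None;
//!
//!         for width in (1..=widest).rev() {
//!             if let Some(token) = accept(&chars[head..head + width]) {
//!                 matched = Some((token, width));
//!                 break;
//!             }
//!         }
//!
//!         match matched {
//!             Some((token, width)) => {
//!                 tokens.push(token);
//!                 head += width;
//!             }
//!             None => {
//!                 tokens.push(SketchToken::Raw(chars[head]));
//!                 head += 1;
//!             }
//!         }
//!     }
//!
//!     tokens
//! }
//! ```
//! Under these invented rules, `tokenize_sketch("that")` yields the units
//!     `th`, `a`, and `t`: the three-wide slice `tha` is rejected, so the
//!     slice is narrowed to `th`, which is accepted.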
//!
//! When the `Tokenizer` yields a `Token`, the following one is generated as
//!     well. This allows for one last call to the Mode type, to
//!     [`TengwarMode::finalize`], to modify a `Token` in light of the one that
//!     follows it; this is a very important step, as some modes require that
//!     different base characters are used depending on what follows them.
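//!
//! One way to picture this one-token lookahead is with
//!     [`std::iter::Peekable`]. The sketch below is an illustration only: it
//!     uses plain `String`s in place of `Token`s, and `finalize_sketch`,
//!     `WithLookahead`, `with_lookahead`, and the rewriting rule itself are
//!     hypothetical. It does not reproduce the real signature of
//!     [`TengwarMode::finalize`].
//! ```rust
//! use std::iter::Peekable;
//!
//! // Hypothetical rule: a unit written "r" becomes "R" when the unit after
//! // it begins with a vowel. Real modes define their own rules.
//! fn finalize_sketch(token: &mut String, next: Option<&String>) {
//!     let vowel_follows = next
//!         .and_then(|n| n.chars().next())
//!         .map_or(false, |c| "aeiou".contains(c));
//!
//!     if token.as_str() == "r" && vowel_follows {
//!         *token = String::from("R");
//!     }
//! }
//!
//! // Adapter holding the inner iterator behind `Peekable`, so that the
//! // following item can be inspected while the current one is finalized.
//! struct WithLookahead<I: Iterator<Item = String>> {
//!     inner: Peekable<I>,
//! }
//!
//! impl<I: Iterator<Item = String>> Iterator for WithLookahead<I> {
//!     type Item = String;
//!
//!     fn next(&mut self) -> Option<String> {
//!         let mut token = self.inner.next()?;
//!         // `peek` generates the following item without consuming it.
//!         finalize_sketch(&mut token, self.inner.peek());
//!         Some(token)
//!     }
//! }
//!
//! fn with_lookahead<I>(tokens: I) -> WithLookahead<I>
//!     where I: Iterator<Item = String>
//! {
//!     WithLookahead { inner: tokens.peekable() }
//! }
//! ```
//! Under this invented rule, the units `r`, `a`, `r`, `t` come out as `R`,
//!     `a`, `r`, `t`: only the first `r` is followed by a vowel.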
//!
//! ## TokenIter / Transcriber
//!
//! The second level of iteration is the [`TokenIter`]. This iterator can wrap
//!     any other iterator that produces [`Token`]s, and its purpose is to apply
//!     contextual rules and transformations specified at runtime. This is what
//!     allows the executable transcriber to take CLI options that change rules,
//!     such as the treatment of "long" tehta variants.
//!
//! A `TokenIter` that wraps a [`Tokenizer`](mode::Tokenizer) can also be called
//!     a [`Transcriber`] for simplicity, because it is known that its `Token`s
//!     are being produced directly from text.
//!
//! ## Policy
//!
//! A "Policy" is similar to a Mode, but rather than defining details about
//!     **orthography**, it instead defines details about **typography**. This
//!     includes details such as valid ligatures and placements of *Sa-Rinci*.
//!
//! The [`Policy`](policy::Policy) trait is provided for this purpose, and is
//!     used as a generic parameter for the [`Glyph`] type. Because of this, it
//!     is also a generic parameter for the [`Token`] and [`TokenIter`] types.
//!     The [`Tokenizer`](mode::Tokenizer) type is considered to be out of scope
//!     of the Policy system, and simply yields all of its `Token`s with the
//!     default policy ([`policy::Standard`]).

#[macro_use]
extern crate cfg_if;
#[macro_use]
extern crate clap;
#[macro_use]
#[cfg(feature = "serde")]
extern crate serde;

// mod macros;

pub mod characters;
pub mod mode;
pub mod policy;

mod iter;
mod token;

pub use characters::{Glyph, Numeral, VowelStyle};
pub use iter::{TokenIter, Transcriber, TranscriberSettings};
pub use mode::{Beleriand, Gondor, Quenya, TengwarMode};
pub use token::Token;


/// Convert a compatible object (typically text) into the Tengwar.
///
/// This function merely calls a trait method, but is likely the most readily
///     discoverable part of the library when using code completion tools.
pub fn transcribe<M: TengwarMode + Default>(text: impl ToTengwar) -> String {
    text.to_tengwar::<M>()
}


/// A very small trait serving to implement ergonomic transcription methods
///     directly onto text objects.
pub trait ToTengwar {
    /// Create a [`Transcriber`] to iteratively transcribe this text into the
    ///     Tengwar. The returned iterator will yield [`Token`]s.
    ///
    /// # Example
    /// ```
    /// use tengwar::{Quenya, ToTengwar, VowelStyle};
    ///
    /// const INPUT: &str = "lotsë súva"; // "a flower is sinking"
    ///
    ///
    /// //  Collect directly with default settings.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// //  Use Unique Tehtar.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.vowels = VowelStyle::Unique;
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// //  Use Nuquernë Tengwar.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.nuquerna = true;
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// //  Use Unique Tehtar and Nuquernë Tengwar.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.nuquerna = true;
    /// ts.settings.vowels = VowelStyle::Unique;
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// //  Use several options.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.alt_a = true;
    /// ts.settings.alt_rince = true;
    /// ts.settings.nuquerna = true;
    /// ts.settings.vowels = VowelStyle::Separate;
    /// assert_eq!(ts.into_string(), " ");
    /// ```
    fn transcriber<M: TengwarMode + Default>(&self) -> Transcriber<M>;

    /// Transcribe this object into the Tengwar directly.
    ///
    /// # Example
    /// ```
    /// use tengwar::{Quenya, ToTengwar};
    ///
    /// let text: String = "namárië !".to_tengwar::<Quenya>();
    /// assert_eq!(text, " ");
    /// ```
    fn to_tengwar<M: TengwarMode + Default>(&self) -> String {
        self.transcriber::<M>().into_string()
    }

    /// Transcribe this object into the Tengwar, using [`TranscriberSettings`]
    ///     provided as an argument. This allows the settings to be reused much
    ///     more easily.
    ///
    /// For examples of the available settings, see the documentation of
    ///     [`Self::transcriber`].
    ///
    /// # Example
    /// ```
    /// use tengwar::{Quenya, ToTengwar, TranscriberSettings};
    ///
    /// let mut settings = TranscriberSettings::new();
    /// settings.alt_a = true;
    /// settings.nuquerna = true;
    ///
    /// let text: String = "namárië !".to_tengwar_with::<Quenya>(settings);
    /// assert_eq!(text, " ");
    ///
    /// let text: String = "lotsë súva".to_tengwar_with::<Quenya>(settings);
    /// assert_eq!(text, " ");
    /// ```
    fn to_tengwar_with<M>(&self, settings: TranscriberSettings) -> String
        where M: TengwarMode + Default
    {
        self.transcriber::<M>().with_settings(settings).into_string()
    }
}

impl<S: AsRef<str>> ToTengwar for S {
    fn transcriber<M: TengwarMode + Default>(&self) -> Transcriber<M> {
        mode::Tokenizer::from_str(self).into_transcriber()
    }
}
294        mode::Tokenizer::from_str(self).into_transcriber()
295    }
296}