// tengwar/lib.rs
//! Library for conversion of Latin UTF-8 text into Tengwar, using the Unicode
//! codepoints of the [Free Tengwar Font Project]. Specifically, but not
//! exclusively, designed with [Tengwar Telcontar] in mind, for use within
//! LaTeX macros.
//!
//! [Free Tengwar Font Project]: http://freetengwar.sourceforge.net/mapping.html
//! [Tengwar Telcontar]: http://freetengwar.sourceforge.net/tengtelc.html
//!
//! # Overview
//!
//! The library is split into two main modules. The [`characters`] module is
//! primarily concerned with defining the data and data structures needed to
//! represent Tengwar. The [`mode`] module, on the other hand, is mainly
//! concerned with transcription, defining the [`TengwarMode`] trait for the
//! rules and the [`Tokenizer`](mode::Tokenizer) type for applying them.
//!
//! However, this first level of transcription is usually not enough; therefore,
//! the top level of the crate defines the [`TokenIter`] type to perform
//! additional transformations. This higher-level iterator can be configured
//! at runtime, and is capable of looking ahead and behind to determine the
//! context, enabling critical situational behaviors.
//!
//! Three modes are currently provided by default: [`Quenya`] ("Classical"),
//! [`Beleriand`], and [`Gondor`]. Each mode implements the [`TengwarMode`]
//! trait.
//!
//! # Examples
//!
//! [`collect`]: Iterator::collect
//!
//! ## `TengwarMode` trait
//!
//! The most direct way to convert text is [`TengwarMode::transcribe`]. This
//! function accepts any input type that implements `AsRef<str>`, and can
//! return any type that implements `FromIterator<Token>`; this includes
//! `Vec<Token>` and [`String`].
//! ```
//! use tengwar::{Quenya, TengwarMode};
//!
//! let text: String = Quenya::transcribe("namárië !");
//! assert_eq!(text, " ");
//! ```
//!
//! ## `ToTengwar` trait
//!
//! With the use of the [`ToTengwar`] helper trait (automatically implemented
//! for any type implementing `AsRef<str>`), three methods are provided on
//! the input type directly. The first is [`ToTengwar::transcriber`], which
//! constructs a [`Transcriber`] for the text, allowing iteration over
//! [`Token`]s.
//!
//! The `Transcriber` also has [`TranscriberSettings`], holding several public
//! fields, which can be changed to adjust various aspects of its behavior.
//! ```
//! use tengwar::{Quenya, ToTengwar};
//!
//! let mut transcriber = "namárië !".transcriber::<Quenya>();
//! transcriber.settings.alt_a = true; // Use the alternate form of the A-tehta.
//!
//! let text: String = transcriber.collect();
//! assert_eq!(text, " ");
//! ```
//!
//! The second method is [`ToTengwar::to_tengwar`]. This is mostly a convenience
//! method, which simply calls [`ToTengwar::transcriber`] and immediately
//! [`collect`]s the Iterator into a [`String`].
//! ```
//! use tengwar::{Quenya, ToTengwar};
//!
//! let text: String = "namárië !".to_tengwar::<Quenya>();
//! assert_eq!(text, " ");
//! ```
//!
//! The third method is [`ToTengwar::to_tengwar_with`], which does the same, but
//! takes [`TranscriberSettings`] to modify the [`Transcriber`] before it is
//! collected. This allows settings to be specified once and reused.
//! ```
//! use tengwar::{Quenya, ToTengwar, TranscriberSettings};
//!
//! let mut settings = TranscriberSettings::new();
//! settings.alt_a = true;
//! settings.nuquerna = true;
//!
//! let text: String = "namárië !".to_tengwar_with::<Quenya>(settings);
//! assert_eq!(text, " ");
//!
//! let text: String = "lotsë súva".to_tengwar_with::<Quenya>(settings);
//! assert_eq!(text, " ");
//! ```
//!
//! ## Crate-level function
//!
//! Also available, and likely the easiest to discover via code completion, is
//! the top-level [`transcribe`] function, which takes an implementor of
//! [`TengwarMode`] as a generic parameter. This function accepts any input
//! type that implements [`ToTengwar`], and is a passthrough to the
//! [`ToTengwar::to_tengwar`] method.
//! ```
//! use tengwar::{Quenya, transcribe};
//!
//! let text: String = transcribe::<Quenya>("namárië !");
//! assert_eq!(text, " ");
//! ```
//!
//! ---
//! # In Detail
//!
//! The core of this library is the [`Token`] enum. A `Token` may hold a simple
//! [`char`], a [`Glyph`], or a [`Numeral`]. An iterator of `Token`s can be
//! [`collect`]ed into a [`String`]; this is where the rendering of Tengwar
//! text truly takes place.
//!
//! The rest of the library is geared toward the creation of `Token`s, usually
//! by iteration, and modifying them before the final call to `collect`.
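//!
//! As a rough illustration of that final step, the sketch below collects a
//! stream of tokens into a [`String`] through a `FromIterator` implementation.
//! The `Tok` enum, the `render` helper, and the bracketed glyph names are
//! hypothetical stand-ins; they are not the crate's actual [`Token`] type or
//! its rendering logic.
//! ```rust
//! use std::iter::FromIterator;
//!
//! /// Hypothetical, simplified token: a plain char or a named glyph.
//! enum Tok {
//!     Char(char),
//!     Glyph(&'static str),
//! }
//!
//! // Implementing `FromIterator<Tok>` for `String` is what lets an iterator
//! // of tokens be collected straight into text.
//! impl FromIterator<Tok> for String {
//!     fn from_iter<I: IntoIterator<Item = Tok>>(iter: I) -> Self {
//!         let mut out = String::new();
//!         for tok in iter {
//!             match tok {
//!                 Tok::Char(c) => out.push(c),
//!                 // Assumption: a real renderer would emit font codepoints
//!                 // here; this sketch writes a bracketed name instead.
//!                 Tok::Glyph(name) => {
//!                     out.push('<');
//!                     out.push_str(name);
//!                     out.push('>');
//!                 }
//!             }
//!         }
//!         out
//!     }
//! }
//!
//! /// Render a sequence of tokens by collecting it.
//! fn render(tokens: Vec<Tok>) -> String {
//!     tokens.into_iter().collect()
//! }
//! ```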
//!
//! ## Mode
//!
//! A "Mode" of the Tengwar is essentially an orthography mapping; it correlates
//! conventions of writing in a primary world alphabet to the conventions of
//! writing in the Tengwar.
//!
//! For this purpose, the [`TengwarMode`] trait is provided. A type implementing
//! this trait is expected to perform essentially as a state machine, taking
//! input in the form of slices of `char`s, and using them to progressively
//! construct `Token`s.
//!
//! ## Tokenizer
//!
//! The first level of iteration is the [`Tokenizer`](mode::Tokenizer). This
//! iterator takes UTF-8 text, breaks it down into a [`Vec`] of normalized
//! Unicode codepoints, and assembles [`Token`]s according to the rules
//! specified by an implementation of [`TengwarMode`].
//!
//! Short slices of `char`s are passed to the Mode type, which determines
//! whether to accept them as part of a `Token`. If the `char`s are not
//! accepted, the slice is narrowed and tried again, until the width reaches
//! zero; at this point, the Mode type is shown the full remaining data and
//! asked whether it can get anything at all from it. If it cannot, a `char`
//! is returned unchanged as a `Token`.
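//!
//! The narrowing loop described above can be sketched roughly as follows. This
//! is a simplified illustration, not the crate's implementation: `match_chunk`,
//! `MAX_CHUNK`, `tokenize`, and the string tokens are all hypothetical, and
//! both normalization and the final "full remaining data" step are left out.
//! ```rust
//! /// Hypothetical stand-in for a mode: accepts a few fixed chunks.
//! fn match_chunk(chunk: &[char]) -> Option<String> {
//!     match chunk {
//!         ['t', 'h'] => Some("TH".to_string()),
//!         ['n', 'g'] => Some("NG".to_string()),
//!         ['t'] => Some("T".to_string()),
//!         _ => None,
//!     }
//! }
//!
//! /// Assumed widest slice offered to the matcher.
//! const MAX_CHUNK: usize = 3;
//!
//! fn tokenize(input: &str) -> Vec<String> {
//!     let chars: Vec<char> = input.chars().collect();
//!     let mut out = Vec::new();
//!     let mut pos = 0;
//!
//!     while pos < chars.len() {
//!         let widest = MAX_CHUNK.min(chars.len() - pos);
//!         let mut matched = None;
//!
//!         // Offer the widest slice first, narrowing by one char per failure.
//!         for width in (1..=widest).rev() {
//!             if let Some(tok) = match_chunk(&chars[pos..pos + width]) {
//!                 matched = Some((tok, width));
//!                 break;
//!             }
//!         }
//!
//!         match matched {
//!             Some((tok, width)) => {
//!                 out.push(tok);
//!                 pos += width;
//!             }
//!             None => {
//!                 // Nothing was accepted: pass one char through unchanged.
//!                 out.push(chars[pos].to_string());
//!                 pos += 1;
//!             }
//!         }
//!     }
//!     out
//! }
//! ```
//! Trying the widest slice first means that a digraph such as `th` is
//! preferred over its first letter alone.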
//!
//! Before the `Tokenizer` yields a `Token`, the following one is generated.
//! This allows for one last call to the Mode type, to
//! [`TengwarMode::finalize`], to modify a `Token` in light of the one that
//! follows it; this is a very important step, as some modes require that
//! different base characters are used depending on what follows them.
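//!
//! A minimal sketch of this one-token lookahead, built on
//! [`std::iter::Peekable`], is shown below. The `Tok` enum, the free function
//! `finalize`, and the rule it applies are invented for illustration; they do
//! not mirror the real signature or behavior of [`TengwarMode::finalize`].
//! ```rust
//! use std::iter::Peekable;
//!
//! /// Hypothetical token type for illustration only.
//! #[derive(Clone, Debug, PartialEq)]
//! enum Tok {
//!     Consonant(char),
//!     Vowel(char),
//!     /// A consonant rewritten because a vowel follows it.
//!     ConsonantBeforeVowel(char),
//! }
//!
//! /// Hypothetical rule: adjust `current` in light of `next`.
//! fn finalize(current: &mut Tok, next: Option<&Tok>) {
//!     if let Tok::Consonant(c) = *current {
//!         if matches!(next, Some(Tok::Vowel(_))) {
//!             *current = Tok::ConsonantBeforeVowel(c);
//!         }
//!     }
//! }
//!
//! /// Iterator adapter that finalizes each token against its successor.
//! struct Finalizing<I: Iterator<Item = Tok>> {
//!     inner: Peekable<I>,
//! }
//!
//! impl<I: Iterator<Item = Tok>> Iterator for Finalizing<I> {
//!     type Item = Tok;
//!
//!     fn next(&mut self) -> Option<Tok> {
//!         let mut current = self.inner.next()?;
//!         // Peeking produces the following token before this one is yielded.
//!         finalize(&mut current, self.inner.peek());
//!         Some(current)
//!     }
//! }
//!
//! /// Run the adapter over a whole sequence of tokens.
//! fn finalize_all(tokens: Vec<Tok>) -> Vec<Tok> {
//!     let iter = Finalizing { inner: tokens.into_iter().peekable() };
//!     iter.collect()
//! }
//! ```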
//!
//! ## TokenIter / Transcriber
//!
//! The second level of iteration is the [`TokenIter`]. This iterator can wrap
//! any other iterator that produces [`Token`]s, and its purpose is to apply
//! contextual rules and transformations specified at runtime. This is what
//! allows the executable transcriber to take CLI options that change rules,
//! such as the treatment of "long" tehta variants.
//!
//! A `TokenIter` that wraps a [`Tokenizer`](mode::Tokenizer) can also be called
//! a [`Transcriber`] for simplicity, because it is known that its `Token`s
//! are being produced directly from text.
//!
//! ## Policy
//!
//! A "Policy" is similar to a Mode, but rather than defining details about
//! **orthography**, it instead defines details about **typography**. This
//! includes details such as valid ligatures and placements of *Sa-Rinci*.
//!
//! The [`Policy`](policy::Policy) trait is provided for this purpose, and is
//! used as a generic parameter for the [`Glyph`] type. Because of this, it
//! is also a generic parameter for the [`Token`] and [`TokenIter`] types;
//! the [`Tokenizer`](mode::Tokenizer) type is considered to be out of scope
//! of the Policy system, and simply yields all of its `Token`s with the
//! default policy ([`policy::Standard`]).

#[macro_use]
extern crate cfg_if;
#[macro_use]
extern crate clap;
#[macro_use]
#[cfg(feature = "serde")]
extern crate serde;

// mod macros;

pub mod characters;
pub mod mode;
pub mod policy;

mod iter;
mod token;

pub use characters::{Glyph, Numeral, VowelStyle};
pub use iter::{TokenIter, Transcriber, TranscriberSettings};
pub use mode::{Beleriand, Gondor, Quenya, TengwarMode};
pub use token::Token;


/// Convert a compatible object (typically text) into the Tengwar.
///
/// This function merely calls a Trait method, but is likely the most readily
/// discoverable part of the library when using code completion tools.
pub fn transcribe<M: TengwarMode + Default>(text: impl ToTengwar) -> String {
    text.to_tengwar::<M>()
}


/// A very small trait serving to implement ergonomic transcription methods
/// directly onto text objects.
pub trait ToTengwar {
    /// Create a [`Transcriber`] to iteratively transcribe this text into the
    /// Tengwar. The returned iterator will yield [`Token`]s.
    ///
    /// # Example
    /// ```
    /// use tengwar::{Quenya, ToTengwar, VowelStyle};
    ///
    /// const INPUT: &str = "lotsë súva"; // "a flower is sinking"
    ///
    ///
    /// // Collect directly with default settings.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// // Use Unique Tehtar.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.vowels = VowelStyle::Unique;
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// // Use Nuquernë Tengwar.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.nuquerna = true;
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// // Use Unique Tehtar and Nuquernë Tengwar.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.nuquerna = true;
    /// ts.settings.vowels = VowelStyle::Unique;
    /// assert_eq!(ts.into_string(), " ");
    ///
    ///
    /// // Use several options.
    /// let mut ts = INPUT.transcriber::<Quenya>();
    /// ts.settings.alt_a = true;
    /// ts.settings.alt_rince = true;
    /// ts.settings.nuquerna = true;
    /// ts.settings.vowels = VowelStyle::Separate;
    /// assert_eq!(ts.into_string(), " ");
    /// ```
    fn transcriber<M: TengwarMode + Default>(&self) -> Transcriber<M>;

    /// Transcribe this object into the Tengwar directly.
    ///
    /// # Example
    /// ```
    /// use tengwar::{Quenya, ToTengwar};
    ///
    /// let text: String = "namárië !".to_tengwar::<Quenya>();
    /// assert_eq!(text, " ");
    /// ```
    fn to_tengwar<M: TengwarMode + Default>(&self) -> String {
        self.transcriber::<M>().into_string()
    }

    /// Transcribe this object into the Tengwar, using [`TranscriberSettings`]
    /// provided as an argument. This makes it easy to reuse the same settings
    /// across several calls.
    ///
    /// For examples of the available settings, see the documentation of
    /// [`Self::transcriber`].
    ///
    /// # Example
    /// ```
    /// use tengwar::{Quenya, ToTengwar, TranscriberSettings};
    ///
    /// let mut settings = TranscriberSettings::new();
    /// settings.alt_a = true;
    /// settings.nuquerna = true;
    ///
    /// let text: String = "namárië !".to_tengwar_with::<Quenya>(settings);
    /// assert_eq!(text, " ");
    ///
    /// let text: String = "lotsë súva".to_tengwar_with::<Quenya>(settings);
    /// assert_eq!(text, " ");
    /// ```
    fn to_tengwar_with<M>(&self, settings: TranscriberSettings) -> String
        where M: TengwarMode + Default
    {
        self.transcriber::<M>().with_settings(settings).into_string()
    }
}

impl<S: AsRef<str>> ToTengwar for S {
    fn transcriber<M: TengwarMode + Default>(&self) -> Transcriber<M> {
        mode::Tokenizer::from_str(self).into_transcriber()
    }
}
296}