1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
/*! This crate provides a way to access glyphs from TeX fonts. It is intended to be used by
crates using [`tex_engine`](https://crates.io/crates/tex_engine).
TeX deals with fonts by parsing *font metric files* (`.tfm` files), which contain information
about the dimensions of each glyph in the font. So from the point of view of (the core of) TeX,
a *glyph* is just an index $0 \leq i \leq 255$ into the font metric file.
In order to find out what the glyph actually looks like, we want to ideally know the corresponding
unicode codepoint. This crate attempts to do exactly that.
# Usage
This crate attempts to associate a tex font (identified by the file name stem of its `.tfm` file) with:
1. A list of [`FontModifier`](fontstyles::FontModifier)s (e.g. bold, italic, sans-serif, etc.)
2. A [`GlyphList`], being an array `[`[`Glyph`]`;256]`
A [`Glyph`] then is either undefined (i.e. the glyph is not present in the font, or the crate couldn't
figure out what exactly it is) or presentable as a string.
Consider e.g. `\mathbf{\mathit{\Gamma^\kappa_\ell}}` (i.e. $\mathbf{\mathit{\Gamma^\kappa_\ell}}$).
From the point of view of TeX, this is a sequence of 3 glyphs, represented as indices into the font
`cmmib10`, namely 0, 20, and 96.
Here's how to use this crate to obtain the corresponding unicode characters, i.e. `𝜞`, `𝜿` and `ℓ`:
### Instantiation
First, we instantiate a [`FontInfoStore`](encodings::FontInfoStore) with a function that
allows it to find files. This function should take a string (e.g. `cmmib10.tfm`) and return a string
(e.g. `/usr/share/texmf-dist/fonts/tfm/public/cm/cmmib10.tfm`). This could be done by calling `kpsewhich`
for example, but repeated and frequent calls to `kpsewhich` are slow, so more efficient alternatives
are recommended.
```no_run
use tex_glyphs::encodings::FontInfoStore;
let mut store = FontInfoStore::new(|s| {
std::str::from_utf8(std::process::Command::new("kpsewhich")
.args(vec!(s)).output().expect("kpsewhich not found!")
.stdout.as_slice()).unwrap().trim().to_string()
});
```
This store will now use the provided function to find your `pdftex.map` file, which lists
all the fonts that are available to TeX and associates them with `.enc`, `.pfa` and `.pfb` files.
### Obtaining Glyphs
If we now query the store for the [`GlyphList`] of some font, e.g. `cmmib10`, like so:
```no_run
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
# std::str::from_utf8(std::process::Command::new("kpsewhich")
# .args(vec!(s)).output().expect("kpsewhich not found!")
# .stdout.as_slice()).unwrap().trim().to_string()
# });
let ls = store.get_glyphlist("cmmib10");
```
...it will attempt to parse the `.enc` file associated with `cmmib10`, if existent. If not, or if this
fails, it will try to parse the `.pfa` or `.pfb` file. If neither works, it will search for a `.vf` file
and try to parse that. If that too fails, it will return an empty [`GlyphList`].
From either of those three sources, it will then attempt to associate each byte index with a
[`Glyph`]:
```no_run
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
# std::str::from_utf8(std::process::Command::new("kpsewhich")
# .args(vec!(s)).output().expect("kpsewhich not found!")
# .stdout.as_slice()).unwrap().trim().to_string()
# });
# let ls = store.get_glyphlist("cmmib10");
let zero = ls.get(0);
let twenty = ls.get(20);
let ninety_six = ls.get(96);
println!("0={}={}, 20={}={}, and 96={}={}",
zero.name(),zero,
twenty.name(),twenty,
ninety_six.name(),ninety_six
);
```
```text
0=Gamma=Γ, 20=kappa=κ, and 96=lscript=ℓ
```
### Font Modifiers
So far, so good - but the glyphs are not bold or italic, but in `cmmib10`, they are.
So let's check out what properties `cmmib10` has:
```
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
# std::str::from_utf8(std::process::Command::new("kpsewhich")
# .args(vec!(s)).output().expect("kpsewhich not found!")
# .stdout.as_slice()).unwrap().trim().to_string()
# });
let font_info = store.get_info("cmmib10").unwrap();
println!("{:?}",font_info.styles);
println!("{:?}",font_info.weblink);
```
```text
ModifierSeq { blackboard: false, fraktur: false, script: false, bold: true, capitals: false, monospaced: false, italic: true, oblique: false, sans_serif: false }
Some(("Latin Modern Math", "https://fonts.cdnfonts.com/css/latin-modern-math"))
```
...so this tells us that the font is bold and italic, but not sans-serif, monospaced, etc.
Also, it tells us that the publically available web-compatible quivalent
of this font is called "Latin Modern Math" and that we can find it at the provided
URL, if we want to use it in e.g. HTML :)
Now we only need to apply the modifiers to the glyphs:
```
# use tex_glyphs::encodings::FontInfoStore;
# let mut store = FontInfoStore::new(|s| {
# std::str::from_utf8(std::process::Command::new("kpsewhich")
# .args(vec!(s)).output().expect("kpsewhich not found!")
# .stdout.as_slice()).unwrap().trim().to_string()
# });
# let ls = store.get_glyphlist("cmmib10");
# let zero = ls.get(0);
# let twenty = ls.get(20);
# let ninety_six = ls.get(96);
# let font_info = store.get_info("cmmib10").unwrap();
use tex_glyphs::fontstyles::FontModifiable;
println!("{}, {}, and {}",
zero.to_string().apply(font_info.styles),
twenty.to_string().apply(font_info.styles),
ninety_six.to_string().apply(font_info.styles)
);
```
```text
𝜞, 𝜿, and ℓ
```
The [`apply`](fontstyles::FontModifiable::apply)-method stems
from the trait [`FontModifiable`](fontstyles::FontModifiable), which is implemented
for any type that implements `AsRef<str>`, including `&str` and `String`.
It also provides more direct methods, e.g. [`make_bold`](fontstyles::FontModifiable::make_bold),
[`make_italic`](fontstyles::FontModifiable::make_italic), [`make_sans`](fontstyles::FontModifiable::make_sans), etc.
# Fixing Mistakes
The procedure above for determining glyphs and font modifiers is certainly not perfect; not just
because `enc` and `pfa`/`pfb` files might contain wrong or unknown glyph names, but also because
font modifiers are determined heuristically. For that reason, we provide a way to fix mistakes:
1. The map from glyphnames to unicode is stored in the file [glyphmap.txt](https://github.com/Jazzpirate/RusTeX/blob/main/tex-glyphs/src/resources/glyphmap.txt)
2. Font modifiers, web font names and links, or even full glyph lists can be added
to the markdown file [patches.md](https://github.com/Jazzpirate/RusTeX/blob/main/tex-glyphs/src/resources/patches.md),
which additionally serves as a how-to guide for patching any mistakes you might find.
Both files are parsed *during compilation*.
If you notice any mistakes, feel free to open a pull request for these files.
*/
#![allow(text_direction_codepoint_in_literal)]
#![warn(missing_docs)]
pub mod fontstyles;
pub mod encodings;
mod parsing;
pub mod glyphs;
use crate::glyphs::{Glyph,GlyphList};
include!(concat!(env!("OUT_DIR"), "/codegen.rs"));
#[cfg(test)]
mod tests {
use crate::encodings::FontInfoStore;
use super::*;
use super::fontstyles::{FontModifiable, FontModifier};
#[test]
fn test_glyphmap() {
assert_eq!(Glyph::get("AEacute").to_string(), "Ǽ");
assert_eq!(Glyph::get("contourintegral").to_string(), "∮");
assert_eq!(Glyph::get("bulletinverse").to_string(), "◘");
assert_eq!(Glyph::get("Gangiacoptic").to_string(), "Ϫ");
assert_eq!(Glyph::get("zukatakana").to_string(), "ズ");
assert_eq!("test".make_bold().to_string(), "𝐭𝐞𝐬𝐭");
assert_eq!("test".make_bold().make_sans().to_string(), "𝘁𝗲𝘀𝘁");
assert_eq!("test".apply_modifiers(&[FontModifier::SansSerif,FontModifier::Bold]).to_string(), "𝘁𝗲𝘀𝘁");
}
fn get_store() -> FontInfoStore<String,fn(&str) -> String> {
FontInfoStore::new(|s| {
std::str::from_utf8(std::process::Command::new("kpsewhich")
.args(vec!(s)).output().expect("kpsewhich not found!")
.stdout.as_slice()).unwrap().trim().to_string()
})
}
#[test]
fn test_encodings() {
let mut es = get_store();
let names = es.all_encs().take(50).map(|e| e.tfm_name.clone()).collect::<Vec<_>>();
for n in names { es.get_glyphlist(n); }
}
#[test]
fn print_table() {
env_logger::builder().filter_level(log::LevelFilter::Debug).try_init().unwrap();
let mut es = get_store();
log::info!("cmr10:\n{}",es.display_encoding("cmr10").unwrap());
log::info!("cmbx10:\n{}",es.display_encoding("cmbx10").unwrap());
/*
log::info!("ptmr7t:\n{}",es.display_encoding("ptmr7t").unwrap());
log::info!("ecrm1095:\n{}",es.display_encoding("ecrm1095").unwrap());
log::info!("ec-lmr10:\n{}",es.display_encoding("ec-lmr10").unwrap());
log::info!("jkpbitc:\n{}",es.display_encoding("jkpbitc").unwrap());
log::info!("ot1-stix2textsc:\n{}",es.display_encoding("ot1-stix2textsc").unwrap());
log::info!("stix-mathbbit-bold:\n{}",es.display_encoding("stix-mathbbit-bold").unwrap());
log::info!("MnSymbolE10:\n{}",es.display_encoding("MnSymbolE10").unwrap());
*/
}
/*
#[test]
fn vfs() {
env_logger::builder().filter_level(log::LevelFilter::Debug).try_init().unwrap();
use tex_engine::engine::filesystem::kpathsea::*;
let mut store = encodings::EncodingStore::new(|s| {
match KPATHSEA.which(s).map(|s| s.to_str().map(|s| s.to_string())).flatten() {
Some(s) => s,
_ => "".into()
}
});
let vfs = &KPATHSEA.post.clone();
for v in vfs.values() {
match v.extension() {
Some(e) if e == "vf" => {
let name = v.file_stem().unwrap().to_str().unwrap();
log::info!("{}",v.display());
match store.display_encoding(name) {
Some(s) => log::info!("{}",s),
None => log::info!("Failed!")
}
print!("");
}
_ => ()
}
}
}
*/
}