//! This crate exposes the [HarfBuzz](https://github.com/harfbuzz/harfbuzz) API for subsetting fonts.
//!
//! # What is subsetting?
//! From the HarfBuzz documentation:
//! > Subsetting reduces the codepoint coverage of font files and removes all data that is no longer needed. A subset
//! > input describes the desired subset. The input is provided along with a font to the subsetting operation. Output is
//! > a new font file containing only the data specified in the input.
//! >
//! > Currently most outline and bitmap tables are supported: glyf, CFF, CFF2, sbix, COLR, and CBDT/CBLC. This also
//! > includes fonts with variable outlines via OpenType variations. Notably EBDT/EBLC and SVG are not supported. Layout
//! > subsetting is supported only for OpenType Layout tables (GSUB, GPOS, GDEF). Notably subsetting of graphite or AAT
//! > tables is not yet supported.
//! >
//! > Fonts with graphite or AAT tables may still be subsetted but will likely need to use the retain glyph ids option
//! > and configure the subset to pass through the layout tables untouched.
//!
//! In other words, subsetting allows you to take a large font and construct a new, smaller font which has only those
//! characters that you need. Be sure to check the license of the font though, as not all fonts can be legally
//! subsetted.
//!
//! # Why?
//! Many modern fonts contain hundreds or even thousands of glyphs, of which only a few dozen or perhaps a hundred are
//! needed in any single document. This also means that modern fonts can be very bulky compared to what is actually
//! needed. The solution to this is font subsetting: we can construct a font that includes only those glyphs and
//! features that are needed for the document.
//!
//! # Usage
//! The simplest way to construct a subset of a font is to use the [`subset()`] function. In the following example, we
//! keep only the glyphs that are needed to show any combination of the characters 'a', 'b' and 'c', e.g. "abc" and
//! "cabba" can be rendered, but "foobar" cannot:
//! ```
//! # use std::fs;
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! let font = fs::read("tests/fonts/NotoSans.ttf")?;
//! let subset_font = hb_subset::subset(&font, "abc".chars())?;
//! fs::write("tests/fonts/subset.ttf", subset_font)?;
//! # Ok(())
//! # }
//! ```
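/*!
In practice the character set usually comes from the text that will be rendered. As a minimal sketch, the helper
below collects the distinct characters of a string using only the standard library. `used_characters` is a
hypothetical name and not part of this crate, and skipping control characters is an assumption of this example:

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not part of this crate): collects the distinct
/// characters of `text`, skipping control characters such as newlines.
fn used_characters(text: &str) -> BTreeSet<char> {
    text.chars().filter(|c| !c.is_control()).collect()
}

fn main() {
    let chars = used_characters("cabba\nabc");
    // Only three distinct characters remain, in sorted order.
    assert_eq!(chars.into_iter().collect::<String>(), "abc");
}
```

Because `BTreeSet<char>` implements `IntoIterator<Item = char>`, the resulting set can be passed as the `characters`
argument of [`subset()`].
*/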
//!
//! To get more control over how the font is subset and what gets included, you can use the lower-level API directly:
//! ```
//! # use hb_subset::*;
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! // Load font directly from a file
//! let font = Blob::from_file("tests/fonts/NotoSans.ttf")?;
//! let font = FontFace::new(font)?;
//!
//! // Construct a subset manually and include only some of the letters
//! let mut subset = SubsetInput::new()?;
//! subset.unicode_set().insert('f');
//! subset.unicode_set().insert('i');
//!
//! // Subset the font using the just-constructed subset input
//! let new_font = subset.subset_font(&font)?;
//!
//! // Extract the raw font and write to an output file
//! std::fs::write("tests/fonts/subset.ttf", &*new_font.underlying_blob())?;
//! # Ok(())
//! # }
//! ```
//!
//! # Using bundled version of HarfBuzz
//! By default, this crate uses the system HarfBuzz installation. If it is not available, or it is too old, this crate
//! can also use a bundled copy of HarfBuzz via the `bundled` feature:
//! ```bash
//! cargo add hb-subset --features bundled
//! ```

#![warn(missing_docs)]

mod blob;
mod common;
mod error;
mod font_face;
mod map;
mod set;
mod subset;

pub mod sys;

pub use blob::*;
pub use common::*;
pub use error::*;
pub use font_face::*;
pub use map::*;
pub use set::*;
pub use subset::*;

/// A convenience function to create a subset of a font covering the given characters.
///
/// The returned font can be used wherever the original font was used, as long as the string contains only
/// characters from the given set. In particular, the font includes all relevant ligatures.
pub fn subset(
    font: &[u8],
    characters: impl IntoIterator<Item = char>,
) -> Result<Vec<u8>, SubsettingError> {
    // Add all characters to subset, and nothing more.
    let mut subset = SubsetInput::new().map_err(|_| SubsettingError)?;
    let mut unicode_set = subset.unicode_set();
    for c in characters {
        unicode_set.insert(c);
    }

    // Load the original font, and then construct a subset from it
    let font = FontFace::new(Blob::from_bytes(font).map_err(|_| SubsettingError)?)
        .map_err(|_| SubsettingError)?;
    let new_font = subset.subset_font(&font)?;
    let new_font = new_font.underlying_blob().to_vec();
    Ok(new_font)
}

#[cfg(test)]
mod tests {
    /// Path for Noto Sans font.
    pub(crate) const NOTO_SANS: &str = "tests/fonts/NotoSans.ttf";
    /// Path for variable version of Noto Sans font.
    pub(crate) const NOTO_SANS_VARIABLE: &str = "tests/fonts/NotoSans-Variable.ttf";
}