Crate iftree

Include many files in your Rust code for self-contained binaries.

Motivation

Self-contained binaries are easy to ship, as they come with any required file data such as game assets, web templates, etc.

The standard library’s std::include_str! includes the contents of a given file. Iftree generalizes this in two ways:

  • Not just one, but many files can be included at once with path patterns in a .gitignore-like format. Patterns are flexible: you can include multiple folders, skip hidden files, filter by filename extension, select a fixed file list, etc.
  • Instead of including the file contents only, files can be associated with any data fields such as additional file metadata.

Conceptually:

std:       include_str!("my_file")
Iftree:    any_macro!("my_files/**")

Refer to related work to see Iftree in the context of other, similar projects.

Introduction

Here is a minimal example that shows the basic functionality.

// Say you have the following files:
//
//     my_assets/
//     ├── file_a
//     ├── file_b
//     └── folder/
//         └── file_c

// To include these files in your code, the macro `iftree::include_file_tree` is
// attached to a custom type like this:
#[iftree::include_file_tree("paths = '/my_assets/**'")]
pub struct MyAsset {
    contents_str: &'static str,
}
// Above we configure a path pattern to filter the files in `my_assets/` and its
// subfolders. For each selected file, an instance of `MyAsset` is initialized.
// The standard field `contents_str` is automatically populated with a call to
// `include_str!`, but you can plug in your own initializer.

fn main() {
    // Based on this, Iftree generates an array `ASSETS` with the desired file
    // data. You can use it like so:
    assert_eq!(ASSETS.len(), 3);
    assert_eq!(ASSETS[0].contents_str, "… contents file_a\n");
    assert_eq!(ASSETS[1].contents_str, "… contents file_b\n");
    assert_eq!(ASSETS[2].contents_str, "… file_c\n");

    // Also, variables `base::x::y::MY_FILE` are generated (named by file path):
    assert_eq!(base::my_assets::FILE_A.contents_str, "… contents file_a\n");
    assert_eq!(base::my_assets::FILE_B.contents_str, "… contents file_b\n");
    assert_eq!(base::my_assets::folder::FILE_C.contents_str, "… file_c\n");
}

Usage

Now that you have a general idea of the library, learn how to integrate it with your project.

Getting started

  1. Add the dependency iftree = "1.0" to your manifest (Cargo.toml).

  2. Define your asset type (MyAsset in the introduction).

    This is a struct with the fields you need per file. Alternatively, it can be a type alias, which may be convenient if you have exactly one field.

  3. Next, filter files to be included by annotating your asset type with #[iftree::include_file_tree("paths = '/my/assets/**'")].

    The macro argument is a TOML string literal. Its paths option here supports .gitignore-like path patterns, with one pattern per line. These paths are relative to the folder with your manifest by default. See the paths configuration for more.

  4. When building your project, code is generated that uses an initializer to instantiate the asset type once per file.

    By default, a field contents_str (if any) is populated with include_str!, a field contents_bytes is populated with include_bytes!, and a couple of other standard fields are recognized. However, you can plug in your own macro to fully customize the initialization by configuring an initializer. For even more control over code generation, there is the concept of visitors.

  5. Now you can access your included file data via the ASSETS array or via the base::my::assets::MY_FILE variables.

Examples

If you like to explore by example, there is an examples folder. The documentation links to individual examples where helpful.

You could get started with the introductory example. For a more complex case, see the showcase example.

Note that some examples need extra dependencies from the dev-dependencies of the manifest.

Standard fields

When your asset type uses only a subset of the following fields, an initializer for it is generated without further configuration. You can still override these field names with a custom initializer.

  • contents_bytes: &'static [u8]

    File contents as a byte array, using std::include_bytes.

  • contents_str: &'static str

    File contents interpreted as a UTF-8 string, using std::include_str.

  • get_bytes: fn() -> std::borrow::Cow<'static, [u8]>

    In debug builds (that is, when debug_assertions is enabled), this function reads the file afresh on each call at runtime. It panics on any error, for example if the file does not exist. This speeds up development, as it avoids rebuilding when only asset file contents change (note that you still need to rebuild if assets are added, renamed, or removed).

    In release builds, it returns the file contents included at compile time, using std::include_bytes.

  • get_str: fn() -> std::borrow::Cow<'static, str>

    Same as get_bytes but for the file contents interpreted as a UTF-8 string, using std::include_str.

  • relative_path: &'static str

    File path relative to the base folder, which is the folder with your manifest (Cargo.toml) by default. Path components are separated by a slash /, independent of your platform.

See example.
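
The debug/release split described for get_bytes can be pictured with the following minimal sketch. It is not the code that Iftree generates: load_bytes is a hypothetical helper, and the embedded data is passed in as a parameter so that the snippet is self-contained (in real usage it would come from std::include_bytes).

```rust
use std::borrow::Cow;

/// Hypothetical helper mirroring the described `get_bytes` behavior for one file.
fn load_bytes(path: &str, embedded: &'static [u8]) -> Cow<'static, [u8]> {
    if cfg!(debug_assertions) {
        // Debug build: read the file afresh on each call, panicking on any error.
        let bytes = std::fs::read(path)
            .unwrap_or_else(|error| panic!("failed to read {}: {}", path, error));
        Cow::Owned(bytes)
    } else {
        // Release build: return the data that was embedded at compile time.
        Cow::Borrowed(embedded)
    }
}
```

Returning a Cow lets both branches share one signature: the debug branch owns freshly read data, while the release branch borrows static data without copying.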

Name sanitization

When generating identifiers based on paths, names are sanitized. For example, a filename 404_not_found.md is sanitized to an identifier _404_NOT_FOUND_MD.

The sanitization process is designed to generate valid Unicode identifiers. Essentially, it replaces characters that are invalid in identifiers with underscores "_".

More precisely, these transformations are applied in order:

  1. The case of letters is adjusted to respect naming conventions:
    • All lowercase for folders (because they map to module names).
    • All uppercase for filenames (because they map to static variables).
  2. Characters without the property XID_Continue are replaced by "_". The set of XID_Continue characters in ASCII is [0-9A-Z_a-z].
  3. If the first character does not belong to XID_Start and is not "_", then "_" is prepended. The set of XID_Start characters in ASCII is [A-Za-z].
  4. If the name is "_", "crate", "self", "Self", or "super", then "_" is appended.

Note that non-ASCII identifiers are only supported from Rust 1.53.0. For earlier versions, the sanitization here may generate invalid identifiers if you use non-ASCII paths, in which case you need to manually rename any affected files.
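
The four transformation steps can be approximated as follows. This is a sketch under a simplifying assumption, not the crate's implementation: it only handles ASCII, using the character sets [0-9A-Z_a-z] and [A-Za-z] quoted above in place of the full XID_Continue and XID_Start properties, so any non-ASCII character is replaced by "_". The names Kind and sanitize are hypothetical.

```rust
/// Whether a path component names a folder (module) or a file (static variable).
enum Kind {
    Folder,
    File,
}

/// ASCII-only approximation of the four sanitization steps.
fn sanitize(name: &str, kind: Kind) -> String {
    // Step 1: adjust the letter case to follow naming conventions.
    let cased = match kind {
        Kind::Folder => name.to_lowercase(),
        Kind::File => name.to_uppercase(),
    };

    // Step 2: replace characters outside [0-9A-Z_a-z] by "_".
    let mut result: String = cased
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();

    // Step 3: prepend "_" unless the first character is in [A-Za-z] or is "_".
    let starts_validly = result
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
    if !starts_validly {
        result.insert(0, '_');
    }

    // Step 4: append "_" to names that cannot be used as plain identifiers.
    if matches!(result.as_str(), "_" | "crate" | "self" | "Self" | "super") {
        result.push('_');
    }
    result
}
```

With this sketch, sanitize("404_not_found.md", Kind::File) yields _404_NOT_FOUND_MD, matching the example at the start of this section.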

Portable file paths

To prevent issues when developing on different platforms, your file paths should follow these recommendations:

  • Path components are separated by a slash / (even on Windows).
  • Filenames do not contain backslashes \ (even on Unix-like systems).

Troubleshooting

To inspect the generated code, there is a debug configuration.

Recipes

Here are example solutions for given problems.

Kinds of asset types

Integration with other libraries

Including file metadata

Custom constructions

Related work

I originally worked on Iftree because I couldn’t find a library for this use case: including files from a folder filtered by filename extension. The project has since developed into something more flexible.

Here is how I think Iftree compares to related projects for the given criteria.

Project | File selection | Included file data | Data access via
------- | -------------- | ------------------ | ---------------
include_dir 0.6 | Single folder | Path, contents | File path, nested iterators, glob patterns
includedir 0.6 | Multiple files, multiple folders | Path, contents | File path, iterator
Rust Embed 5.9 | Single folder | Path, contents | File path, iterator
std::include_bytes | Single file | Contents | File path
std::include_str | Single file | Contents | File path
Iftree | Multiple files by inclusion-exclusion path patterns | Path, contents, custom | File path (via base::x::y::MY_FILE variables in constant time), iterator (ASSETS array), custom

Generally, while Iftree has defaults to address common use cases, it can be customized to support more specific use cases, too (see recipes for examples).

Configuration reference

The iftree::include_file_tree macro is configured via a TOML string with the following fields.

base_folder

Path patterns are interpreted as relative to this folder.

If this path itself is relative, then it is joined to the folder given by the environment variable CARGO_MANIFEST_DIR. That is, a relative path x/y/z has a full path [CARGO_MANIFEST_DIR]/[base_folder]/x/y/z.

Default: ""

See example.
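
As a rough illustration of how the full path is composed, here is a sketch that uses std::path::Path::join. The function resolve and its parameters are hypothetical; the manifest folder is passed in as an argument rather than read from CARGO_MANIFEST_DIR so that the snippet runs on its own. Whether the crate resolves paths in exactly this way is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Composes `[CARGO_MANIFEST_DIR]/[base_folder]/x/y/z` as described above.
fn resolve(manifest_dir: &str, base_folder: &str, relative_path: &str) -> PathBuf {
    // `Path::join` appends a relative `base_folder` to `manifest_dir`, whereas
    // an absolute `base_folder` replaces it.
    Path::new(manifest_dir).join(base_folder).join(relative_path)
}
```

For example, resolve("/project", "assets", "x/y/z") gives /project/assets/x/y/z, and the default base_folder of "" gives /project/x/y/z.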

debug

Whether to generate a string variable DEBUG with debug information such as the generated code.

Default: false

See example.

paths

A string with a path pattern per line to filter files.

It works like a .gitignore file with inverted meaning:

  • If the last matching pattern is negated (with !), the file is excluded.
  • If the last matching pattern is not negated, the file is included.
  • If no pattern matches, the file is excluded.

The pattern language is as documented in the .gitignore reference, with this difference: you must use x/y/* instead of x/y/ to include files in a folder x/y/; to also include subfolders (recursively), use x/y/**.

Exclude hidden files with !.* as a pattern. Another common pattern is of the form *.xyz to include files with filename extension xyz only.

By default, path patterns are relative to the environment variable CARGO_MANIFEST_DIR, which is the folder with your manifest (Cargo.toml). See the base_folder configuration to customize this.

This is a required option without default.

See example.
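
The three selection rules (the last matching pattern decides, and no match means exclusion) can be modeled with the sketch below. It does not implement the .gitignore pattern language: each pattern carries a plain predicate function as a hypothetical stand-in for real pattern matching, and all names (Pattern, is_included, in_my_assets, is_hidden_file, example_patterns) are made up for illustration.

```rust
/// One line of a `paths` configuration, reduced to what matters here.
struct Pattern {
    /// True if the line starts with `!`.
    negated: bool,
    /// Hypothetical stand-in for real `.gitignore`-style matching.
    matches: fn(&str) -> bool,
}

/// Applies the rule "the last matching pattern decides; no match means excluded".
fn is_included(patterns: &[Pattern], path: &str) -> bool {
    patterns
        .iter()
        .rev()
        .find(|pattern| (pattern.matches)(path))
        .map_or(false, |pattern| !pattern.negated)
}

/// Stand-in for the pattern `/my_assets/**`.
fn in_my_assets(path: &str) -> bool {
    path.starts_with("my_assets/")
}

/// Stand-in for the pattern `.*`, checking the filename only.
fn is_hidden_file(path: &str) -> bool {
    path.rsplit('/').next().map_or(false, |name| name.starts_with('.'))
}

/// Emulates `paths` set to the two lines `/my_assets/**` and `!.*`.
fn example_patterns() -> [Pattern; 2] {
    [
        Pattern { negated: false, matches: in_my_assets },
        Pattern { negated: true, matches: is_hidden_file },
    ]
}
```

Under these stand-ins, my_assets/file_a is included, my_assets/.hidden is excluded by the later negated pattern, and other/file is excluded because no pattern matches it.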

root_folder_variable

The name of the environment variable to use as the root folder for the base_folder configuration.

The value of the environment variable should be an absolute path.

Default: "CARGO_MANIFEST_DIR"

template.identifiers

Whether to generate an identifier per file.

Given a file x/y/my_file, a static variable base::x::y::MY_FILE is generated, nested in modules for folders. Their root module is base, which represents the base folder.

Each variable is a reference to the corresponding element in the ASSETS array.

Generated identifiers are subject to name sanitization. Because of this, there may be collisions in the generated code, causing an error about a name being defined multiple times. The code generation does not try to resolve such collisions automatically, as this would likely cause confusion about which identifier refers to which file. Instead, you need to manually rename any affected paths (assuming you need the generated identifiers at all – otherwise, you can just disable this with template.identifiers = false).

Default: true

See example.

template.initializer

A macro name used to instantiate the asset type per file.

As inputs, the macro is passed the following arguments, separated by comma:

  1. Relative file path as a string literal. Path components are separated by /.
  2. Absolute file path as a string literal.

As an output, the macro must return a constant expression.

Default: A default initializer is constructed by recognizing standard fields.

See example.
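
To make the macro contract concrete, here is a minimal sketch of a custom initializer. The names MyAsset, my_initialize, and EXAMPLE are hypothetical, and the macro is invoked by hand with made-up paths to stand in for the call that the generated code would make per file. In a real project, the macro would instead be selected with template.initializer = 'my_initialize' in the configuration.

```rust
pub struct MyAsset {
    relative_path: &'static str,
    absolute_path: &'static str,
}

// Custom initializer: it receives the relative and the absolute file path as
// string literals and expands to a constant expression.
macro_rules! my_initialize {
    ($relative_path:literal, $absolute_path:literal) => {
        MyAsset {
            relative_path: $relative_path,
            absolute_path: $absolute_path,
        }
    };
}

// Hand-written stand-in for one generated instance; the paths are made up.
pub static EXAMPLE: MyAsset =
    my_initialize!("my_assets/file_a", "/project/my_assets/file_a");
```

Because the expansion is a struct literal built from string literals only, it is a constant expression and can be used to initialize a static.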

template visitors

This is the most flexible customization of the code generation process.

Essentially, a visitor transforms the tree of selected files into code. It does so by calling custom macros at these levels:

  • For the base folder, a visit_base macro is called to wrap everything (top level).
  • For each folder, a visit_folder macro is called, wrapping the code generated from its files and subfolders (recursively).
  • For each file, a visit_file macro is called (bottom level).

These macros are passed the following inputs, separated by comma:

  • visit_base:
    1. Total number of selected files as a usize literal.
    2. Outputs of the visitor applied to the base folder entries, ordered by filename in Unicode code point order.
  • visit_folder:
    1. Folder name as a string literal.
    2. Sanitized folder name as an identifier.
    3. Outputs of the visitor applied to the folder entries, ordered by filename in Unicode code point order.
  • visit_file:
    1. Filename as a string literal.
    2. Sanitized filename as an identifier.
    3. Zero-based index of the file among the selected files as a usize literal.
    4. Relative file path as a string literal. Path components are separated by /.
    5. Absolute file path as a string literal.

The visit_folder macro is optional. If missing, the outputs of the visit_file calls are directly passed as an input to the visit_base call. This is useful to generate flat structures such as arrays. Similarly, the visit_base macro is optional.

You can configure multiple visitors. They are applied in order.

To plug in visitors, add this to your configuration for each visitor:

[[template]]
visit_base = 'visit_my_base'
visit_folder = 'visit_my_folder'
visit_file = 'visit_my_file'

visit_my_… are the names of your corresponding macros.

See the visitor examples in the examples folder.

Further resources

Attribute Macros

include_file_tree

See the module level documentation.