
Allows programmatically invoking parol from a build.rs script.

The process of invoking a grammar starts with a Builder and one of two output modes:

  1. Cargo build script output mode, via Builder::with_cargo_script_output (easiest)
  2. Explicitly specifying an output directory via Builder::with_explicit_output_dir

Cargo integration

If this API detects it is running inside a Cargo build.rs script, then it implicitly enables cargo integration.

This has Cargo automatically regenerate the parser sources whenever the grammar changes. This is done by implicitly outputting the appropriate rerun-if-changed=<grammar> instructions to Cargo.
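As a rough illustration of what the builder does for you here, the sketch below prints such an instruction by hand. It is not parol's implementation: the helper `rerun_if_changed` and the file name `grammar.parol` are hypothetical, and checking `OUT_DIR` (which Cargo sets for build scripts) is only an assumed stand-in for however parol actually detects Cargo.

```rust
use std::env;

/// Formats the instruction that asks Cargo to rerun the build script
/// whenever the file at `grammar` changes. (Hypothetical helper.)
fn rerun_if_changed(grammar: &str) -> String {
    format!("cargo:rerun-if-changed={}", grammar)
}

fn main() {
    // Assumption: Cargo sets OUT_DIR when it runs a build script, so its
    // presence is used here as a simple "running under Cargo" check.
    if env::var_os("OUT_DIR").is_some() {
        // Cargo reads build script instructions from standard output.
        println!("{}", rerun_if_changed("grammar.parol"));
    }
}
```

With Builder::with_cargo_script_output you should not need to write any of this yourself; the sketch only shows the kind of line that ends up on the build script's standard output.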

Defaults

When using Builder::with_cargo_script_output, a number of reasonable defaults are set:

By default, the output directory is set to the OUT_DIR environment variable. By default, the generated parser name is parser.rs and the generated grammar action file is `

You can then include the generated parser in your crate:

mod parser {
    include!(concat!(env!("OUT_DIR"), "/parser.rs"));
}

Tradeoffs

The disadvantage of using this mode (or of using Cargo build scripts in general) is that it adds the parol crate as an explicit build dependency.

Although this doesn’t increase the runtime binary size, it does increase the initial compile times. If someone just wants to cargo install <your crate>, Cargo will have to download and execute parol to generate your parser code.

Contributors to your project (who modify your grammar) will have to download and invoke parol anyway, so this cost primarily affects initial compile times. Cargo is also very intelligent about caching build script outputs, so it really only affects the initial build.

Despite the impact on initial compiles, this is somewhat traditional in the Rust community. It’s the recommended way to use bindgen and it’s the only way to use pest.

If you are really concerned about compile times, you can generate the parser ahead of time with the CLI (see “Using the CLI directly” below) to avoid invoking parol during the build.

Explicitly controlling Output Locations

If you want more control over the location of generated grammar files, you can invoke Builder::with_explicit_output_dir to explicitly set an output directory.

In addition, you must explicitly name your output parser and action files, or the configuration will give an error.

This is used to power the command line parol tool, and is useful for additional control.

Any configured output paths (including generated parsers, expanded grammars, etc.) are resolved relative to this base output directory using Path::join. This means that specifying an absolute path overrides the explicit base directory.
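The Path::join behavior described above can be checked with the standard library alone. In this minimal sketch, the helper `resolve_output` and the directory and file names are hypothetical, and the absolute-path case assumes a Unix-style path.

```rust
use std::path::{Path, PathBuf};

/// Resolves a configured output path against a base output directory,
/// the way `Path::join` does. (Hypothetical helper for illustration.)
fn resolve_output(base: &Path, configured: &str) -> PathBuf {
    base.join(configured)
}

fn main() {
    let base = Path::new("src/generated");

    // A relative path is appended to the base directory.
    println!("{}", resolve_output(base, "parser.rs").display());

    // An absolute path (Unix-style here) replaces the base entirely.
    println!("{}", resolve_output(base, "/tmp/parser.rs").display());
}
```

So a relative name such as `parser.rs` stays under the base directory, while an absolute path is used as-is.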

The grammar input file is resolved in the regular manner. It does not use the “output” directory.

Interaction with version control

When using Builder::with_cargo_script_output, the output is put in a subdir of the target directory and excluded from version control.

This is useful if you want to ignore changes in machine-generated code.

However, when specifying an explicit output directory (with Builder::with_explicit_output_dir), you would probably set the output to a sub-directory of src. This means that the generated files are version controlled, and you have to commit them whenever they change.

Using the CLI directly

Note that explicitly specifying the output directory doesn’t avoid running parol on cargo install.

It does not increase the initial build speed, and still requires compiling and invoking parol.

If you really want to avoid adding parol as a build dependency, you need to invoke the CLI manually to generate the parser sources ahead of time.

Using a build script requires adding a build dependency, and Cargo will unconditionally execute build scripts on first install. While Cargo’s build script caching is excellent, it only activates on recompiles.

As such, using the CLI manually is really the only way to improve (initial) compile times.

It is (often) not worth it, because it is inconvenient, and the impact only happens on initial compiles.

API Completeness

Anything you can do with the main parol executable, you should also be able to do with this API.

That is because the main executable is just a wrapper around the API.

However, a couple more advanced features use unstable/internal APIs (see below).

As a side note, the CLI does not require you to specify an output location. You can run parol -f grammar.parol just fine and it will generate no output.

In build scripts, this is typically a mistake (so it errors by default). If you want to disable this sanity check, use Builder::disable_output_sanity_checks.

Internal APIs

The main parol command needs a couple of features that do not fit nicely into this API (or interact closely with the crate’s internals).

Because of that, there are a number of APIs explicitly marked as unstable or internal. Some of these are public and some are private.

Expect breaking changes both before and after 1.0 (but especially before).

Structs

Builds the configuration for generating and analyzing parol grammars.
Represents in-process grammar generation.

Enums

An error that occurs configuring the Builder.
Marks an intermediate stage of the grammar, in between the various transformations that parol does.

Constants

The default maximum lookahead
The default name of the generated grammar module.
The default name of the user type that implements grammar parsing.

Traits

A build listener, for advanced customization of the parser generation.