// parol/build.rs
//! Allows programmatically invoking parol from a `build.rs` script
//!
//! The process of generating a parser starts with a [`struct@Builder`] and one of two output modes:
//! 1. Cargo build script output mode, via [Builder::with_cargo_script_output] (easiest)
//! 2. Explicitly specifying an output directory via [Builder::with_explicit_output_dir]
//!
//! ## Cargo integration
//! If this API detects it is running inside a
//! [Cargo `build.rs` script](https://doc.rust-lang.org/stable/cargo/reference/build-scripts.html),
//! then it implicitly enables cargo integration.
//!
//! This has Cargo *automatically* regenerate the parser sources whenever the grammar changes. This
//! is done by implicitly outputting the appropriate
//! [`rerun-if-changed=<grammar>`](https://doc.rust-lang.org/stable/cargo/reference/build-scripts.html#change-detection)
//! instructions to Cargo.
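/*!
The detection and change tracking described above can be sketched in isolation. This is only an
illustration, not the crate's implementation: `looks_like_build_script` and `rerun_if_changed`
are hypothetical names, and the environment values are passed in as parameters (instead of being
read from the process environment) so that the logic is easy to follow.

```rust
use std::path::Path;

/// Hypothetical helper: treat the process as a Cargo build script only if
/// both `OUT_DIR` and `CARGO_MANIFEST_DIR` are present.
fn looks_like_build_script(out_dir: Option<&str>, manifest_dir: Option<&str>) -> bool {
    out_dir.is_some() && manifest_dir.is_some()
}

/// Hypothetical helper: the instruction line a build script prints so that
/// Cargo reruns it whenever the grammar file changes.
fn rerun_if_changed(grammar: &Path) -> String {
    format!("cargo:rerun-if-changed={}", grammar.display())
}
```

Checking a second variable in addition to `OUT_DIR` guards against the case where something
other than Cargo happened to set `OUT_DIR`.
*/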
//!
//! ### Defaults
//! When using [`Builder::with_cargo_script_output`], a number of reasonable defaults are set:
//!
//! By default, the output directory is set to the `OUT_DIR` environment variable.
//! By default, the generated parser name is `parser.rs` and the generated grammar action file is
//! `grammar_trait.rs`.
//!
//! You can then include the generated parser in your crate like this:
//! ```ignore
//! mod parser {
//!     include!(concat!(env!("OUT_DIR"), "/parser.rs"));
//! }
//! ```
//!
//! ### Tradeoffs
//! The disadvantage of using this mode (or using Cargo build scripts in general)
//! is that it adds the `parol` crate as an explicit build dependency.
//!
//! Although this doesn't increase the runtime binary size, it does increase the initial compile
//! times.
//! If someone just wants to `cargo install <your crate>`, Cargo will have to download and execute
//! `parol` to generate your parser code.
//!
//! Contributors to your project (who modify your grammar) will have to download and invoke parol
//! anyway, so this cost primarily affects initial compile times. Also, Cargo is very intelligent
//! about caching build script outputs.
//!
//! Despite the impact on initial compiles, this is somewhat traditional in the Rust community.
//! It's [the recommended way to use `bindgen`](https://rust-lang.github.io/rust-bindgen/library-usage.html)
//! and it's the only way to use [`pest`](https://pest.rs/).
//!
//! If you are really concerned about compile times, you can use explicit output (below).
//!
//! ## Explicitly controlling Output Locations
//! If you want more control over the location of generated grammar files,
//! you can invoke [`Builder::with_explicit_output_dir`] to explicitly set an output directory.
//!
//! In addition you must explicitly name your output parser and action files,
//! or the configuration will give an error.
//!
//! This is used to power the command line `parol` tool, and is useful for additional control.
//!
//! Any configured *output* paths (including generated parsers, expanded grammars, etc)
//! are resolved relative to this base output using [Path::join]. This means that specifying
//! absolute paths overrides this explicit base directory.
//!
//! The grammar input file is resolved in the regular manner.
//! It does not use the "output" directory.
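/*!
The effect of resolving output paths with [Path::join] can be shown with the standard library
alone. In this small sketch, `resolve_output_path` is a free-standing stand-in for the builder's
internal resolution step, and the directory and file names are made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Stand-in for the internal resolution step: join the configured
/// output path onto the base output directory.
fn resolve_output_path(output_dir: &Path, p: &Path) -> PathBuf {
    // `Path::join` replaces the base entirely when `p` is absolute.
    output_dir.join(p)
}
```

With a base of `generated`, the relative path `parser.rs` resolves to `generated/parser.rs`,
while a Unix-style absolute path such as `/abs/parser.rs` is returned unchanged.
*/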
//!
//! ### Interaction with version control
//! When using [`Builder::with_cargo_script_output`], the output is put in a subdirectory of the
//! `target` directory and excluded from version control.
//!
//! This is useful if you want to ignore changes in generated code.
//!
//! However, when specifying an explicit output directory (with [`Builder::with_explicit_output_dir`]),
//! you may have to include the generated sources explicitly into the build process. One way is
//! indicated above, where the `include!` macro is used.
//!
//! Otherwise, you would probably set the output to a subdirectory of `src`.
//! This means that the files are version controlled and you would have to commit them whenever
//! changes are made.
//!
//! ## Using the CLI directly
//! Note that explicitly specifying the output directory doesn't avoid running parol on
//! `cargo install`.
//!
//! It does not increase the initial build speed, and still requires compiling and invoking `parol`.
//!
//! If you really want to avoid adding `parol` as a build dependency,
//! you need to invoke the CLI manually to generate the parser sources ahead of time.
//!
//! Using a build script requires adding a build dependency, and cargo will unconditionally execute
//! build scripts on first install.
//! While Cargo's build script caching is excellent, it only activates on recompiles.
//!
//! As such, using the CLI manually is really the only way to improve (initial) compile times.
//!
//! It is (often) not worth it, because it is inconvenient, and the impact only happens on *initial* compiles.
//!
//! ## API Completeness
//! Anything you can do with the main `parol` executable, you should also be able to do with this API.
//!
//! That is because the main executable is just a wrapper around the API.
//!
//! However, a couple more advanced features use unstable/internal APIs (see below).
//!
//! As a side note, the CLI does not require you to specify an output location.
//! You can run `parol -f grammar.parol` just fine and it will generate no output.
//!
//! In build scripts, this is typically a mistake (so it errors by default).
//! If you want to disable this sanity check, use [`Builder::disable_output_sanity_checks`].
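/*!
The sanity check amounts to requiring both a parser output and an actions output before any
generation starts. The following standalone sketch mirrors that rule under simplified
assumptions: `ConfigError` and `check_outputs` are hypothetical names used only here, not part
of this crate's API.

```rust
use std::path::PathBuf;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingParserOutputFile,
    MissingActionOutputFile,
}

/// Reject configurations that would silently discard generated code,
/// unless the caller has switched the check off.
fn check_outputs(
    parser: Option<&PathBuf>,
    actions: Option<&PathBuf>,
    sanity_checks: bool,
) -> Result<(), ConfigError> {
    if sanity_checks {
        if parser.is_none() {
            return Err(ConfigError::MissingParserOutputFile);
        }
        if actions.is_none() {
            return Err(ConfigError::MissingActionOutputFile);
        }
    }
    Ok(())
}
```

A missing expanded grammar output is deliberately not an error in this sketch, since that file
is optional.
*/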
//!
//! ### Internal APIs
//! The main `parol` command needs a couple of features that do not fit nicely into this API
//! (or interact closely with the crate's internals).
//!
//! Because of that, there are a number of APIs explicitly marked as unstable or internal.
//! Some of these are public and some are private.
//!
//! Expect breaking changes both before and after 1.0 (but especially before).
#![deny(missing_docs)]

use std::collections::BTreeMap;
use std::convert::TryFrom;
use std::path::{Path, PathBuf};
use std::{env, fs};

use crate::config::{CommonGeneratorConfig, ParserGeneratorConfig, UserTraitGeneratorConfig};
use crate::generators::export_node_types::{NodeTypesExporter, NodeTypesInfo};
use crate::generators::node_kind_enum_generator::NodeKindTypesGenerator;
use crate::parser::GrammarType;
use crate::{
    GrammarConfig, GrammarTypeInfo, LRParseTable, LookaheadDFA, MAX_K, ParolGrammar,
    UserTraitGenerator,
};
use clap::{Parser, ValueEnum};
use parol_macros::parol;
use parol_runtime::{ParseTree, Result};

/// Contains all attributes that should be inserted optionally on top of the generated trait source.
/// * Used in the Builder API. Therefore it must be public.
#[derive(Clone, Debug, Parser, ValueEnum)]
pub enum InnerAttributes {
    /// Suppresses clippy warnings like these: `warning: this function has too many arguments (9/7)`
    AllowTooManyArguments,
}

impl std::fmt::Display for InnerAttributes {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            InnerAttributes::AllowTooManyArguments => {
                write!(f, "#![allow(clippy::too_many_arguments)]")
            }
        }
    }
}

/// The default maximum lookahead
///
/// This is used both for the CLI and for the builder.
pub const DEFAULT_MAX_LOOKAHEAD: usize = 5;
/// The default name of the generated grammar module.
pub const DEFAULT_MODULE_NAME: &str = "grammar";
/// The default name of the user type that implements grammar parsing.
pub const DEFAULT_USER_TYPE_NAME: &str = "Grammar";

fn is_build_script() -> bool {
    // Although only `OUT_DIR` is necessary for our purposes, it's possible someone else set it.
    // Check for a second one to make sure we're actually running under cargo
    // See full list of environment variables here: https://is.gd/K6LyzQ
    env::var_os("OUT_DIR").is_some() && env::var_os("CARGO_MANIFEST_DIR").is_some()
}

/// Builds the configuration for generating and analyzing `parol` grammars.
///
/// A grammar file is required for almost all possible operations (set with [Builder::grammar_file]).
///
/// Does not actually generate anything until finished.
#[derive(Clone)]
pub struct Builder {
    /// The base output directory
    output_dir: PathBuf,
    grammar_file: Option<PathBuf>,
    /// Output file for expanded grammar
    expanded_grammar_output_file: Option<PathBuf>,
    /// Output file for the generated parser source
    parser_output_file: Option<PathBuf>,
    /// Output file for the generated actions files.
    actions_output_file: Option<PathBuf>,
    /// The output file for the generated syntree node wrappers
    node_kind_enum_output_file: Option<PathBuf>,
    pub(crate) user_type_name: String,
    pub(crate) module_name: String,
    cargo_integration: bool,
    max_lookahead: usize,
    /// By default, we want to require that the parser output file is specified.
    /// Otherwise we're just wasting time outputting to /dev/null.
    ///
    /// The CLI needs to be able to override this (mostly for debugging), hence the option.
    output_sanity_checks: bool,
    /// Activate the minimization of boxed types in the generated parser
    pub(crate) minimize_boxed_types: bool,
    /// Internal debugging for CLI.
    debug_verbose: bool,
    /// Generate range information for AST types
    range: bool,
    /// Generate typed syntree node wrappers
    enum_kind: bool,
    /// Inner attributes to insert at the top of the generated trait source.
    inner_attributes: Vec<InnerAttributes>,
    /// Enables trimming of the parse tree during parsing.
    /// Generates the call to trim_parse_tree on the parser object before the call of parse.
    pub(crate) trim_parse_tree: bool,
    /// Disables the error recovery mechanism in the generated parser
    pub(crate) disable_recovery: bool,
    /// The language to generate code for
    pub(crate) language: crate::config::Language,
}

impl Builder {
    /// Create a new builder for use in a Cargo build script (`build.rs`).
    ///
    /// This is the recommended default way to get started.
    ///
    /// All the outputs are set relative to the `OUT_DIR` environment variable,
    /// as is standard for [Cargo build script outputs](https://doc.rust-lang.org/stable/cargo/reference/build-scripts.html#outputs-of-the-build-script).
    ///
    /// This sets sensible defaults for every output file name.
    ///
    /// | Method name                    | CLI Option           | Default (relative) name |
    /// | -------------------------------|----------------------|-------------------------|
    /// | `parser_output_file`           | `--parser` or `-p`   | "parser.rs"             |
    /// | `actions_output_file`          | `--actions` or `-a`  | "grammar_trait.rs"      |
    /// | `expanded_grammar_output_file` | `--expanded` or `-e` | "grammar-exp.par"       |
    ///
    /// See the module documentation for how to include these files into your project.
    ///
    /// Panics if used outside of a cargo build script.
    pub fn with_cargo_script_output() -> Self {
        assert!(is_build_script(), "Cannot use outside of a cargo script");
        // Don't worry! $OUT_DIR is unique for every package
        let out_dir = env::var_os("OUT_DIR").unwrap();
        let mut builder = Self::with_explicit_output_dir(out_dir);
        // Set those reasonable defaults we promised
        builder
            .parser_output_file("parser.rs")
            .actions_output_file("grammar_trait.rs")
            .node_kind_enums_output_file("node_kind.rs")
            .expanded_grammar_output_file("grammar-exp.par");
        // Cargo integration should already be enabled (because we are a build script)
        assert!(builder.cargo_integration);
        builder
    }
    /// Internal utility to resolve a path relative to the output directory
    fn resolve_output_path(&self, p: impl AsRef<Path>) -> PathBuf {
        self.output_dir.join(p)
    }
    /// Create a new builder with an explicitly specified output directory.
    ///
    /// This requires that output files be specified explicitly,
    /// unless this check is disabled with [`Builder::disable_output_sanity_checks`]
    ///
    /// If this detects running inside a build script,
    /// it will automatically enable cargo integration.
    ///
    /// If output files are specified using absolute paths,
    /// it overrides this explicit output dir.
    ///
    /// See module docs on "explicit output mode" for more details.
    pub fn with_explicit_output_dir(output: impl AsRef<Path>) -> Self {
        /*
         * Most of these correspond to CLI options.
         */
        Builder {
            output_dir: PathBuf::from(output.as_ref()),
            grammar_file: None,
            cargo_integration: is_build_script(),
            debug_verbose: false,
            range: false,
            enum_kind: false,
            max_lookahead: DEFAULT_MAX_LOOKAHEAD,
            module_name: String::from(DEFAULT_MODULE_NAME),
            user_type_name: String::from(DEFAULT_USER_TYPE_NAME),
            // In this mode, the user must specify explicit outputs.
            // The default is /dev/null (`None`)
            parser_output_file: None,
            actions_output_file: None,
            node_kind_enum_output_file: None,
            expanded_grammar_output_file: None,
            minimize_boxed_types: false,
            inner_attributes: Vec::new(),
            // By default, we require that output files != /dev/null
            output_sanity_checks: true,
            trim_parse_tree: false,
            disable_recovery: false,
            language: crate::config::Language::Rust,
        }
    }
    /// By default, we require that the generated parser and action files are not discarded.
    ///
    /// This disables that check (used for the CLI).
    ///
    /// NOTE: When using [`Builder::with_cargo_script_output`], these are automatically inferred.
    pub fn disable_output_sanity_checks(&mut self) -> &mut Self {
        self.output_sanity_checks = false;
        self
    }
    /// Set the output location for the generated parser.
    ///
    /// If you are using [Builder::with_cargo_script_output],
    /// the default output is "$OUT_DIR/parser.rs".
    ///
    /// If you are using an explicitly specified output directory, then this option is *required*.
    pub fn parser_output_file(&mut self, p: impl AsRef<Path>) -> &mut Self {
        self.parser_output_file = Some(self.resolve_output_path(p));
        self
    }
    /// Set the actions output location for the generated parser.
    ///
    /// If you are using [Builder::with_cargo_script_output],
    /// the default output is "$OUT_DIR/grammar_trait.rs".
    ///
    /// If you are using an explicitly specified output directory, then this option is *required*.
    pub fn actions_output_file(&mut self, p: impl AsRef<Path>) -> &mut Self {
        self.actions_output_file = Some(self.resolve_output_path(p));
        self
    }
    /// Set the output location for the expanded grammar.
    ///
    /// If you are using [Builder::with_cargo_script_output],
    /// the default output is "$OUT_DIR/grammar-exp.par".
    ///
    /// If you are using an explicitly specified output directory and do not set this option,
    /// no expanded grammar is written.
    pub fn expanded_grammar_output_file(&mut self, p: impl AsRef<Path>) -> &mut Self {
        self.expanded_grammar_output_file = Some(self.resolve_output_path(p));
        self
    }
    /// Set the output location for the generated node kind enum.
    ///
    /// The output does not contain any `parol_runtime` dependencies, so you can specify
    /// "../other_crate/src/node_kind.rs" as the output file even if the other crate does not
    /// have `parol_runtime` as a dependency.
    ///
    /// If you are using [Builder::with_cargo_script_output],
    /// the default output is "$OUT_DIR/node_kind.rs".
    pub fn node_kind_enums_output_file(&mut self, p: impl AsRef<Path>) -> &mut Self {
        self.node_kind_enum_output_file = Some(self.resolve_output_path(p));
        self
    }
    /// Explicitly enable/disable cargo integration.
    ///
    /// This is automatically set to true if you are running a build script,
    /// and is `false` otherwise.
    pub fn set_cargo_integration(&mut self, enabled: bool) -> &mut Self {
        self.cargo_integration = enabled;
        self
    }
    /// Set the grammar file used as input for parol.
    ///
    /// This is required for most operations.
    ///
    /// Does not check that the file exists.
    pub fn grammar_file(&mut self, grammar: impl AsRef<Path>) -> &mut Self {
        self.grammar_file = Some(PathBuf::from(grammar.as_ref()));
        self
    }
    /// Set the name of the user type that implements the language processing
    pub fn user_type_name(&mut self, name: &str) -> &mut Self {
        self.user_type_name = name.into();
        self
    }
    /// Set the name of the user module that implements the language processing
    ///
    /// This is the module that contains the [Self::user_type_name]
    pub fn user_trait_module_name(&mut self, name: &str) -> &mut Self {
        self.module_name = name.into();
        self
    }
    /// Set the maximum lookahead for the generated parser.
    ///
    /// If nothing is specified, the default lookahead is [DEFAULT_MAX_LOOKAHEAD].
    ///
    /// Returns a [BuilderError] if the lookahead is greater than [crate::MAX_K].
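    /**
Because this setter is fallible, it behaves differently from the other chaining setters. The
pattern can be sketched on a toy type as follows. Everything in the sketch is hypothetical:
`Config`, `LookaheadError` and in particular the value chosen for `MAX_K` are assumptions made
for illustration only.

```rust
/// Assumed upper bound, chosen only for this sketch.
const MAX_K: usize = 10;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum LookaheadError {
    TooLarge,
}

/// Toy configuration with a single setting.
#[derive(Debug)]
struct Config {
    max_lookahead: usize,
}

impl Config {
    /// Validate first, then store; the old value survives a rejected call.
    fn max_lookahead(&mut self, k: usize) -> Result<&mut Self, LookaheadError> {
        if k > MAX_K {
            return Err(LookaheadError::TooLarge);
        }
        self.max_lookahead = k;
        Ok(self)
    }
}
```

Since the return type is a `Result`, a call chain has to handle the error (for example with `?`
or `unwrap`) before continuing with further setters.
    */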
    pub fn max_lookahead(&mut self, k: usize) -> std::result::Result<&mut Self, BuilderError> {
        if k > MAX_K {
            return Err(BuilderError::LookaheadTooLarge);
        }
        self.max_lookahead = k;
        Ok(self)
    }
    /// Print verbose debug information to the standard output
    ///
    /// This is an internal method, and is only intended for the CLI.
    #[doc(hidden)]
    pub fn debug_verbose(&mut self) -> &mut Self {
        self.debug_verbose = true;
        self
    }
    /// Generate range information for AST types
    pub fn range(&mut self) -> &mut Self {
        self.range = true;
        self
    }
    /// Generate node kind enums `TerminalKind` and `NonTerminalKind`
    pub fn node_kind_enums(&mut self) -> &mut Self {
        self.enum_kind = true;
        self
    }
    /// Inserts the given inner attributes at the top of the generated trait source.
    pub fn inner_attributes(&mut self, inner_attributes: Vec<InnerAttributes>) -> &mut Self {
        self.inner_attributes = inner_attributes;
        self
    }
    /// Activate the minimization of boxed types in the generated parser
    pub fn minimize_boxed_types(&mut self) -> &mut Self {
        self.minimize_boxed_types = true;
        self
    }
    /// Enables trimming of the parse tree during parsing.
    /// Generates the call to trim_parse_tree on the parser object before the call of parse.
    pub fn trim_parse_tree(&mut self) -> &mut Self {
        self.trim_parse_tree = true;
        self
    }

    /// Disables the error recovery mechanism in the generated parser
    pub fn disable_recovery(&mut self) -> &mut Self {
        self.disable_recovery = true;
        self
    }

    /// Set the language to generate code for
    pub fn language(&mut self, language: crate::config::Language) -> &mut Self {
        self.language = language;
        self
    }

    /// Begin the process of generating the grammar
    /// using the specified listener (or None if no listener is desired).
    ///
    /// Returns an error if the build is *configured* incorrectly.
    /// In a build script, this is typically a programmer error.
    pub fn begin_generation_with<'l>(
        &mut self,
        listener: Option<&'l mut dyn BuildListener>,
    ) -> std::result::Result<GrammarGenerator<'l>, BuilderError> {
        /*
         * For those concerned about performance:
         *
         * The overhead of all these copies and dyn dispatch is marginal
         * in comparison to the actual grammar generation.
         */
        let grammar_file = self
            .grammar_file
            .as_ref()
            .ok_or(BuilderError::MissingGrammarFile)?
            .clone();
        if self.output_sanity_checks {
            // Check that we have outputs
            if self.parser_output_file.is_none() {
                return Err(BuilderError::MissingParserOutputFile);
            } else if self.actions_output_file.is_none() {
                return Err(BuilderError::MissingActionOutputFile);
            }
            // Missing expanded grammar file is fine. They might not want that.
        }
        Ok(GrammarGenerator {
            listener: MaybeBuildListener(listener),
            grammar_file,
            builder: self.clone(),
            state: None,
            grammar_config: None,
            lookahead_dfa_s: None,
            parse_table: None,
            type_info: None,
        })
    }
    /// Generate the parser, writing it to the pre-configured output files.
    pub fn generate_parser(&mut self) -> Result<()> {
        self.begin_generation_with(None)
            .map_err(|e| parol!("Misconfigured parol generation: {}", e))?
            .generate_parser()
    }
    /// Generate the parser, writing it to the pre-configured output files, and export the node info.
    pub fn generate_parser_and_export_node_infos(&mut self) -> Result<NodeTypesInfo> {
        self.begin_generation_with(None)
            .map_err(|e| parol!("Misconfigured parol generation: {}", e))?
            .generate_parser_and_export_node_infos()
    }
}

impl CommonGeneratorConfig for Builder {
    fn user_type_name(&self) -> &str {
        &self.user_type_name
    }

    fn module_name(&self) -> &str {
        &self.module_name
    }

    fn minimize_boxed_types(&self) -> bool {
        self.minimize_boxed_types
    }

    fn range(&self) -> bool {
        self.range
    }

    fn node_kind_enums(&self) -> bool {
        self.enum_kind
    }

    fn language(&self) -> crate::config::Language {
        self.language
    }
}

impl ParserGeneratorConfig for Builder {
    fn trim_parse_tree(&self) -> bool {
        self.trim_parse_tree
    }

    fn recovery_disabled(&self) -> bool {
        self.disable_recovery
    }
}

impl UserTraitGeneratorConfig for Builder {
    fn inner_attributes(&self) -> &[InnerAttributes] {
        &self.inner_attributes
    }
}

/// Represents in-process grammar generation.
///
/// Most of the time you will want to use [Builder::generate_parser] to bypass this completely.
///
/// This is an advanced API, and unless stated otherwise, all its methods are unstable (see module docs).
///
/// The lifetime parameter `'l` refers to the lifetime of the optional listener.
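/**
The generator runs its stages in a fixed order (parse, expand, post-process, write output), and
each stage checks that the previous one has completed. The toy state machine below illustrates
that ordering only. `Stage` and `Pipeline` are hypothetical stand-ins invented for this sketch;
the real stages do actual work and return `Result`s.

```rust
/// Stages a run passes through, in order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Stage {
    Parsed,
    Expanded,
    PostProcessed,
    Finished,
}

/// Toy pipeline that only tracks which stage was completed last.
struct Pipeline {
    state: Option<Stage>,
}

impl Pipeline {
    fn new() -> Self {
        Pipeline { state: None }
    }
    /// Advance to `next`, panicking if the previous stage is not `expected`.
    fn step(&mut self, expected: Option<Stage>, next: Stage) {
        assert_eq!(self.state, expected, "stages must run in order");
        self.state = Some(next);
    }
    fn parse(&mut self) {
        self.step(None, Stage::Parsed);
    }
    fn expand(&mut self) {
        self.step(Some(Stage::Parsed), Stage::Expanded);
    }
    fn post_process(&mut self) {
        self.step(Some(Stage::Expanded), Stage::PostProcessed);
    }
    fn write_output(&mut self) {
        self.step(Some(Stage::PostProcessed), Stage::Finished);
    }
    /// Run every stage in the required order.
    fn run(&mut self) {
        self.parse();
        self.expand();
        self.post_process();
        self.write_output();
    }
}
```

Calling a stage out of order trips the assertion, which is why the convenience entry points that
run all stages in sequence are the safer choice.
*/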
pub struct GrammarGenerator<'l> {
    /// The build listener
    ///
    /// This is a fairly advanced feature
    listener: MaybeBuildListener<'l>,
    pub(crate) grammar_file: PathBuf,
    builder: Builder,
    state: Option<State>,
    pub(crate) grammar_config: Option<GrammarConfig>,
    lookahead_dfa_s: Option<BTreeMap<String, LookaheadDFA>>,
    parse_table: Option<LRParseTable>,
    type_info: Option<GrammarTypeInfo>,
}
impl GrammarGenerator<'_> {
    /// Generate the parser, writing it to the pre-configured output files.
    pub fn generate_parser(&mut self) -> Result<()> {
        self.parse()?;
        self.expand()?;
        self.post_process()?;
        self.write_output()?;
        Ok(())
    }

    /// Generate the parser, writing it to the pre-configured output files, and export the node info.
    pub fn generate_parser_and_export_node_infos(&mut self) -> Result<NodeTypesInfo> {
        self.parse()?;
        self.expand()?;
        self.post_process()?;
        self.write_output()?;
        self.export_node_infos()
    }

    //
    // Internal APIs
    //

    #[doc(hidden)]
    pub fn parse(&mut self) -> Result<()> {
        assert_eq!(self.state, None);
        let input = fs::read_to_string(&self.grammar_file).map_err(|e| {
            parol!(
                "Can't read grammar file {}: {}",
                self.grammar_file.display(),
                e
            )
        })?;
        if self.builder.cargo_integration {
            println!("cargo:rerun-if-changed={}", self.grammar_file.display());
        }
        let mut parol_grammar = ParolGrammar::new();
        let syntax_tree = crate::parser::parse(&input, &self.grammar_file, &mut parol_grammar)?;
        self.listener
            .on_initial_grammar_parse(&syntax_tree, &input, &parol_grammar)?;
        self.grammar_config = Some(GrammarConfig::try_from(parol_grammar)?);

        let _grammar_config = self.grammar_config.as_ref().unwrap();

        self.state = Some(State::Parsed);
        Ok(())
    }
    #[doc(hidden)]
    pub fn expand(&mut self) -> Result<()> {
        assert_eq!(self.state, Some(State::Parsed));
        let grammar_config = self.grammar_config.as_mut().unwrap();
        // NOTE: it's up to the listener to add appropriate error context
        self.listener
            .on_intermediate_grammar(IntermediateGrammar::Untransformed, &*grammar_config)?;
        let cfg =
            crate::check_and_transform_grammar(&grammar_config.cfg, grammar_config.grammar_type)?;

        // To have at least a preliminary version of the expanded grammar,
        // even when the next checks fail, we write out the expanded grammar here.
        // In most cases it will be overwritten further on.
        if let Some(ref expanded_file) = self.builder.expanded_grammar_output_file {
            fs::write(
                expanded_file,
                crate::render_par_string(grammar_config, /* add_index_comment */ true)?,
            )
            .map_err(|e| parol!("Error writing left-factored grammar! {}", e))?;
        }

        // Exchange original grammar with transformed one
        grammar_config.update_cfg(cfg);

        self.listener
            .on_intermediate_grammar(IntermediateGrammar::Transformed, &*grammar_config)?;
        if let Some(ref expanded_file) = self.builder.expanded_grammar_output_file {
            fs::write(
                expanded_file,
                crate::render_par_string(grammar_config, /* add_index_comment */ true)?,
            )
            .map_err(|e| parol!("Error writing left-factored grammar!: {}", e))?;
        }
        self.state = Some(State::Expanded);
        Ok(())
    }
    #[doc(hidden)]
    pub fn post_process(&mut self) -> Result<()> {
        assert_eq!(self.state, Some(State::Expanded));
        let grammar_config = self.grammar_config.as_mut().unwrap();
        match grammar_config.grammar_type {
            GrammarType::LLK => {
                self.lookahead_dfa_s = Some(
                    crate::calculate_lookahead_dfas(grammar_config, self.builder.max_lookahead)
                        .map_err(|e| {
                            parol!("Lookahead calculation for the given grammar failed!: {}", e)
                        })?,
                );

                if self.builder.debug_verbose {
                    print!(
                        "Lookahead DFAs:\n{:?}",
                        self.lookahead_dfa_s.as_ref().unwrap()
                    );
                }

                // Update maximum lookahead size for scanner generation
                grammar_config.update_lookahead_size(
                    self.lookahead_dfa_s
                        .as_ref()
                        .unwrap()
                        .iter()
                        .max_by_key(|(_, dfa)| dfa.k)
                        .unwrap()
                        .1
                        .k,
                );
            }
            GrammarType::LALR1 => {
                self.parse_table = Some(crate::calculate_lalr1_parse_table(grammar_config)?.0);
                grammar_config.update_lookahead_size(1);
            }
        }

        if self.builder.debug_verbose {
            print!("\nGrammar config:\n{grammar_config:?}");
        }
        self.state = Some(State::PostProcessed);
        Ok(())
    }
    #[doc(hidden)]
    pub fn write_output(&mut self) -> Result<()> {
        assert_eq!(self.state, Some(State::PostProcessed));
        let grammar_config = self.grammar_config.as_mut().unwrap();

        let language = self.builder.language();

        let lexer_source = match language {
            crate::config::Language::Rust => {
                crate::generate_lexer_source(grammar_config, &self.builder)
                    .map_err(|e| parol!("Failed to generate lexer source!: {}", e))?
            }
            crate::config::Language::CSharp => {
                crate::generators::cs_lexer_generator::generate_lexer_source(
                    grammar_config,
                    &self.builder,
                )
                .map_err(|e| parol!("Failed to generate C# lexer source!: {}", e))?
            }
        };

        let mut type_info: GrammarTypeInfo =
            GrammarTypeInfo::try_new(&self.builder.user_type_name)?;

        let user_trait_source = match language {
            crate::config::Language::Rust => {
                let user_trait_generator = UserTraitGenerator::new(grammar_config);
                user_trait_generator.generate_user_trait_source(
                    &self.builder,
                    grammar_config.grammar_type,
                    &mut type_info,
                )?
            }
            crate::config::Language::CSharp => {
                let user_trait_generator =
                    crate::generators::cs_user_trait_generator::CSUserTraitGenerator::new(
                        grammar_config,
                    );
                user_trait_generator.generate_user_trait_source(
                    &self.builder,
                    grammar_config.grammar_type,
                    &mut type_info,
                )?
            }
        };

        if let Some(ref user_trait_file_out) = self.builder.actions_output_file {
            fs::write(user_trait_file_out, user_trait_source)
                .map_err(|e| parol!("Error writing generated user trait source!: {}", e))?;
            if language == crate::config::Language::Rust {
                crate::try_format(user_trait_file_out)?;
            }
        } else if self.builder.debug_verbose {
            println!("\nSource for semantic actions:\n{user_trait_source}");
        }

        let ast_type_has_lifetime = type_info.symbol_table.has_lifetime(type_info.ast_enum_type);

        let parser_source = match language {
            crate::config::Language::Rust => match grammar_config.grammar_type {
                GrammarType::LLK => crate::generate_parser_source(
                    grammar_config,
                    &lexer_source,
                    &self.builder,
                    self.lookahead_dfa_s.as_ref().unwrap(),
                    ast_type_has_lifetime,
                )?,
                GrammarType::LALR1 => crate::generate_lalr1_parser_source(
                    grammar_config,
                    &lexer_source,
                    &self.builder,
                    self.parse_table.as_ref().unwrap(),
                    ast_type_has_lifetime,
                )?,
            },
            crate::config::Language::CSharp => match grammar_config.grammar_type {
                GrammarType::LLK => crate::generators::cs_parser_generator::generate_parser_source(
                    grammar_config,
                    &lexer_source,
                    &self.builder,
                    self.lookahead_dfa_s.as_ref().unwrap(),
                    ast_type_has_lifetime,
                )?,
                GrammarType::LALR1 => {
                    crate::generators::cs_parser_generator::generate_lalr1_parser_source(
                        grammar_config,
                        &lexer_source,
                        &self.builder,
                        self.parse_table.as_ref().unwrap(),
                        ast_type_has_lifetime,
                    )?
                }
            },
        };

        if let Some(ref parser_file_out) = self.builder.parser_output_file {
            // The lexer source is embedded in the parser source, so this file holds both.
            fs::write(parser_file_out, parser_source)
                .map_err(|e| parol!("Error writing generated parser source!: {}", e))?;
            if language == crate::config::Language::Rust {
                crate::try_format(parser_file_out)?;
            }
        } else if self.builder.debug_verbose {
            println!("\nParser source:\n{parser_source}");
        }

        if let Some(ref syntree_node_wrappers_output_file) = self.builder.node_kind_enum_output_file
        {
            let mut f = fs::OpenOptions::new()
                .write(true)
                .create(true)
                .truncate(true)
                .open(syntree_node_wrappers_output_file)
                .map_err(|e| parol!("Error opening generated syntree node wrappers!: {}", e))?;
            let syntree_node_types_generator =
                NodeKindTypesGenerator::new(grammar_config, &type_info);
            syntree_node_types_generator
                .generate(&mut f)
                .map_err(|e| parol!("Error generating syntree node wrappers!: {}", e))?;
            crate::try_format(syntree_node_wrappers_output_file)?;
        }

        self.state = Some(State::Finished);
        self.type_info = Some(type_info);

        Ok(())
    }

    fn export_node_infos(&self) -> Result<NodeTypesInfo> {
        let node_types_exporter = NodeTypesExporter::new(
            self.grammar_config.as_ref().unwrap(),
            self.type_info.as_ref().unwrap(),
        );
        Ok(node_types_exporter.generate())
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    Parsed,
    Expanded,
    PostProcessed,
    Finished,
}

/// A build listener, for advanced customization of the parser generation.
///
/// This is used by the CLI to implement some of its more advanced options (without cluttering up the main interface).
///
/// The details of this trait are considered unstable.
#[allow(
    unused_variables, // All these variables are going to be unused because these are NOP impls....
    missing_docs, // This is fine because this is internal.
)]
pub trait BuildListener {
    fn on_initial_grammar_parse(
        &mut self,
        syntax_tree: &ParseTree,
        input: &str,
        grammar: &ParolGrammar,
    ) -> Result<()> {
        Ok(())
    }
    fn on_intermediate_grammar(
        &mut self,
        stage: IntermediateGrammar,
        config: &GrammarConfig,
    ) -> Result<()> {
        Ok(())
    }
}
#[derive(Default)]
struct MaybeBuildListener<'l>(Option<&'l mut dyn BuildListener>);
impl BuildListener for MaybeBuildListener<'_> {
    fn on_initial_grammar_parse(
        &mut self,
        syntax_tree: &ParseTree,
        input: &str,
        grammar: &ParolGrammar,
    ) -> Result<()> {
        if let Some(ref mut inner) = self.0 {
            inner.on_initial_grammar_parse(syntax_tree, input, grammar)
        } else {
            Ok(())
        }
    }

    fn on_intermediate_grammar(
        &mut self,
        stage: IntermediateGrammar,
        config: &GrammarConfig,
    ) -> Result<()> {
        if let Some(ref mut inner) = self.0 {
            inner.on_intermediate_grammar(stage, config)
        } else {
            Ok(())
        }
    }
}

/// Marks an intermediate stage of the grammar, in between the various transformations that parol does.
///
/// The last transformation is returned by [IntermediateGrammar::LAST]
///
/// This enum gives some degree of access to the individual transformations that parol does.
/// As such, the specific variants are considered unstable.
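/**
Since the variants may change, comparing against the `LAST` constant is more robust than naming
a specific variant. The standalone sketch below shows the idea on a local copy of the enum.
`GrammarStage` and `is_final` are hypothetical names that exist only in this example.

```rust
/// Local stand-in for the stage enum; derived ordering follows declaration order.
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum GrammarStage {
    Untransformed,
    Transformed,
}

impl GrammarStage {
    /// The final stage, kept in one place so callers need not name a variant.
    const LAST: GrammarStage = GrammarStage::Transformed;
}

/// Returns true only for the final stage.
fn is_final(stage: GrammarStage) -> bool {
    stage == GrammarStage::LAST
}
```

A listener that only cares about the fully transformed grammar could apply a check like
`is_final` to the stage it receives and ignore everything else.
*/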
#[non_exhaustive]
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum IntermediateGrammar {
    /// Writes the untransformed parsed grammar
    ///
    /// NOTE: This is different than the initially parsed syntax tree
    Untransformed,
    /// Writes the transformed parsed grammar
    Transformed,
}
impl IntermediateGrammar {
    /// The last transformation.
    pub const LAST: IntermediateGrammar = IntermediateGrammar::Transformed;
}

/// An error that occurs configuring the [struct@Builder].
#[derive(Debug, thiserror::Error)]
#[non_exhaustive]
pub enum BuilderError {
    /// Indicates that the operation needs a grammar file as input,
    /// but that one has not been specified.
    #[error("Missing an input grammar file")]
    MissingGrammarFile,
    /// Indicates that no parser output file has been specified.
    ///
    /// This would discard the generated parser, which is typically a mistake.
    #[error("No parser output file specified")]
    MissingParserOutputFile,
    /// Indicates that no action output file has been specified.
    ///
    /// This would discard the generated actions, which is typically a mistake.
    #[error("No action output file specified")]
    MissingActionOutputFile,
    /// Indicates that the specified lookahead is too large
    #[error("Maximum lookahead is {}", MAX_K)]
    LookaheadTooLarge,
}
918}