//! The Flussab crate is a collection of utilities for writing parsers.
//!
//! Currently Flussab aims to provide just enough to write parsers with a certain combination of
//! constraints for which Flussab's author did not find a suitable existing solution. It is not
//! intended as a replacement for any such existing solution targeting a different set of
//! constraints.
//!
//! The target use case is efficient, continuously streaming, interactive, error-reporting,
//! non-backtracking, recursive-descent parsers for text-based, binary and mixed formats:
//!
//! * _Efficient_: The straightforward implementation of a parser should be fast, and the library
//!   should not get in the way when performing more complex optimizations that are possible in
//!   hand-rolled parsers (e.g. using data parallelism to scan multiple input bytes at once).
//!
//!   The efficiency is realized by a) using an input stream with a mutable cursor instead of
//!   threading the current position through return values and b) (when using [`ByteReader`])
//!   deferring IO error checks. Together, these greatly simplify the control and data flow of the
//!   parsers, which makes it easier for the compiler to optimize the code and often results in
//!   overall faster code.
//!
//!   Note that deferred error checking does not mean that IO errors are ignored or handled
//!   imprecisely. Instead, when an IO error occurs, the parsing logic will exhaust all current
//!   choices, which forces a parse error. At the point where a parse error is generated, outside
//!   of the hot path, we can then check whether an earlier IO error caused it.
//!
//! * _Continuously streaming_: It should be possible to parse and process data which does not fit
//!   into memory, i.e. it is possible to read more data as it is parsed and processed, without
//!   buffering everything read so far. In particular, it is not sufficient to require restarting
//!   the parser when the input was incomplete, nor to require external code that splits a
//!   potentially infinite input stream into individual chunks.
//!
//! * _Interactive_: The parser should be usable in a REPL with multi-line input without negatively
//!   affecting efficiency when parsing data in bulk. This means it must be able to handle input
//!   data reads that return fewer bytes than requested, so that line-buffering is usable.
//!
//! * _Error reporting_: When parsing fails, a parser must be able to generate useful error
//!   messages. The goal here isn't to replicate the excellent error reporting of e.g. `rustc`, but
//!   to offer enough information that e.g. a user who generated a gigabyte-sized input file using
//!   a bunch of `println!`s can quickly find the typo they made.
//!
//! * _Non-backtracking_ (a.k.a. _predictive_): Continuous streaming already requires some limits
//!   to backtracking. Avoiding backtracking completely also avoids many ways of accidentally
//!   making a recursive-descent parser very slow for some inputs. This does mean that some form
//!   of look-ahead is required to parse most formats. This can be realized by using the provided
//!   [`ByteReader`], which has dynamic look-ahead, and/or by using a tokenizer for formats where
//!   no look-ahead is required after tokenization.
//!
//!   (Note that Flussab doesn't stop you from handling backtracking yourself, it just does not
//!   provide any help for that.)
//!
//! * _Recursive-descent_: Parsers are written as simple Rust functions that take a mutable
//!   reference to the input stream and either return the parsed value after consuming some input
//!   or alternatively indicate failure. This also enables writing parser combinators as
//!   higher-order functions, although currently Flussab provides only a minimal set of
//!   combinators.
//!
//! The provided infrastructure for combining parsers is entirely agnostic of how the input stream
//! is handled, as long as it itself keeps track of the input position. This crate provides
//! [`ByteReader`] as one choice, but a [`Peekable`][std::iter::Peekable] iterator of tokens would
//! work as well.
//!
//! ## Using Flussab
//!
//! Parsers are written very much like manual recursive-descent parsers. If our format has a
//! _something_, we would have a function like this:
//! ```rust
//! # use flussab::*;
//! # type Something = ();
//! # type ParseError = ();
//! fn something(input: &mut ByteReader) -> Parsed<Something, ParseError> {
//! #   let input_as_expected = false;
//!     // Check whether `input` contains a _something_ at the current position
//!     if !input_as_expected {
//!         return Fallthrough;
//!     }
//!     // Read _something_ from `input` and advance the input position.
//! #   let something = ();
//! #   let parsing_failed_after_advancing_the_input = false;
//! #   let some_error = ();
//!     if parsing_failed_after_advancing_the_input {
//!         return Res(Err(some_error));
//!     }
//!
//!     Res(Ok(something))
//! }
//! ```
//!
//! Here [`Parsed`] and [`ByteReader`] are provided by this crate and are good entry points for the
//! documentation. A `Parsed` value is just a wrapper for a [`Result`] which adds [`Fallthrough`] as
//! a third option besides `Ok` and `Err`. It indicates that the input does not match and that the
//! parser did not consume any input.
//!
//! Instead of manually returning `Fallthrough` or `Res(Err(..))`, the provided methods of
//! [`Parsed`] and [`Result`] are often used to combine smaller parsers into larger ones.
#![warn(missing_docs)]

mod byte_reader;
mod byte_writer;
mod parser;

pub mod text;

pub use byte_reader::ByteReader;
pub use byte_writer::ByteWriter;
pub use parser::{Parsed, Result, ResultExt};

pub use Parsed::*;