A unique feature of unsynn is that one can define a parser as a composition of other
parsers on the fly without the need to define custom structures. This is done by using the
Cons and Either types. The Cons type is used to define a parser that is a
conjunction of two to four other parsers, while the Either type is used to define a
parser that is a disjunction of two to four other parsers.
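For illustration, here is a minimal sketch of such an on-the-fly composition. It assumes the string-to-token-iterator conversion and the Parse entry point described later in this documentation, and uses the PunctAny and LiteralInteger types also documented further below:

```rust
use unsynn::*;

// A sketch only: `answer = 42` parsed as a conjunction of three parsers,
// falling back to a bare identifier. No custom struct is defined.
let mut tokens = "answer = 42".to_token_iter();
let _parsed = <Either<Cons<Ident, PunctAny<'='>, LiteralInteger>, Ident>>::parse(&mut tokens)
    .expect("one of the two alternatives matches");
```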
This module provides parsers for types that contain possibly multiple values. This
includes stdlib types like Option, Vec, Box, Rc, RefCell and types
for delimited and repeated values with numbered repeats.
For easier composition we define the Delimited type here which is a T
followed by an optional delimiting entity D. This is used by the
DelimitedVec type to parse a list of entities separated by a delimiter.
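A minimal sketch of the container parsers, under the same assumptions about the parse entry point as above:

```rust
use unsynn::*;

// A sketch only: Option<T> parses zero or one T, Vec<T> parses as many T
// as it can; both are std types acting as parsers here.
let mut tokens = "foo 1 2 3".to_token_iter();
let _name = <Option<Ident>>::parse(&mut tokens).expect("Some(foo)");
let _numbers = <Vec<TokenTree>>::parse(&mut tokens).expect("the remaining tokens");
```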
This module contains the fundamental parsers. These parsers are the basic tokens from
proc_macro2 and a few other ones defined by unsynn. These are the terminal entities when
parsing tokens. Being able to parse TokenTree and TokenStream allows one to parse
opaque entities where internal details are left out. The Cached type is used to cache
the string representation of the parsed entity. The Nothing type is used to match
without consuming any tokens. The Except type is used to match when the next token
does not match the given type. The EndOfStream type is used to match the end of the
stream when no tokens are left. The HiddenState type is used to hold additional
information that is not part of the parsed syntax.
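A minimal sketch of the terminal parsers, using the same assumed entry points as above:

```rust
use unsynn::*;

// A sketch only: Nothing matches without consuming, TokenTree consumes the
// whole parenthesized group as one opaque tree, EndOfStream requires that
// no tokens are left.
let mut tokens = "( a b c )".to_token_iter();
let _nothing = Nothing::parse(&mut tokens).expect("always matches");
let _group = TokenTree::parse(&mut tokens).expect("one opaque token tree");
let _end = EndOfStream::parse(&mut tokens).expect("no tokens left");
```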
Groups are a way to group tokens together. They are used to represent the contents between
(), {}, [] or no delimiters at all. This module provides parser implementations for
opaque group types with defined delimiters and the GroupContaining types that parse the
surrounding delimiters and the content of a group type.
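A minimal sketch; the concrete name ParenthesisGroupContaining is an assumption for one of the delimiter-specific GroupContaining types:

```rust
use unsynn::*;

// A sketch only: parse the surrounding parentheses and their content in
// one go; the content type is itself an arbitrary parser.
let mut tokens = "( foo bar )".to_token_iter();
let _group = <ParenthesisGroupContaining<Vec<Ident>>>::parse(&mut tokens)
    .expect("a parenthesized list of identifiers");
```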
This module provides a set of literal types that can be used to parse and tokenize
literals. The literals are parsed from the token stream and can be used to represent the
parsed value. unsynn defines only simplified literals, such as integers, characters and
strings. The literals here are not full Rust syntax; full literal syntax will be defined
in the unsynn-rust crate.
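A minimal sketch, assuming LiteralInteger and LiteralString as the names of the simplified literal types described below:

```rust
use unsynn::*;

// A sketch only: a plain decimal integer and a double quoted string; these
// simplified literals are not full Rust literal syntax.
let mut tokens = r#"42 "hello""#.to_token_iter();
let _number = LiteralInteger::parse(&mut tokens).expect("a plain decimal integer");
let _string = LiteralString::parse(&mut tokens).expect("a double quoted string");
```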
This module contains types for punctuation tokens. These are used to represent single and
multi-character punctuation tokens. For single-character punctuation tokens, there are
the PunctAny, PunctAlone and PunctJoint types.
Combined punctuation tokens are represented by Operator. The operator! macro can be
used to define custom operators.
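A minimal sketch of the three single-character punct types, under the same assumptions about the parse entry point as above:

```rust
use unsynn::*;

// A sketch only: in `==` the first `=` has Spacing::Joint and the second
// Spacing::Alone, so the three punct types line up as shown.
let mut tokens = "+ ==".to_token_iter();
let _plus = <PunctAny<'+'>>::parse(&mut tokens).expect("any spacing");
let _joint = <PunctJoint<'='>>::parse(&mut tokens).expect("joined to the next punct");
let _alone = <PunctAlone<'='>>::parse(&mut tokens).expect("not joined to anything");
```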
Getting the underlying string is expensive, as it always allocates a new String.
This type caches the string representation of a given entity. Note that this is
only reliable for fundamental entities that represent a single token. Spacing between
composed tokens is not stable and should be considered informal only.
This is used when one wants to parse a list of entities separated by delimiters. The
delimiter is optional and can be None, e.g. when the entity is the last in the
list. Usually the delimiter will be some simple punctuation token, but it is not limited
to that.
Since the delimiter in Delimited<T,D> is optional, a Vec<Delimited<T,D>> would parse
consecutive values even without delimiters. DelimitedVec<T,D> will stop parsing after
the first value without a delimiter.
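A minimal sketch of that stopping behavior:

```rust
use unsynn::*;

// A sketch only: in `a, b c` the DelimitedVec stops after `b` because it
// has no trailing delimiter, so `c` stays in the stream for the next parser.
let mut tokens = "a, b c".to_token_iter();
let _list = <DelimitedVec<Ident, PunctAny<','>>>::parse(&mut tokens).expect("a and b");
let _rest = Ident::parse(&mut tokens).expect("c was not consumed");
```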
Succeeds when the next token matches T. The token will be removed from the stream but not stored.
Consequently the ToTokens implementations will panic with a message that it cannot be emitted.
This can only be used when a token should be present but not stored and never emitted.
Sometimes one wants to compose types or create structures for unsynn that have members that
are not part of the parsed syntax but add some additional information. This struct can be
used to hold such members while still using the Parser and ToTokens trait
implementations automatically generated by the unsynn!{} macro or composition syntax.
HiddenState will not consume any tokens when parsing and will not emit any tokens when
generating a TokenStream. On parsing it is initialized with a default value. It has
Deref and DerefMut implemented to access the inner value.
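A minimal sketch, with the unsynn!{} invocation shown in simplified, assumed form:

```rust
use unsynn::*;

// A sketch only: `count` is bookkeeping, not syntax; it is initialized
// with its Default when parsing and emits nothing when generating tokens.
unsynn! {
    struct NamedThing {
        name: Ident,
        count: HiddenState<usize>,
    }
}
```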
A Vec<T> that is filled up to the first appearance of a terminating S. This S may
be a subset of T, thus parsing becomes lazy. This is the same as
Cons<Vec<Cons<Except<S>,T>>,S> but more convenient and efficient.
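A minimal sketch, assuming the LazyVec<T, S> parameter order implied by the description above:

```rust
use unsynn::*;

// A sketch only: collect token trees lazily up to the terminating `;`;
// everything after the terminator stays in the stream.
let mut tokens = "a b c ; rest".to_token_iter();
let _upto = <LazyVec<TokenTree, PunctAny<';'>>>::parse(&mut tokens)
    .expect("`a b c` up to and including the `;`");
```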
A literal string ("hello"), byte string (b"hello"), character ('a'),
byte character (b'a'), an integer or floating point number with or without
a suffix (1, 1u8, 2.3, 2.3f32).
A simple unsigned 128 bit integer. This is the simplest form of integer parsing. Note
that only decimal integers without any other characters, signs or suffixes are supported;
this is not full Rust syntax.
A double quoted string literal ("hello"). The quotes are included in the value. Note
that this is a simplified string literal: only double quoted strings are supported, and
this is not full Rust syntax, e.g. byte and C string literals are not supported.
A unit that always matches without consuming any tokens. This is required when one wants
to parse a Repeats without a delimiter. Note that using Nothing as primary entity
in a Vec, LazyVec, DelimitedVec or Repeats will result in an infinite
loop.
Operators made from up to four ASCII punctuation characters. Unused characters default to \0.
Custom operators can be defined with the operator! macro. All but the last character are
Spacing::Joint. Attention must be paid when operators have the same prefix: the shorter
ones need to be tried first.
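A minimal sketch of a custom arrow operator built directly from the Operator type:

```rust
use unsynn::*;

// A sketch only: a two character `->` operator; the unused third and
// fourth characters default to '\0' as described above.
type Arrow = Operator<'-', '>'>;

let mut tokens = "->".to_token_iter();
let _arrow = Arrow::parse(&mut tokens).expect("`-` joint with the following `>`");
```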
Like DelimitedVec<T,D> but with a minimum and maximum (inclusive) number of elements.
Parsing will succeed when at least the minimum number of elements is reached and stop at
the maximum number. The delimiter D defaults to Nothing to parse sequences which
don’t have delimiters.
Skips over expected tokens. Will parse and consume the tokens but not store them.
Consequently the ToTokens implementations will not output any tokens.
This trait provides the user facing API to parse grammatical entities. It is implemented
for anything that implements the Parser trait. The methods here encapsulate the
iterator that is used for parsing in a transaction. This iterator is always
Copy. Instead of using a peekable iterator or implementing deeper peeking, parse clones
this iterator to make access transactional: when parsing succeeds the transaction
is committed, otherwise it is rolled back.
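A minimal sketch of this transactional behavior, assuming LiteralInteger from the literal module as the failing alternative:

```rust
use unsynn::*;

// A sketch only: the failed attempt is rolled back, so the identifier is
// still available for the second attempt.
let mut tokens = "identifier".to_token_iter();
assert!(LiteralInteger::parse(&mut tokens).is_err()); // nothing was consumed
assert!(Ident::parse(&mut tokens).is_ok());           // the same token parses here
```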
A trait for parsing a repeating T with a minimum and maximum limit.
Sometimes the number of elements to be parsed is determined at runtime, e.g. when a number
of header items needs a matching number of values.
unsynn defines its own ToTokens trait to be able to implement it for std container types.
This is similar to the ToTokens from the quote crate but adds some extra methods and is
implemented for more types. Moreover, the to_token_iter() method is the main entry point
for creating an iterator that can be used for parsing.
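A minimal sketch of the round trip from tokens to a parsed value and back to a token iterator:

```rust
use unsynn::*;

// A sketch only: anything that implements ToTokens, including a parsed
// Vec<Ident>, can be turned back into a parsing iterator with
// to_token_iter().
let mut tokens = "foo bar baz".to_token_iter();
let idents = <Vec<Ident>>::parse(&mut tokens).expect("three identifiers");
let mut roundtrip = idents.to_token_iter();
let _again = <Vec<Ident>>::parse(&mut roundtrip).expect("the same three identifiers");
```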
We track the position of the error by counting tokens. This trait is implemented for
references to shadow counted TokenIter, and for usize. The latter allows passing in a
position directly, or using usize::MAX in case no position data is available (which will
make this error the final one when upgrading).
Type alias for the iterator type we use for parsing. This Iterator is Clone and produces
&TokenTree. The shadow counter counts tokens in the background to track progress which
is used to keep the error that made the most progress in disjunctive parsers.