Crate facet_macros_parse

§facet-macros-parse


Parses Rust syntax for facet-macros

§Sponsors

Thanks to all individual sponsors:

GitHub Sponsors Patreon

…along with corporate sponsors:

AWS Zed Depot

…without whom this work could not exist.

§Special thanks

The facet logo was drawn by Misiasart.

§License

Licensed under either of:

Apache License, Version 2.0
MIT license

at your option.

Modules§

CHANGELOG
Changelog
combinator
A unique feature of unsynn is that one can define a parser as a composition of other parsers on the fly without the need to define custom structures. This is done by using the Cons and Either types. The Cons type is used to define a parser that is a conjunction of two to four other parsers, while the Either type is used to define a parser that is a disjunction of two to four other parsers.
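The Cons/Either composition can be sketched with std-only stand-ins. These are illustrative toy parsers over &str, not the crate's actual token-stream types:

```rust
// Toy model of the Cons/Either idea: a parser maps input to
// Option<(value, rest)>. `cons` is conjunction (A then B), `either` is
// disjunction (A, or else B, tried in that order).
type ParseFn<'a, T> = fn(&'a str) -> Option<(T, &'a str)>;

fn digit(input: &str) -> Option<(char, &str)> {
    let c = input.chars().next()?;
    c.is_ascii_digit().then(|| (c, &input[1..]))
}

fn letter(input: &str) -> Option<(char, &str)> {
    let c = input.chars().next()?;
    c.is_ascii_alphabetic().then(|| (c, &input[1..]))
}

fn cons<'a, A, B>(
    a: ParseFn<'a, A>,
    b: ParseFn<'a, B>,
    input: &'a str,
) -> Option<((A, B), &'a str)> {
    let (va, rest) = a(input)?;
    let (vb, rest) = b(rest)?;
    Some(((va, vb), rest))
}

fn either<'a, T>(a: ParseFn<'a, T>, b: ParseFn<'a, T>, input: &'a str) -> Option<(T, &'a str)> {
    a(input).or_else(|| b(input))
}
```

In the real crate, Cons and Either are types whose Parser implementations do the equivalent over a token iterator.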
container
This module provides parsers for types that contain possibly multiple values. This includes stdlib types like Option, Vec, Box, Rc, RefCell and types for delimited and repeated values with numbered repeats.
debug
Debug utilities for token stream inspection
delimited
For easier composition we define the Delimited type here, which is a T followed by an optional delimiting entity D. This is used by the DelimitedVec type to parse a list of entities separated by a delimiter.
dynamic
This module contains the types for dynamic transformations after parsing.
expressions
Expression parser building blocks for creating operator precedence parsers.
fundamental
This module contains the fundamental parsers. These are the basic tokens from proc_macro2/proc_macro and a few other ones defined by unsynn. These are the terminal entities when parsing tokens. Being able to parse TokenTree and TokenStream allows one to parse opaque entities where internal details are left out. The Cached type is used to cache the string representation of the parsed entity. The Nothing type is used to match without consuming any tokens. The Except type is used to match when the next token does not match the given type. The EndOfStream type is used to match the end of the stream when no tokens are left. The HiddenState type is used to hold additional information that is not part of the parsed syntax.
group
Groups are a way to group tokens together. They are used to represent the contents between (), {}, [] or no delimiters at all. This module provides parser implementations for opaque group types with defined delimiters and the GroupContaining types that parse the surrounding delimiters and content of a group type.
literal
This module provides a set of literal types that can be used to parse and tokenize literals. The literals are parsed from the token stream and can be used to represent the parsed value. unsynn defines only simplified literals, such as integers, characters and strings. The literals here are not full Rust syntax, which will be defined in the unsynn-rust crate. There are Literal* types for Integer, Character and String to parse simple literals, and ConstInteger<V> and ConstCharacter<V>, which must match an exact value. The latter two also implement Default, thus they can be used to create constant tokens. There is no ConstString; constant literal strings can be constructed with IntoLiteralString<T>.
operator
Combined punctuation tokens are represented by Operator. The crate::operator! macro can be used to define custom operators.
predicates
Parse predicates for compile-time parser control.
punct
This module contains types for punctuation tokens. These are used to represent single and multi character punctuation tokens. For single character punctuation tokens, there are the PunctAny, PunctAlone and PunctJoint types.
rust_types
Parsers for Rust's types.
transform
This module contains the transforming parsers. These are the parsers that add, remove, replace or reorder tokens while parsing.

Macros§

assert_tokens_eq
Helper macro that asserts that two entities implementing ToTokens result in the same TokenStream. Used in tests to ensure that the output of parsing is as expected. This macro allows two forms:
format_cached_ident
Generates a CachedIdent from a format specification.
format_ident
Generates a Ident from a format specification.
format_literal
Generates a Literal from a format specification. Unlike format_literal_string!, this does not add quotes and can be used to create any kind of literal, such as integers or floats.
format_literal_string
Generates a LiteralString from a format specification. Quote characters around the string are automatically added.
operator
Define types matching operators (punctuation sequences).
quote
unsynn provides its own quote!{} macro that translates tokens into a TokenStream while interpolating variables prefixed with a Pound sign (#). This is similar to what the quote! macro from the quote crate does, but it is not as powerful. There is no #(...) repetition (yet).

Structs§

AllOf
Logical AND: 2-4 predicates must all succeed.
AngleTokenTree
Parses either a TokenTree or <...> grouping (which is not a Group as far as proc-macros are concerned).
AnyOf
Logical OR: 2-4 predicates, at least one must succeed.
Attribute
Represents an attribute annotation on a field, typically in the form #[attr].
BraceGroup
An opaque group of tokens within a Brace
BraceGroupContaining
Parseable content within a Brace
BracketGroup
An opaque group of tokens within a Bracket
BracketGroupContaining
Parseable content within a Bracket
Cached
Getting the underlying string is expensive as it always allocates a new String. This type caches the string representation of a given entity. Note that this is only reliable for fundamental entities that represent a single token. Spacing between composed tokens is not stable and should be considered informal only.
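The caching described above can be sketched with a std-only type. This is an illustration of the idea, not the crate's Cached, which also implements the parsing traits:

```rust
use std::fmt::Display;

// Sketch of the Cached idea: converting a value to a string allocates a
// new String every time, so do it once at construction and keep the
// result alongside the value.
struct Cached<T: Display> {
    value: T,
    string: String, // cached string representation
}

impl<T: Display> Cached<T> {
    fn new(value: T) -> Self {
        let string = value.to_string(); // allocate once, up front
        Cached { value, string }
    }
    fn as_str(&self) -> &str {
        &self.string // no allocation on repeated access
    }
    fn value(&self) -> &T {
        &self.value
    }
}
```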
ChildInner
Inner value for #[facet(child)]
Cons
Conjunctive A followed by B and optional C and D. When C and D are not used, they are set to Nothing.
ConstCharacter
A constant char of value V. Must match V and also has Default implemented to create a LiteralCharacter with value V.
ConstInteger
A constant u128 integer of value V. Must match V and also has Default implemented to create a LiteralInteger with value V.
DefaultEqualsInner
Inner value for #[facet(default = …)]
Delimited
This is used when one wants to parse a list of entities separated by delimiters. The delimiter is optional and can be None, e.g. when the entity is the last in the list. Usually the delimiter will be some simple punctuation token, but it is not limited to that.
DelimitedVec
Since the delimiter in Delimited<T,D> is optional, a Vec<Delimited<T,D>> would parse consecutive values even without delimiters. DelimitedVec<T, D, MIN, MAX, P> stops parsing based on the MIN/MAX number of elements and on the policy defined by P, which can be one of TrailingDelimiter.
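A std-only sketch of the Delimited/DelimitedVec idea over a character stream (illustrative only; the real types parse token streams and additionally enforce MIN/MAX and the TrailingDelimiter policy):

```rust
// Parse alphanumeric items separated by `delim`; each item may be
// followed by an optional delimiter, and a missing delimiter ends the
// list. Returns the items and the unparsed rest.
fn delimited_vec(input: &str, delim: char) -> (Vec<char>, &str) {
    let mut items = Vec::new();
    let mut rest = input;
    while let Some(c) = rest.chars().next() {
        if !c.is_ascii_alphanumeric() {
            break; // next token is not an item: stop before it
        }
        items.push(c);
        rest = &rest[c.len_utf8()..];
        match rest.chars().next() {
            Some(d) if d == delim => rest = &rest[d.len_utf8()..], // optional delimiter
            _ => break, // no delimiter: this was the last element
        }
    }
    (items, rest)
}
```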
DeserializeWithInner
Inner value for #[facet(deserialize_with = …)]
Disable
Always fails without consuming tokens.
Discard
Succeeds when the next token matches T. The token will be removed from the stream but not stored. Consequently the ToTokens implementations will panic with a message that it cannot be emitted. This can only be used when a token should be present but not stored and never emitted.
DocInner
Represents documentation for an item.
DynNode
Parses a T (default: Nothing). Allows one to replace it at runtime, after parsing, with anything else implementing ToTokens. This is backed by an Rc. One can replace any cloned occurrences or only the current one.
Enable
Always succeeds without consuming tokens.
EndOfStream
Matches the end of the stream when no tokens are left.
Enum
Represents an enum definition. e.g., #[repr(u8)] pub enum MyEnum<T> where T: Clone { Variant1, Variant2(T) }.
EnumVariantLike
Represents a variant of an enum, including the optional discriminant value
Error
Error type for parsing.
Except
Succeeds when the next token does not match T. Will not consume any tokens. Usually this has to be followed with a conjunctive match such as Cons<Except<T>, U> or followed by another entry in a struct or tuple.
Expect
Succeeds when the next token would match T. Will not consume any tokens. This is similar to peeking.
FacetAttr
Represents a facet attribute that can contain specialized metadata.
FlattenInner
Inner value for #[facet(flatten)]
GenericParams
Represents the generic parameters of a struct or enum definition, enclosed in angle brackets. e.g., <'a, T: Trait, const N: usize>.
Group
A delimited token stream.
GroupContaining
Any kind of Group G with parseable content C. The content C must parse exhaustively; an EndOfStream is automatically implied.
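The "content must parse exhaustively" rule can be illustrated with a std-only sketch, in which an implied end-of-stream check rejects leftover input (a toy over &str, not the crate's actual GroupContaining):

```rust
// Parse a run of digits as the group's "content"; the implied
// EndOfStream check fails if any tokens are left over.
fn parse_exhaustive(content: &str) -> Result<Vec<char>, String> {
    let digits: Vec<char> = content.chars().take_while(|c| c.is_ascii_digit()).collect();
    let rest = &content[digits.len()..]; // digits are ASCII, one byte each
    if rest.is_empty() {
        Ok(digits) // implied EndOfStream matched
    } else {
        Err(format!("unexpected trailing tokens: {rest:?}"))
    }
}
```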
HiddenState
Sometimes one wants to compose types or create structures for unsynn that have members that are not part of the parsed syntax but add some additional information. This struct can be used to hold such members while still using the Parser and ToTokens trait implementations automatically generated by the unsynn!{} macro or composition syntax. HiddenState will not consume any tokens when parsing and will not emit any tokens when generating a TokenStream. On parsing it is initialized with a default value. It has Deref and DerefMut implemented to access the inner value.
Ident
A word of Rust code, which may be a keyword or legal variable name.
Insert
Injects tokens without parsing anything.
IntoIdent
Parses T and concatenates all its elements into a single identifier by removing all characters that are not valid in identifiers. When T implements Default, such as single-string (non-group) keywords, operators and Const* literals, it can be used to create an IntoIdent on the fly. Note that construction may still fail when one tries to create an invalid identifier, such as one starting with digits.
IntoLiteralString
Parses T and creates a LiteralString from it. When T implements Default, such as single-string (non-group) keywords, operators and Const* literals, it can be used to create an IntoLiteralString on the fly.
IntoTokenStream
Parses T and keeps it as an opaque TokenStream. This is useful when one wants to parse a sequence of tokens and keep it as an opaque unit or re-parse it later as something else.
Invalid
A unit that always fails to match. This is useful as default for generics. See how Either<A, B, C, D> uses this for unused alternatives.
InvariantInner
Represents invariants for a type.
KChild
The “child” keyword. Matches: child,
KConst
The “const” keyword. Matches: const,
KCrate
The “crate” keyword. Matches: crate,
KDefault
The “default” keyword. Matches: default,
KDenyUnknownFields
The “deny_unknown_fields” keyword. Matches: deny_unknown_fields,
KDeserializeWith
The “deserialize_with” keyword. Matches: deserialize_with,
KDoc
The “doc” keyword. Matches: doc,
KEnum
The “enum” keyword. Matches: enum,
KFacet
The “facet” keyword. Matches: facet,
KFlatten
The “flatten” keyword. Matches: flatten,
KIn
The “in” keyword. Matches: in,
KInvariants
The “invariants” keyword. Matches: invariants,
KMut
The “mut” keyword. Matches: mut,
KOpaque
The “opaque” keyword. Matches: opaque,
KPub
The “pub” keyword. Matches: pub,
KRename
The “rename” keyword. Matches: rename,
KRenameAll
The “rename_all” keyword. Matches: rename_all,
KRepr
The “repr” keyword. Matches: repr,
KSensitive
The “sensitive” keyword. Matches: sensitive,
KSerializeWith
The “serialize_with” keyword. Matches: serialize_with,
KSkipSerializing
The “skip_serializing” keyword. Matches: skip_serializing,
KSkipSerializingIf
The “skip_serializing_if” keyword. Matches: skip_serializing_if,
KStruct
The “struct” keyword. Matches: struct,
KTransparent
The “transparent” keyword. Matches: transparent,
KTypeTag
The “type_tag” keyword. Matches: type_tag,
KWhere
The “where” keyword. Matches: where,
LazyVec
A Vec<T> that is filled up to the first appearance of a terminating S. This S may be a subset of T, thus parsing becomes lazy. This is the same as Cons<Vec<Cons<Except<S>,T>>,S> but more convenient and efficient.
LazyVecUntil
A Vec<T> that is filled up to the first appearance of a terminating S. This S may be a subset of T, thus parsing becomes lazy. Unlike LazyVec this variant does not consume the final terminator. This is the same as Vec<Cons<Except<S>,T>> but more convenient.
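The LazyVec/LazyVecUntil difference can be sketched std-only with a char-level stand-in for the token-level types: collect items until the terminator, either consuming it (LazyVec-like) or leaving it in the stream (LazyVecUntil-like).

```rust
// Collect chars up to the first terminator. `consume_terminator` selects
// LazyVec-like behavior (terminator consumed) vs LazyVecUntil-like
// behavior (terminator left in the rest). Fails if no terminator appears.
fn lazy_vec(input: &str, terminator: char, consume_terminator: bool) -> Option<(Vec<char>, &str)> {
    let mut items = Vec::new();
    let mut rest = input;
    loop {
        let c = rest.chars().next()?; // stream ended before the terminator: fail
        if c == terminator {
            if consume_terminator {
                rest = &rest[c.len_utf8()..];
            }
            return Some((items, rest));
        }
        rest = &rest[c.len_utf8()..];
        items.push(c);
    }
}
```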
LeftAssocExpr
Left-associative infix operator expression.
Lifetime
Represents a lifetime annotation, like 'a.
Literal
A literal string ("hello"), byte string (b"hello"), character ('a'), byte character (b'a'), an integer or floating point number with or without a suffix (1, 1u8, 2.3, 2.3f32).
LiteralCharacter
A single quoted character literal ('x').
LiteralInteger
A simple unsigned 128-bit integer. This is the simplest form of integer parsing. Note that only decimal integers without any other characters, signs or suffixes are supported; this is not full Rust syntax.
LiteralString
A double quoted string literal ("hello"). The quotes are included in the value. Note that this is a simplified string literal: only double quoted strings are supported, this is not full Rust syntax; e.g. byte and C string literals are not supported.
NonEmptyOption
NonEmptyOption<T> prevents Option from matching when T can succeed with empty input. It ensures None is returned when no tokens remain, regardless of whether T could succeed on an empty stream. This is crucial when parsing optional trailing content that should only match if tokens are actually available to consume.
NonEmptyTokenStream
Since parsing a TokenStream succeeds even when no tokens are left, this type is used to parse a TokenStream that is not empty.
NonParseable
A unit that cannot be parsed. This is useful as a diagnostic placeholder for parsers that are (yet) unimplemented. The nonparseable feature flag controls whether Parser and ToTokens will be implemented for it. This is useful in release builds that should not have any NonParseable left behind.
NoneGroup
An opaque group of tokens within a None
NoneGroupContaining
Parseable content within a None
Not
Logical NOT: succeeds if inner predicate fails.
Nothing
A unit that always matches without consuming any tokens. This is required when one wants to parse a Repeats without a delimiter. Note that using Nothing as primary entity in a Vec, LazyVec, DelimitedVec or Repeats will result in an infinite loop.
OneOf
Logical XOR: exactly one of 2-4 predicates must succeed.
Operator
Operators made from up to four ASCII punctuation characters. Unused characters default to \0. Custom operators can be defined with the crate::operator! macro. All but the last character are Spacing::Joint. Attention must be paid when operators have the same prefix; the shorter ones need to be tried first.
ParenthesisGroup
An opaque group of tokens within a Parenthesis
ParenthesisGroupContaining
Parseable content within a Parenthesis
PostfixExpr
Postfix unary operator expression.
PredicateCmp
Predicate that compares type A with type B at runtime.
PrefixExpr
Prefix unary operator expression.
Punct
A Punct is a single punctuation character like +, - or #.
PunctAlone
A single character punctuation token which is not followed by another punctuation character.
PunctAny
A single character punctuation token with any kind of Spacing.
PunctJoint
A single character punctuation token where the lexer joined it with the next Punct, or a single quote followed by an identifier (a Rust lifetime).
RenameAllInner
Inner value for #[facet(rename_all = …)]
RenameInner
Inner value for #[facet(rename = …)]
ReprInner
Represents the inner content of a repr attribute, typically used for specifying memory layout or representation hints.
RightAssocExpr
Right-associative infix operator expression.
SerializeWithInner
Inner value for #[facet(serialize_with = …)]
Skip
Skips over expected tokens. Will parse and consume the tokens but not store them. Consequently the ToTokens implementations will not output any tokens.
SkipSerializingIfInner
Inner value for #[facet(skip_serializing_if = …)]
SkipSerializingInner
Inner value for #[facet(skip_serializing)]
Span
A region of source code, along with macro expansion information.
StderrLog
A no-op debug parser that does nothing when the debug_grammar feature is disabled.
Struct
Represents a struct definition.
StructEnumVariant
Represents a struct-like enum variant. e.g., MyVariant { field1: u32, field2: String }.
StructField
Represents a field within a regular struct definition. e.g., pub name: String.
Swap
Swaps the order of two entities.
TokenIter
Iterator type for parsing token streams.
TokenStream
An abstract stream of tokens, or more concretely a sequence of token trees.
TokensRemain
Succeeds only when tokens remain in the stream.
TupleField
Represents a field within a tuple struct definition. e.g., pub String.
TupleVariant
Represents a tuple-like enum variant. e.g., MyVariant(u32, String).
TypeTagInner
Inner value for #[facet(type_tag = …)]
UnitVariant
Represents a unit-like enum variant. e.g., MyVariant.
VerbatimDisplay
Display the verbatim tokens until the given token.
WhereClause
Represents a single predicate within a where clause. e.g., T: Trait or 'a: 'b.
WhereClauses
Represents a where clause attached to a definition. e.g., where T: Trait, 'a: 'b.

Enums§

AdtDecl
Represents an algebraic data type (ADT) declaration, which can be either a struct or enum.
AttributeInner
Represents the inner content of an attribute annotation.
ConstOrMut
Represents either the const or mut keyword, often used with pointers.
Delimiter
Describes how a sequence of token trees is delimited.
Either
Disjunctive A or B or optional C or D tried in that order. When C and D are not used, they are set to Invalid.
EnumVariantData
Represents the different kinds of variants an enum can have.
ErrorKind
Actual kind of an error.
Expr
Represents a simple expression, currently only integer literals. Used potentially for const generic default values.
FacetInner
Represents the inner content of a facet attribute.
GenericParam
Represents a single generic parameter within a GenericParams list.
LifetimeOrTt
A lifetime or a token tree, used to gather lifetimes in type definitions.
Spacing
Whether a Punct is followed immediately by another Punct or followed by another token or whitespace.
StructKind
Represents the kind of a struct definition.
TokenTree
A single token or a delimited sequence of token trees (e.g. [1, (), ..]).
Vis
Represents visibility modifiers for items.

Traits§

DynamicTokens
Trait alias for any type that can be used in dynamic ToTokens contexts.
GroupDelimiter
Access to the surrounding Delimiter of a GroupContaining and its variants.
IParse
Extension trait for TokenIter that calls Parse::parse().
IntoTokenIter
Extension trait to convert iterators into TokenIter.
Parse
This trait provides the user facing API to parse grammatical entities. It is implemented for anything that implements the Parser trait. The methods here encapsulate the iterator that is used for parsing in a transaction. This iterator is always Clone. Instead of using a peekable iterator or implementing deeper peeking, parse clones this iterator to make access transactional: when parsing succeeds the transaction is committed, otherwise it is rolled back.
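The transactional scheme can be sketched with a std slice iterator. This is a simplified illustration of the clone/commit/rollback idea, not the crate's actual TokenIter:

```rust
// Clone-as-transaction: try parsing on a clone of the iterator and only
// overwrite the original on success. On failure the original iterator is
// untouched, i.e. the transaction is rolled back.
fn parse_keyword(iter: &mut std::slice::Iter<'_, &str>, keyword: &str) -> bool {
    let mut trial = iter.clone(); // open the transaction
    match trial.next() {
        Some(tok) if *tok == keyword => {
            *iter = trial; // commit: advance the real iterator
            true
        }
        _ => false, // roll back: `iter` was never advanced
    }
}
```

Because the failed attempt never touches the original iterator, alternatives (as in Either) can be tried one after another without any explicit backtracking bookkeeping.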
Parser
The Parser trait that must be implemented by anything we want to parse. We are parsing over a TokenIter (TokenStream iterator).
PredicateOp
Marker trait for compile-time parser predicates.
RangedRepeats
A trait for parsing a repeating T with a minimum and maximum limit. Sometimes the number of elements to be parsed is determined at runtime, e.g. a number of header items needs a matching number of values.
RefineErr
Helper trait for refining error type names. Every parser type in unsynn eventually tries to parse one of the fundamental types. When parsing fails, that fundamental type name is recorded as the expected type name of the error. Often this is not desired: a user wants to know the type of the parser that actually failed. Since, for simplicity and performance reasons, we don't want to keep a stack/vec of errors, we provide a way to register refined type names in errors. Note that this refinement should only be applied to leaves in the AST. Refining errors on composed types will lead to unexpected results.
ToTokenIter
Extension trait to convert TokenStreams into TokenIter.
ToTokens
unsynn defines its own ToTokens trait to be able to implement it for std container types. This is similar to the ToTokens from the quote crate but adds some extra methods and is implemented for more types. Moreover, the to_token_iter() method is the main entry point for creating an iterator that can be used for parsing.
TokenCount
We track the position of the error by counting tokens. This trait is implemented for references to shadow-counted TokenIter, and for usize. The latter allows passing in a position directly, or using usize::MAX when no position data is available (which will make this error be the final one when upgrading).
Transaction
Helper trait to make TokenIter transactional

Type Aliases§

And
&
AndAnd
&&
AndEq
&=
Any
Any number of T delimited by D or Nothing
Apostrophe
Represents the apostrophe ‘'’ operator. '
AsDefault
Parse a T and replace it with its default value. This is a zero-sized type. It can be used for no-allocation replacement elements in a Vec, since Vec has an optimization for zero-sized types where it won't allocate any memory but just acts as a counter.
Assign
=
At
@
AtLeast
At least N of T delimited by D or Nothing
AtMost
At most N of T delimited by D or Nothing
Backslash
\
Bang
!
Bounds
Represents type bounds, consisting of a colon followed by tokens until a comma, equals sign, or closing angle bracket is encountered.
CachedGroup
Group with cached string representation.
CachedIdent
Ident with cached string representation.
CachedLiteral
Literal with cached string representation.
CachedLiteralInteger
LiteralInteger with cached string representation.
CachedLiteralString
LiteralString with cached string representation.
CachedPunct
Punct with cached string representation.
CachedTokenTree
TokenTree (any token) with cached string representation.
Caret
^
CaretEq
^=
Colon
:
ColonDelimited
T followed by an optional :
ColonDelimitedVec
DelimitedVec of T delimited by : with P as policy for the last delimiter.
Comma
,
CommaDelimited
T followed by an optional ,
CommaDelimitedVec
DelimitedVec of T delimited by , with P as policy for the last delimiter.
Dollar
$
Dot
.
DotDelimited
T followed by an optional .
DotDelimitedVec
DelimitedVec of T delimited by . with P as policy for the last delimiter.
DotDot
..
DotDotEq
..=
DoubleSemicolon
Represents the double semicolon ‘::’ operator. ::
Ellipsis
...
Eq
Represents the ‘=’ operator. =
Equal
==
Exactly
Exactly N of T delimited by D or Nothing
FatArrow
=>
Ge
>=
Gt
>
InfixExpr
Generic infix operator expression.
LArrow
<-
Le
<=
LifetimeTick
' With Spacing::Joint
Lt
<
Many
One or more of T delimited by D or Nothing
Minus
-
MinusEq
-=
ModPath
Represents a module path, consisting of an optional path separator followed by a path-separator-delimited sequence of identifiers.
NonAssocExpr
Type alias for non-associative binary operators.
NotEqual
!=
Optional
Zero or one of T delimited by D or Nothing
Or
|
OrDefault
Tries to parse a T or inserts a D when that fails.
OrEq
|=
OrOr
||
PathSep
::
PathSepDelimited
T followed by an optional ::
PathSepDelimitedVec
DelimitedVec of T delimited by :: with P as policy for the last delimiter.
Percent
%
PercentEq
%=
Plus
+
PlusEq
+=
Pound
#
Question
?
RArrow
->
Repeats
DelimitedVec<T,D> with a minimum and maximum (inclusive) number of elements at first without defaults. Parsing will succeed when at least the minimum number of elements is reached and stop at the maximum number. The delimiter D defaults to Nothing to parse sequences which don’t have delimiters.
Replace
Parse-skip a T and insert a U: Default in its place. This is a zero-sized type.
Result
Result type for parsing.
Semi
Represents the ‘;’ operator. ;
Semicolon
;
SemicolonDelimited
T followed by an optional ;
SemicolonDelimitedVec
DelimitedVec of T delimited by ; with P as policy for the last delimiter.
Shl
<<
ShlEq
<<=
Shr
>>
ShrEq
>>=
Slash
/
SlashEq
/=
Star
*
StarEq
*=
Tilde
~
TokenStreamUntil
Parses a TokenStream until, but excluding T. The presence of T is mandatory.
VerbatimUntil
Parses tokens and groups until C is found on the current token tree level.