Crate derive_syn_parse

Derive macro for syn::parse::Parse

A common pattern when writing custom syn parsers is repeating <name>: input.parse()? for each field in the output. #[derive(Parse)] handles that for you, with some extra helpful customization.

§Usage

Using this crate is as simple as adding it to your ‘Cargo.toml’ and importing the derive macro:

# Cargo.toml

[dependencies]
derive-syn-parse = "0.2.0"

// your_file.rs
use derive_syn_parse::Parse;

#[derive(Parse)]
struct CustomParsable {
    // ...
}

The derived implementation of Parse always parses in the order that the fields are given. Note that deriving Parse is also available on enums. For more information, see the ‘Enum parsing’ section below.

This crate is intended for users who are already making heavy use of syn.

§Motivation

When writing rust code that makes heavy use of syn’s parsing functionality, we often end up writing things like:

use syn::parse::{Parse, ParseStream};
use syn::{Ident, Token, Type};

// A simplified struct field
//
//     x: i32
struct MyField {
    ident: Ident,
    colon_token: Token![:],
    ty: Type,
}

impl Parse for MyField {
    fn parse(input: ParseStream) -> syn::Result<Self> {
        Ok(MyField {
            ident: input.parse()?,
            colon_token: input.parse()?,
            ty: input.parse()?,
        })
    }
}

This is really repetitive! Ideally, we’d like to just #[derive(Parse)] and have it work, and for the most part we can. Adding #[derive(Parse)] to the previous struct produces an equivalent implementation of Parse:

use syn::{Ident, Token, Type};
use derive_syn_parse::Parse;

#[derive(Parse)]
struct MyField {
    ident: Ident,
    colon_token: Token![:],
    ty: Type,
}

Of course, there are more complicated cases. These are mainly covered below in the ‘Advanced Usage’ section.

§Advanced Usage

There are a few different facilities provided here, including:

  • Enum variant parsing,
  • Conditional field parsing,
  • Parsing within token trees (parens, brackets, etc.),
  • And much more!

Each of the sections below gives detailed information about how to use a particular component. Be warned - each section assumes a fair amount of knowledge about the relevant syn features.

Enum parsing

Parsing enums is a complex feature. When writing manual implementations of Parse, it doesn’t come up as often, but there are also typically many ways to do it: syn provides both forking the ParseBuffer and peeking to handle this, with the suggestion that peeking be preferred.

This library does not support forking; it tends to suffer from poor error messages and general inefficiency. That being said, manual implementations of Parse can and should be written in the rare cases when this library is insufficient.

We support peeking in a couple of different ways - with the #[peek] and #[peek_with] attributes. One peeking attribute is required for each enum variant. The general syntax tends to look like:

#[peek($TYPE, name = $NAME)]

and

#[peek_with($EXPR, name = $NAME)]

The name is provided in order to construct useful error messages when input doesn’t match any of the variants.

These essentially translate to:

if input.peek($TYPE) {
    // parse variant
} else {
    // parse other variants
}

and

if ($EXPR)(input) {
    // parse variant
} else {
    // parse other variants
}

Token Trees (parens, brackets, braces)

If derive macros had access to type information, we could auto-detect when a field contains any of syn::token::{Paren, Bracket, Brace}. Unfortunately, we can’t - and these don’t implement Parse, so they each have their own special attribute to mark them: #[paren], #[bracket], and #[brace], respectively.

These are typically used with the #[inside] attribute, which indicates that a field should be parsed inside a particular named token tree. This might look like:

use derive_syn_parse::Parse;
use syn::{Ident, token, Expr};

// Parses a simple function call - something like
//
//   so_long(and_thanks + for_all * the_fish)
#[derive(Parse)]
struct SingleArgFn {
    name: Ident,
    #[paren]
    arg_paren: token::Paren,
    #[inside(arg_paren)]
    arg: Expr,
}

The #[inside] attribute can - of course - be repeated with multiple token trees, though this may not necessarily produce the most readable type definitions.

For reference, the above code produces an implementation equivalent to:


use syn::parse::{Parse, ParseStream};

impl Parse for SingleArgFn {
    fn parse(input: ParseStream) -> syn::Result<Self> {
        let paren;
        Ok(SingleArgFn {
            name: input.parse()?,
            arg_paren: syn::parenthesized!(paren in input),
            arg: paren.parse()?,
        })
    }
}

Custom parse functions (#[call], #[parse_terminated])

Not every type worth parsing implements Parse, but we still might want to parse them - things like Vec<Attribute> or any Punctuated<_, _> type. In these cases, the available attributes mirror the methods on ParseBuffer.

For #[parse_terminated], there aren’t any parameters that can be specified - it’s common enough that it’s provided for those Punctuated fields.

Alternatively, #[call] has the syntax #[call( EXPR )], where EXPR is any expression implementing FnOnce(ParseStream) -> syn::Result<T>. Typically, this might be something like:

use derive_syn_parse::Parse;
use syn::{Attribute, Ident, Token};

// Parses a unit struct with attributes.
//
//     #[derive(Copy, Clone)]
//     struct S;
#[derive(Parse)]
struct UnitStruct {
    #[call(Attribute::parse_outer)]
    attrs: Vec<Attribute>,
    struct_token: Token![struct],
    name: Ident,
    semi_token: Token![;],
}

Unlike ParseBuffer::call, which only accepts functions of type fn(ParseStream) -> syn::Result<T>, #[call] allows any expression that can be called with the ParseStream. So one could - hypothetically - implement #[parse_if] with this:

// `Bar` stands in for any type implementing Parse.
#[derive(Parse)]
struct Foo {
    a: Option<Token![=>]>,
    #[call(|inp| match &a { Some(_) => Ok(Some(inp.parse()?)), None => Ok(None) })]
    b: Option<Bar>,
}

Though it’s probably best to just use #[parse_if] :)

Conditional field parsing (#[parse_if], #[peek])

When implementing Parse for structs, it is occasionally the case that certain fields are optional - or should only be parsed under certain circumstances. There are attributes for that!

Say we want to parse enums with the following, different syntax:

enum Foo {
    Bar: Baz,
    Qux,
}

where the equivalent Rust code would be:

enum Foo {
    Bar(Baz),
    Qux,
}

There are two ways we could parse the variants here – either with a colon and following type or with no colon or type. To handle this, we can write:

#[derive(Parse)]
struct Variant {
    name: Ident,
    // `syn` already supports optional parsing of simple tokens
    colon: Option<Token![:]>,
    // We only want to parse the trailing type if there's a colon:
    #[parse_if(colon.is_some())]
    ty: Option<Type>,
}

Note that in this case, ty must be an Option. In addition to conditional parsing based on the values of what’s already been parsed, we can also peek - just as described above in the section on parsing enums. The only difference here is that we do not need to provide a name for the optional field. We could have equally implemented the above as:

#[derive(Parse)]
struct Variant {
    name: Ident,
    #[peek(Token![:])]
    ty: Option<VariantType>,
}

#[derive(Parse)]
struct VariantType {
    colon: Token![:],
    ty: Type,
}

Temporary parses: Prefix & postfix

A pattern that often occurs when deriving Parse implementations is having many unused punctuation fields. Imagine a hypothetical implementation of field parsing with default values:

// A field with default values, parsing something like:
//
//   foo: Bar = Bar::new()
#[derive(Parse)]
struct Field {
    ident: Ident,
    colon: Token![:],
    ty: Type,
    eq: Option<Token![=]>,
    #[parse_if(eq.is_some())]
    expr: Option<Expr>,
}

Here, there are a couple of fields that probably won’t be used later - both colon and eq. We can eliminate both of these with the #[prefix] attribute:

// A field with default values, parsing something like:
//
//   foo: Bar = Bar::new()
#[derive(Parse)]
struct Field {
    ident: Ident,
    #[prefix(Token![:])]
    ty: Type,
    #[prefix(Option<Token![=]> as eq)]
    #[parse_if(eq.is_some())]
    expr: Option<Expr>,
}

We can use "as <Ident>" to give a temporary name to the value - including it as a parsed value that can be referenced in other parsing clauses, but without adding it as a struct field.

There’s also a #[postfix] attribute, which operates very similarly to #[prefix], but exists to allow unused fields at the end of the struct. In general, #[postfix] tends to be pretty tricky to read, so it’s generally preferable to use #[prefix] to keep the field ordering the same as the parse order.

In some cases, we might want to have both a field and its prefix parsed inside some other token tree. Like the following contrived example:

use derive_syn_parse::Parse;
use syn::*;

// Parses.... something. Who knows if this is useful... :P
//
//   (=> x + 2)
#[derive(Parse)]
struct Funky {
    #[paren]
    paren: token::Paren,
    #[inside(paren)]
    r_arrow: Token![=>],
    #[inside(paren)]
    expr: Expr,
}

To remove the unused r_arrow field here, there is another piece we can add: "in <Ident>".

#[derive(Parse)]
struct Funky {
    #[paren]
    paren: token::Paren,
    #[prefix(Token![=>] in paren)]
    #[inside(paren)]
    expr: Expr,
}

Note that attempting to write the #[inside] before #[prefix] is forbidden; it’s less clear what the expected behavior there should be.

Finally, when combining both "as <Ident>" and "in <Ident>", they should come in that order - e.g. #[prefix(Foo as bar in baz)].
