Crate wgsl_parser


A hand-rolled, zero-copy recursive-descent parser for the WebGPU Shading Language (WGSL), written with Gramatika.

§Parsing a source file

use wgsl_parser::{Parse, ParseResult, ParseStream, ParseStreamer, SyntaxTree};

// const INPUT: &str = include_str!("path/to/some/shader.wgsl");

let mut parser = ParseStream::from(INPUT);
let tree = parser.parse::<SyntaxTree>();
assert!(tree.is_ok());

let ParseResult {
    source,
    tokens,
    comments,
    errors,
} = parser.into_inner();

assert_eq!(source.as_str(), INPUT);

§Tokenizing a source file without doing a full parse

use wgsl_parser::{gramatika::Lexer as _, Lexer};

// const INPUT: &str = include_str!("path/to/some/shader.wgsl");

let mut lexer = Lexer::new(INPUT.into());
let _tokens = lexer.scan();
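
As a rough illustration of what "zero-copy" tokenizing can mean, here is a minimal, self-contained sketch in which every token is a slice borrowed from the input rather than a freshly allocated string. The `scan` function below is a hypothetical stand-in that uses only the standard library; it is not the crate's `Lexer`, which may share the source text by a different mechanism.

```rust
/// Hypothetical toy scanner (not the crate's `Lexer`): splits `source` into
/// word-like tokens and single-character punctuation tokens. Every returned
/// token is a `&str` slice of `source`, so no token text is copied.
fn scan(source: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    // Byte offset where the current word started, if we are inside one.
    let mut word_start: Option<usize> = None;

    for (index, ch) in source.char_indices() {
        if ch.is_alphanumeric() || ch == '_' {
            if word_start.is_none() {
                word_start = Some(index);
            }
        } else {
            // Any other character ends the current word, if there is one.
            if let Some(start) = word_start.take() {
                tokens.push(&source[start..index]);
            }
            // Whitespace is skipped; everything else is its own token.
            if !ch.is_whitespace() {
                tokens.push(&source[index..index + ch.len_utf8()]);
            }
        }
    }

    // Flush a word that runs to the end of the input.
    if let Some(start) = word_start {
        tokens.push(&source[start..]);
    }
    tokens
}
```

Because the tokens borrow from `source`, the borrow checker ensures they cannot outlive the input string.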

§Syntax tree representation

A SyntaxTree contains a vector of Decls representing the top-level syntax types defined by the WGSL grammar, e.g.:

  • Decl::Var(VarDecl { .. })

    @group(1) @binding(2)
    var<uniform> uniforms: Uniforms;
    
  • Decl::Const(VarDecl { .. })

    const FOO: u32 = 1u;
    
  • Decl::Struct(StructDecl { .. })

    struct Foo {
        foo: mat3x4<f32>,
        bar: vec2<u32>,
        baz: array<mat4x4<f32>, 256u>,
    }
    
  • Decl::Function(FunctionDecl { .. })

    fn sum(a: f32, b: f32) -> f32 {
        return a + b;
    }
    

The structures wrapped by those declarations can contain sub-declarations. For example, the body of a FunctionDecl contains a vector of Stmts.

Stmt is an enum in a form similar to Decl, with variants indicating the kind of statement it represents, each wrapping an inner structure that describes the syntax in further detail, often recursively, e.g.:

Stmt::If(IfStmt {
  ..
  else_branch: Some(ElseStmt {
    ..
    body: Arc(Stmt::Block(BlockStmt {
      ..
      stmts: Arc<[Stmt]>,
    })),
  }),
})

Finally, Expr is the “lowest” type of syntax node in the tree, taking the same general form as Decl and Stmt above.
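
To make the Decl → Stmt → Expr layering concrete, here is a minimal, self-contained sketch built on simplified stand-in types. The enums below are hypothetical and far smaller than the crate's actual `Decl`, `Stmt`, and `Expr` definitions (which carry tokens, attributes, and many more variants); only the recursive shape is meant to carry over.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-ins for the crate's node types.
#[allow(dead_code)]
#[derive(Debug)]
enum Expr {
    Literal(i64),
    Ident(String),
    Binary(Box<Expr>, char, Box<Expr>),
}

#[allow(dead_code)]
#[derive(Debug)]
enum Stmt {
    Return(Option<Expr>),
    Block(Arc<[Stmt]>),
    If {
        condition: Expr,
        then_branch: Arc<[Stmt]>,
        else_branch: Option<Arc<Stmt>>,
    },
}

#[allow(dead_code)]
#[derive(Debug)]
enum Decl {
    Const { name: String, value: Expr },
    Function { name: String, body: Arc<[Stmt]> },
}

/// Counts every statement reachable from `stmts`, including nested ones.
fn count_stmts(stmts: &[Stmt]) -> usize {
    stmts
        .iter()
        .map(|stmt| {
            let nested = match stmt {
                Stmt::Return(_) => 0,
                Stmt::Block(inner) => count_stmts(inner),
                Stmt::If { then_branch, else_branch, .. } => {
                    count_stmts(then_branch)
                        + else_branch
                            .as_deref()
                            .map_or(0, |s| count_stmts(std::slice::from_ref(s)))
                }
            };
            1 + nested
        })
        .sum()
}

/// Builds a tree for: fn pick() { if flag { return 1; } else { return x + 2; } }
fn sample_function() -> Decl {
    let then_branch: Arc<[Stmt]> = Arc::from(vec![Stmt::Return(Some(Expr::Literal(1)))]);
    let else_body: Arc<[Stmt]> = Arc::from(vec![Stmt::Return(Some(Expr::Binary(
        Box::new(Expr::Ident("x".to_string())),
        '+',
        Box::new(Expr::Literal(2)),
    )))]);
    let if_stmt = Stmt::If {
        condition: Expr::Ident("flag".to_string()),
        then_branch,
        else_branch: Some(Arc::new(Stmt::Block(else_body))),
    };
    Decl::Function { name: "pick".to_string(), body: Arc::from(vec![if_stmt]) }
}
```

Matching on the enum variant at each level and recursing into the wrapped structure is the same pattern you would use against the real tree, with the crate's own field names substituted in.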

§Inspecting a syntax tree

Each node of the syntax tree has a bespoke Debug implementation, which prints the tree in a format that is a cross between Lisp-style S-expressions (a notation commonly used for representing syntax trees) and Rust syntax.

That format looks like this:

max(4, 2) // The expression represented by the tree below
(Expr::Primary (PrimaryExpr
  expr: (Expr::FnCall (FnCallExpr
    ident: (IdentExpr::Leaf `max` (Function (1:1...1:4))),
    arguments: (ArgumentList
      brace_open: `(` (Brace (1:4...1:5)),
      arguments: [
        (Expr::Primary (PrimaryExpr
          expr: (Expr::Literal `4` (IntLiteral (1:5...1:6))),
        )),
        (Expr::Primary (PrimaryExpr
          expr: (Expr::Literal `2` (IntLiteral (1:8...1:9))),
        )),
      ],
      brace_close: `)` (Brace (1:9...1:10)),
    ),
  )),
))
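
For readers curious how output in this style can be produced, the following sketch hand-writes a `std::fmt::Debug` implementation for a tiny, hypothetical `Expr` type. It is a simplified approximation: it prints on a single line and omits the token kinds and source spans shown above, and it is not the crate's actual implementation.

```rust
use std::fmt;

// Hypothetical miniature expression type, used only for this illustration.
enum Expr {
    Literal(&'static str),
    FnCall {
        ident: &'static str,
        arguments: Vec<Expr>,
    },
}

impl fmt::Debug for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Leaf nodes print their variant name and the lexeme in backticks.
            Expr::Literal(lexeme) => write!(f, "(Expr::Literal `{lexeme}`)"),
            // Branch nodes print themselves, then recurse into each child.
            Expr::FnCall { ident, arguments } => {
                write!(f, "(Expr::FnCall `{ident}`")?;
                for argument in arguments {
                    write!(f, " {argument:?}")?;
                }
                write!(f, ")")
            }
        }
    }
}

/// Formats the expression `max(4, 2)` with the hand-written `Debug` impl.
fn debug_string() -> String {
    let expr = Expr::FnCall {
        ident: "max",
        arguments: vec![Expr::Literal("4"), Expr::Literal("2")],
    };
    format!("{expr:?}")
}
```

Calling `debug_string()` yields ``(Expr::FnCall `max` (Expr::Literal `4`) (Expr::Literal `2`))``.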

§Traversing a syntax tree

The crate exports a Visitor trait that can be implemented to traverse the tree efficiently. Visitor defines a visit_* method for each type of syntax node in the tree. A visit_* method for a node that has children must return either FlowControl::Continue to traverse those children or FlowControl::Break to stop traversing the current branch.

The default Visitor implementation returns FlowControl::Continue for every node, so you only need to actually implement the visit_* methods that your particular use case calls for:

use std::collections::HashMap;

use wgsl_parser::{
    decl::VarDecl,
    expr::IdentExpr,
    gramatika::{ParseStreamer, Substr, Token as _},
    traversal::{FlowControl, Visitor, Walk},
    ParseStream, SyntaxTree,
};

// Note: Not actually a robust implementation of a reference-counter,
//       but good enough for this toy example
#[derive(Default)]
struct ReferenceCounter {
    counts: HashMap<Substr, usize>,
}

impl Visitor for ReferenceCounter {
    fn visit_var_decl(&mut self, decl: &VarDecl) -> FlowControl {
        // Create an entry in the map for the identifier being declared
        self.counts.insert(decl.name.lexeme(), 0);

        // The expression being assigned to the new variable could include
        // references to other variables, so we'll call `expr.walk(self)` to
        // make sure our visitor sees those identifiers as well.
        if let Some(ref expr) = decl.assignment {
            expr.walk(self);
        }

        // We could have returned `FlowControl::Continue` _instead_ of
        // explicitly stepping into the assignment expression above, but
        // since we don't really care about any other child nodes of the
        // `VarDecl`, this lets us skip some extra work.
        FlowControl::Break
    }

    fn visit_ident_expr(&mut self, expr: &IdentExpr) {
        // Find the count in our map for this identifier and increment it
        if let IdentExpr::Leaf(name) = expr {
            if let Some(count) = self.counts.get_mut(&name.lexeme()) {
                *count += 1;
            }
        }
    }
}

let input = r#"
fn main() {
    var a: i32 = 4;
    let b = a;
    let c = 2;

    do_something(a, c);
}
"#;

let tree = ParseStream::from(input).parse::<SyntaxTree>()?;
let mut ref_counter = ReferenceCounter::default();
tree.walk(&mut ref_counter);

assert_eq!(ref_counter.counts["a"], 2);
assert_eq!(ref_counter.counts["b"], 0);
assert_eq!(ref_counter.counts["c"], 1);
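
The interplay between `FlowControl::Continue`, `FlowControl::Break`, and the default method implementations can be seen in isolation in the sketch below. All types here (`Node`, `LeafCollector`, and the local `FlowControl`, `Visitor`, and `Walk`) are hypothetical stand-ins written against the standard library only; the crate's real traits have more methods and may use different signatures.

```rust
// Hypothetical stand-ins mirroring the shape of the crate's traversal API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FlowControl {
    Continue,
    Break,
}

enum Node {
    Leaf(&'static str),
    Group(&'static str, Vec<Node>),
}

trait Visitor {
    // Default: descend into every group, so implementors override only
    // the methods they care about.
    fn visit_group(&mut self, _name: &str) -> FlowControl {
        FlowControl::Continue
    }
    // Leaves have no children, so there is nothing to return.
    fn visit_leaf(&mut self, _name: &str) {}
}

trait Walk {
    fn walk(&self, visitor: &mut dyn Visitor);
}

impl Walk for Node {
    fn walk(&self, visitor: &mut dyn Visitor) {
        match self {
            Node::Leaf(name) => visitor.visit_leaf(name),
            Node::Group(name, children) => {
                // Only step into the children if the visitor asks for it.
                if visitor.visit_group(name) == FlowControl::Continue {
                    for child in children {
                        child.walk(visitor);
                    }
                }
            }
        }
    }
}

/// Records leaf names, but prunes the group called `skip`.
struct LeafCollector {
    skip: &'static str,
    seen: Vec<String>,
}

impl Visitor for LeafCollector {
    fn visit_group(&mut self, name: &str) -> FlowControl {
        if name == self.skip {
            FlowControl::Break
        } else {
            FlowControl::Continue
        }
    }

    fn visit_leaf(&mut self, name: &str) {
        self.seen.push(name.to_string());
    }
}

fn sample_tree() -> Node {
    Node::Group(
        "root",
        vec![
            Node::Leaf("a"),
            Node::Group("skipped", vec![Node::Leaf("b")]),
            Node::Group("kept", vec![Node::Leaf("c")]),
        ],
    )
}

fn collect_leaves(tree: &Node, skip: &'static str) -> Vec<String> {
    let mut collector = LeafCollector { skip, seen: Vec::new() };
    tree.walk(&mut collector);
    collector.seen
}
```

With the sample tree, `collect_leaves(&sample_tree(), "skipped")` visits `a` and `c` but never reaches `b`, because returning `Break` from `visit_group` prunes that branch while the walk continues with its siblings.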

Modules§

  • A module for syntax nodes that can appear in many different parts of a program, like attributes and type annotations.
  • A module for the syntax nodes representing top-level WGSL declarations.
  • A module for the syntax nodes representing WGSL expressions.
  • Display and Debug implementations for WGSL syntax nodes.
  • This module is responsible for building a tree of scopes from a SyntaxTree.
  • A module for the syntax nodes representing WGSL statements.
  • Defines the lexical Tokens of the WGSL grammar, and an auto-generated Lexer.
  • Defines the Visitor and Walk traits for traversing a WGSL syntax tree.
  • Miscellaneous helpers for working with WGSL syntax trees.

Traits§

  • A trait to be implemented by any type that can be parsed using the ParseStreamer interface.
  • A user-friendly interface for implementing a hand-written LL(1) or recursive descent parser with backtracking.
