unsynn (from German 'Unsinn', meaning nonsense) is a minimalist Rust parser library. It stays minimal by leaving out the actual grammar implementations, which live in distinct crates. Still, it comes with batteries included: there are parsers, combinators and transformers to solve most parsing tasks.
In exchange it offers simple composable parsers and declarative parser construction. Grammars will be implemented in their own crates (see unsynn-rust).
Its primary intended use is writing proc macros for Rust that define their own grammar or need only sparse Rust parsers.
Other uses include building parsers for grammars outside a Rust/proc-macro context. Unsynn can
parse any &str data (the tokenizer step relies on proc_macro2).
Examples
Creating and Parsing Custom Types
The unsynn!{} macro generates the [Parser] and [ToTokens] implementations for your types.
Note that unsynn implements [Parser] and [ToTokens] for many standard Rust types; this
example uses u32.
# use unsynn::*;
let mut token_iter = "foo ( 1, 2, 3 )".to_token_iter();
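unsynn! {
    struct IdentThenParenthesisedNumbers {
        ident: Ident,
        numbers: ParenthesisGroupContaining<CommaDelimitedVec<u32>>,
    }
}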
// iter.parse() is from the IParse trait
let ast: IdentThenParenthesisedNumbers = token_iter.parse().unwrap();
assert_tokens_eq!(ast, "foo ( 1, 2, 3 )");
If the automatically generated [Parser] and [ToTokens] implementations are not
sufficient, the macro supports custom parsing and token emission through parse_with and
to_tokens clauses:
- from Type: Parse from a different type before transformation (requires parse_with)
- parse_with: Transform or validate values during parsing (used alone for validation, with from for transformation)
- to_tokens: Customize how types are emitted back to tokens (independent)
The parse_with and to_tokens clauses are independent and optional. The from clause must be used together with parse_with. When more control is needed, the implementations can also be written manually.
Custom Parsing and ToTokens
Example of custom parsing and token emission - parse emoji as bools, emit as emoji:
# use unsynn::*;
unsynn!
See the COOKBOOK for more details on parse_with
and to_tokens clauses.
Using Composition
Composition can be used without defining new datatypes. This is useful for simple parsers or
for parsing things on the fly that are deconstructed immediately. See the
combinator module for more composition types.
# use unsynn::*;
// We parse this below
let mut token_iter = "foo ( 1, 2, 3 )".to_token_iter();
// Type::parse() is from the Parse trait
let ast =
    Cons::<Ident, ParenthesisGroupContaining<CommaDelimitedVec<u32>>>::parse(&mut token_iter).unwrap();
assert_tokens_eq!(ast, "foo ( 1, 2, 3 )");
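To make the idea of composing parsers from existing types more concrete, here is a minimal, dependency-free sketch. It does not use unsynn and is not how unsynn is implemented: the MiniParse trait, the Word type and the whitespace tokenizer are hypothetical names invented for this illustration. It only shows how a tuple and a Vec can act as sequence and repetition combinators, so that a composite parser needs no new struct.

```rust
// Hypothetical mini-framework for illustration; none of these names are unsynn's API.

/// A parser consumes tokens from the front of a token slice.
trait MiniParse: Sized {
    fn parse<'a>(tokens: &mut &'a [&'a str]) -> Option<Self>;
}

/// A single token made only of alphabetic characters.
#[derive(Debug, PartialEq)]
struct Word(String);

impl MiniParse for Word {
    fn parse<'a>(tokens: &mut &'a [&'a str]) -> Option<Self> {
        let current: &'a [&'a str] = *tokens;
        let (first, rest) = current.split_first()?;
        if !first.is_empty() && first.chars().all(char::is_alphabetic) {
            *tokens = rest; // consume the token only on success
            Some(Word((*first).to_string()))
        } else {
            None
        }
    }
}

impl MiniParse for u32 {
    fn parse<'a>(tokens: &mut &'a [&'a str]) -> Option<Self> {
        let current: &'a [&'a str] = *tokens;
        let (first, rest) = current.split_first()?;
        let number = first.parse::<u32>().ok()?;
        *tokens = rest;
        Some(number)
    }
}

/// Sequence: a tuple parses its parts one after another.
impl<A: MiniParse, B: MiniParse> MiniParse for (A, B) {
    fn parse<'a>(tokens: &mut &'a [&'a str]) -> Option<Self> {
        let mut cursor = *tokens; // work on a copy so a failure consumes nothing
        let a = A::parse(&mut cursor)?;
        let b = B::parse(&mut cursor)?;
        *tokens = cursor;
        Some((a, b))
    }
}

/// Repetition: a Vec parses as many items as it can (possibly zero).
impl<T: MiniParse> MiniParse for Vec<T> {
    fn parse<'a>(tokens: &mut &'a [&'a str]) -> Option<Self> {
        let mut items = Vec::new();
        while let Some(item) = T::parse(tokens) {
            items.push(item);
        }
        Some(items)
    }
}

/// Composes the parsers above on the fly, without declaring a new type.
fn parse_word_then_numbers(input: &str) -> Option<(Word, Vec<u32>)> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let mut cursor: &[&str] = &tokens;
    <(Word, Vec<u32>)>::parse(&mut cursor)
}

fn main() {
    println!("{:?}", parse_word_then_numbers("foo 1 2 3"));
}
```

The composite type `(Word, Vec<u32>)` plays the role that a combinator type plays in the example above: the grammar is spelled out in the type, and the result can be destructured immediately.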
Custom Operators and Keywords
Keywords and operators can be defined within the unsynn!{} macro:
# use unsynn::*;
unsynn!
// Build expression parser with proper precedence
type Expression = ;
type AdditiveExpr = ;
type MultiplicativeExpr = ;
let ast = "CALC 2*3+4*5 ;".to_token_iter
..expect;
Keywords and operators can also be defined using standalone keyword!{} and operator!{} macros.
See the operator names reference for predefined operators.
For more details on building expression parsers with proper precedence and associativity,
see the expressions module documentation.
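As a rough illustration of what "proper precedence" means for an input like 2*3+4*5, the following dependency-free sketch layers an additive level on top of a multiplicative level, each folding its operands left to right. It is a hand-written evaluator, not unsynn's expression types; all function names are hypothetical and it handles only the arithmetic part of the input (no keyword or terminating semicolon).

```rust
// Illustrative only: a two-level, left-associative evaluator written by hand.
// It mirrors the layering idea (additive built on multiplicative), not unsynn's types.

/// Splits input such as "2*3+4*5" into number and single-character operator tokens.
fn tokenize(input: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut number = String::new();
    for ch in input.chars() {
        if ch.is_ascii_digit() {
            number.push(ch);
        } else {
            if !number.is_empty() {
                tokens.push(std::mem::take(&mut number));
            }
            if !ch.is_whitespace() {
                tokens.push(ch.to_string());
            }
        }
    }
    if !number.is_empty() {
        tokens.push(number);
    }
    tokens
}

/// Lowest layer: a single integer literal.
fn literal(tokens: &[String], pos: &mut usize) -> Option<i64> {
    let value = tokens.get(*pos)?.parse::<i64>().ok()?;
    *pos += 1;
    Some(value)
}

/// Middle layer: literals joined by `*` or `/`, folded left to right.
fn multiplicative(tokens: &[String], pos: &mut usize) -> Option<i64> {
    let mut acc = literal(tokens, pos)?;
    while let Some(op) = tokens.get(*pos).map(String::as_str) {
        match op {
            "*" => {
                *pos += 1;
                acc = acc.checked_mul(literal(tokens, pos)?)?;
            }
            "/" => {
                *pos += 1;
                acc = acc.checked_div(literal(tokens, pos)?)?;
            }
            _ => break,
        }
    }
    Some(acc)
}

/// Top layer: multiplicative terms joined by `+` or `-`, folded left to right.
fn additive(tokens: &[String], pos: &mut usize) -> Option<i64> {
    let mut acc = multiplicative(tokens, pos)?;
    while let Some(op) = tokens.get(*pos).map(String::as_str) {
        match op {
            "+" => {
                *pos += 1;
                acc = acc.checked_add(multiplicative(tokens, pos)?)?;
            }
            "-" => {
                *pos += 1;
                acc = acc.checked_sub(multiplicative(tokens, pos)?)?;
            }
            _ => break,
        }
    }
    Some(acc)
}

/// Evaluates a whole expression; returns None on syntax errors or trailing input.
fn evaluate(input: &str) -> Option<i64> {
    let tokens = tokenize(input);
    let mut pos = 0;
    let value = additive(&tokens, &mut pos)?;
    (pos == tokens.len()).then_some(value)
}

fn main() {
    println!("{:?}", evaluate("2*3+4*5"));
}
```

Because the additive layer only ever sees complete multiplicative terms, multiplication binds tighter than addition, and the left-to-right folding in each loop gives left associativity.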
Feature Flags
- proc_macro2:
  Controls whether unsynn uses the proc_macro2 crate or the built-in proc_macro crate for token handling. This is enabled by default. When enabled, unsynn can parse from strings (via &str::to_token_iter()), convert tokens to strings (via tokens_to_string()), and be used in any context (tests, examples, etc.). When disabled, unsynn uses only the built-in proc_macro crate and can only be used from proc-macro crates (with proc-macro = true in Cargo.toml). This creates leaner proc macros without the proc_macro2 dependency.

  APIs disabled without proc_macro2:
  - String parsing: [ToTokens] for &str/String (parsing strings into tokens)
  - Format macros: format_ident!(), format_literal!(), format_literal_string!()
  - Transform types: IntoIdent<T> (requires string parsing for validation)
  - Test helper: assert_tokens_eq!() (requires string parsing)
  - String-based constructors: Cached::new(), Cached::from_string() (require string parsing)

  APIs that remain available:
  - Token to string conversion: tokens_to_string(), to_token_iter(), into_token_iter()
  - All parsing functionality (works with [TokenStream] from proc macro input)
  - All [ToTokens] implementations (except for &str/String)
  - Transform types: IntoLiteralString<T> (uses Literal::string() constructor)
  - Type: Cached<T> (but not the string-based constructors)
- hash_keywords:
  Enables hash tables for larger keyword groups. This is enabled by default since it guarantees fast lookup in all use cases and the extra dependency it introduces is very small. Nevertheless, this feature can be disabled when keyword grouping is rarely or never used, to remove the dependency on rust_hash. Keyword lookups then fall back to a binary search implementation. Note that the implementation already optimizes the cases where only one or only a few keywords are in a group.
- criterion:
  Enables the criterion benchmarking framework for performance benchmarks. This is disabled by default to keep the dependency tree light. Use cargo bench --features criterion to run the criterion benchmarks. Without this feature, only non-criterion benchmarks will run.
- docgen:
  The unsynn!{}, keyword!{} and operator!{} macros will automatically generate some additional docs. This is enabled by default.
- nonparsable:
  Enables the implementation of [Parser] and [ToTokens] for the NonParseable type. When not set, any use of it results in a compile error. One may disable this for release builds to ensure that no NonParseable is left in the code, thus checking for completeness (NonParseable is used for marking unimplemented types) and avoiding potential panics at runtime. This is enabled by default; consider disabling it in release builds.
- debug_grammar:
  Enables the StderrLog<T, N> debug type that prints type information and token sequences to stderr during parsing. This is useful for debugging complex grammars and understanding parser behavior. When disabled (the default), StderrLog becomes a zero-cost no-op. Enable it during development with cargo test --features debug_grammar or cargo build --features debug_grammar. See the COOKBOOK for usage examples.
- trait_methods_track_caller:
  Adds #[track_caller] to [Parse], [Parser], [IParse] and [ToTokens] trait methods. The idea is to make unsynn more transparent in case of a panic and point closer to the user's code that caused the problem. This has a negligible performance impact and is an experimental feature. If it has bad side effects, please report them. This is enabled by default.
- extra_asserts:
  Enables expensive runtime sanity checks for unsynn internals. Enabled while developing unsynn. This adds diagnostics to data structures and makes unsynn slower and bigger. It should be disabled when unsynn is used by another crate. Currently enabled by default, which may (and eventually will) change for stable releases.
- extra_tests:
  Enables expensive tests that check semantics that should be taken for granted; this makes the test suite slower. Even without these tests enabled we aim for full (cargo-mutants) test coverage with extra_asserts enabled. This is disabled by default. Many of these tests are kept from development to assert correct semantics but are covered elsewhere.
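The trade-off described for hash_keywords (hash table lookup versus a binary search fallback) can be pictured with a small standard-library sketch. This is an assumption-laden illustration, not unsynn's implementation: the function names are hypothetical, and std's HashSet merely stands in for whatever hash table the feature actually pulls in.

```rust
use std::collections::HashSet;

/// Dependency-free fallback: binary search over a sorted keyword table.
/// The caller must pass a sorted slice, otherwise the result is meaningless.
fn is_keyword_sorted(sorted_keywords: &[&str], candidate: &str) -> bool {
    sorted_keywords.binary_search(&candidate).is_ok()
}

/// Hash-based lookup; std's HashSet is a stand-in for a faster hasher.
fn is_keyword_hashed(keywords: &HashSet<&str>, candidate: &str) -> bool {
    keywords.contains(candidate)
}

fn main() {
    // Hypothetical keyword group; sort once up front so binary search is valid.
    let mut table = vec!["while", "fn", "let", "match", "struct"];
    table.sort_unstable();
    let set: HashSet<&str> = table.iter().copied().collect();

    for word in ["let", "loop"] {
        println!(
            "{word}: sorted={} hashed={}",
            is_keyword_sorted(&table, word),
            is_keyword_hashed(&set, word)
        );
    }
}
```

Both lookups give the same answers; the binary search costs a logarithmic number of string comparisons per lookup and needs no extra crate, while the hash table trades a dependency for near-constant lookup time on larger groups.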