elm-ast
Inspired by syn, elm-ast is a Rust library for parsing and constructing Elm 0.19.1 ASTs.
Overview
elm-ast provides a complete, strongly-typed representation of Elm source code as a Rust AST, along with a parser, printer, and visitor/fold traits for traversal and transformation. It is modeled after Rust's syn crate, with a formatting approach inspired by elm-format.
Tested against 291 real-world .elm files from 50 packages (including elm/core, elm/browser, rtfeldman/elm-css, mdgriffith/elm-ui, dillonkearns/elm-markdown, folkertdev/elm-flate, elm-explorations/test) with 100% parse, round-trip, and printer idempotency rates.
Quick start
Add the crate to your project:
cargo add elm-ast
Or add it to Cargo.toml directly:
[dependencies]
elm-ast = "0.2"
Then:
use elm_ast::{parse, print};
let source = r#"
module Main exposing (..)
add : Int -> Int -> Int
add x y = x + y
"#;
// Parse
let module = parse(source)?;
// Inspect
println!;
// Print back to valid Elm
let output = print(&module);
Features
All features are enabled by default via full. Disable default-features and pick what you need to reduce compile times.
[dependencies]
elm-ast = "0.2"
| Feature | Description |
|---|---|
| full | Enables all features below (default) |
| parsing | parse() and parse_recovering() functions |
| printing | print(), Display impls, Printer struct |
| visit | Visit trait for immutable AST traversal |
| visit-mut | VisitMut trait for in-place AST mutation |
| fold | Fold trait for owned AST transformation |
| serde | Serialize/Deserialize on all AST types |
| wasm | WASM bindings via wasm-bindgen |
Minimal dependency (AST types only)
[dependencies]
elm-ast = { version = "0.2", default-features = false }
AST types
Every Elm 0.19.1 syntax construct has a corresponding Rust type:
| Elm construct | Rust type |
|---|---|
| Module header | ModuleHeader (Normal, Port, Effect) |
| Imports | Import |
| Exposing lists | Exposing, ExposedItem |
| Type annotations | TypeAnnotation (GenericType, Typed, Unit, Tupled, Record, GenericRecord, FunctionType) |
| Patterns | Pattern (Anything, Var, Literal, Tuple, Constructor, Record, Cons, List, As, ...) |
| Expressions | Expr (22 variants: literals, application, operators, if/case/let, lambda, records, lists, ...) |
| Declarations | Declaration (FunctionDeclaration, AliasDeclaration, CustomTypeDeclaration, PortDeclaration, InfixDeclaration) |
| Complete file | ElmModule |
All nodes carry source location information via Spanned<T>.
Parsing
use elm_ast::{parse, parse_recovering};
// Strict: fails on first error
let module = parse(source)?;
// Recovering: returns partial AST + all errors
let (partial, errors) = parse_recovering(source);
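The difference between the two modes can be illustrated with a toy analogue that uses only the standard library. This is a sketch, not elm-ast's implementation: the one-integer-per-line "grammar" and the names `ToyError`, `parse_strict`, and `parse_recovering_toy` are hypothetical stand-ins chosen for illustration.

```rust
/// Error for the toy grammar (hypothetical; not an elm-ast type).
#[derive(Debug, PartialEq)]
pub struct ToyError {
    pub line: usize, // 1-based line number of the offending input
    pub text: String,
}

/// Toy "grammar": every non-empty line must be a single integer.
fn parse_line(line_no: usize, text: &str) -> Result<i64, ToyError> {
    text.trim().parse::<i64>().map_err(|_| ToyError {
        line: line_no,
        text: text.to_string(),
    })
}

/// Strict mode: stop at the first error and discard everything parsed so far.
pub fn parse_strict(src: &str) -> Result<Vec<i64>, ToyError> {
    src.lines()
        .enumerate()
        .filter(|(_, l)| !l.trim().is_empty())
        .map(|(i, l)| parse_line(i + 1, l))
        .collect() // collecting into Result short-circuits on the first Err
}

/// Recovering mode: keep every item that parsed and collect every error.
pub fn parse_recovering_toy(src: &str) -> (Vec<i64>, Vec<ToyError>) {
    let mut items = Vec::new();
    let mut errors = Vec::new();
    for (i, l) in src.lines().enumerate().filter(|(_, l)| !l.trim().is_empty()) {
        match parse_line(i + 1, l) {
            Ok(n) => items.push(n),
            Err(e) => errors.push(e),
        }
    }
    (items, errors)
}
```

On the input `"1\nx\n3\ny\n"`, the strict function reports only the error on line 2, while the recovering one returns the items `[1, 3]` together with both errors. Strict mode suits batch tooling that should reject bad input outright; recovering mode suits editors and linters that want to keep working on partially valid files.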
Printing
Three printer modes are available, trading off between minimal-change output and full elm-format agreement:
| Function | PrintStyle | What it does |
|---|---|---|
| print(&module) | Compact (default) | Round-trip-safe minimal line breaking. Only breaks lines for structurally multi-line sub-expressions (case/if/let/lambda). Guarantees print(parse(print(parse(src)))) == print(parse(src)) on elm-format-compliant input. |
| pretty_print(&module) | ElmFormat | elm-format's style: pipelines always vertical, records and lists with 2+ entries multi-line, if-else always multi-line. Byte-for-byte matches elm-format(source) on real-world packages. |
| pretty_print_converged(&module) | ElmFormatConverged | elm-format's style, but pre-converged to a stable formatting. elm-format is not fully idempotent on every input (see docs/printing.md); this mode emits the form elm-format would settle on after repeated passes, so re-running elm-format over the output is a no-op. |
use elm_ast::{parse, pretty_print, pretty_print_converged, print};
let module = parse(source)?;
let compact = print(&module); // Compact: minimal breaking
let formatted = pretty_print(&module); // ElmFormat: elm-format style
let stable = pretty_print_converged(&module); // ElmFormatConverged: stable under re-formatting
// Or use Display for Compact output:
println!("{}", module);
For custom indent width or reusable printers, construct one explicitly:
use ;
let output = new.print_module;
See docs/printing.md for the full breakdown of each mode, idempotency guarantees, and when to pick which.
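The idempotency and convergence properties described for the printer modes can be checked mechanically. The following is a minimal sketch using only the standard library; `is_idempotent`, `converge`, and the two toy formatters are hypothetical helpers, not part of elm-ast. In real use the closure would wrap a parse-then-print round trip.

```rust
/// True when formatting twice gives the same text as formatting once.
pub fn is_idempotent<F: Fn(&str) -> String>(format: F, src: &str) -> bool {
    let once = format(src);
    let twice = format(&once);
    once == twice
}

/// Re-run `format` until the output stops changing, for at most `max_passes` passes.
/// Returns the fixed point, or None if none was reached in time.
pub fn converge<F: Fn(&str) -> String>(
    format: F,
    src: &str,
    max_passes: usize,
) -> Option<String> {
    let mut current = src.to_string();
    for _ in 0..max_passes {
        let next = format(&current);
        if next == current {
            return Some(current);
        }
        current = next;
    }
    None
}

/// Toy idempotent formatter (hypothetical): trims trailing whitespace per line.
pub fn trim_trailing(src: &str) -> String {
    src.lines().map(str::trim_end).collect::<Vec<_>>().join("\n")
}

/// Toy non-idempotent formatter (hypothetical): removes one leading space per pass.
pub fn strip_one_space(src: &str) -> String {
    src.strip_prefix(' ').unwrap_or(src).to_string()
}
```

`strip_one_space` plays the role of a formatter that needs several passes to settle: `is_idempotent` rejects it, while `converge` finds the form it eventually stabilizes on, which is the idea behind a pre-converged print mode.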
Comment preservation
Top-level comments (line comments -- and block comments {- -} between declarations) are captured during parsing and round-tripped through the printer. Comments are placed immediately before the declaration they precede.
Comments inside let/in blocks and case/of branches are attached to their respective AST nodes and round-trip correctly. Doc comments ({-| -}) are attached to their declarations and always round-trip correctly.
Visitors
use ;
use Expr;
use Spanned;
;
let mut counter = FunctionCallCounter;
counter.visit_module;
println!;
Three traversal traits:
- Visit: immutable traversal (& references)
- VisitMut: in-place mutation (&mut references)
- Fold: owned transformation (takes ownership, returns new tree)
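To make the three traversal styles concrete, here is a self-contained sketch on a toy two-variant tree. The `Toy` type, the `ToyVisit` trait, and the helper functions are hypothetical and far smaller than the crate's real Visit, VisitMut, and Fold traits; only the borrowing pattern of each style is meant to carry over.

```rust
/// Toy expression tree (hypothetical; much smaller than elm-ast's Expr).
#[derive(Debug, Clone, PartialEq)]
pub enum Toy {
    Int(i64),
    Call(String, Vec<Toy>),
}

/// Immutable traversal: the visitor reads the tree through `&` references.
pub trait ToyVisit {
    fn visit(&mut self, e: &Toy) {
        walk(self, e); // default behaviour: just recurse into children
    }
}

/// Visits every child of `e` with `v`.
pub fn walk<V: ToyVisit + ?Sized>(v: &mut V, e: &Toy) {
    if let Toy::Call(_, args) = e {
        for a in args {
            v.visit(a);
        }
    }
}

/// Counts `Call` nodes by overriding `visit` and then continuing the walk.
pub struct CallCounter(pub usize);

impl ToyVisit for CallCounter {
    fn visit(&mut self, e: &Toy) {
        if matches!(e, Toy::Call(..)) {
            self.0 += 1;
        }
        walk(self, e);
    }
}

/// In-place mutation: rename every call to `from` into `to` through `&mut`.
pub fn rename_calls(e: &mut Toy, from: &str, to: &str) {
    if let Toy::Call(name, args) = e {
        if name.as_str() == from {
            *name = to.to_string();
        }
        for a in args {
            rename_calls(a, from, to);
        }
    }
}

/// Owned transformation: consume the tree and return a new one
/// in which every integer is doubled.
pub fn double_ints(e: Toy) -> Toy {
    match e {
        Toy::Int(n) => Toy::Int(n * 2),
        Toy::Call(name, args) => {
            Toy::Call(name, args.into_iter().map(double_ints).collect())
        }
    }
}
```

A read-only analysis such as counting calls only needs the immutable style. Renames fit in-place mutation, and rewrites that change the shape of the tree are usually easiest as an owned fold.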
Builder API
Construct AST nodes programmatically (with dummy spans):
use *;
let m = module;
println!; // prints valid Elm
Serde
With the serde feature, all AST types support JSON serialization:
let module = parse(source)?;
let json = serde_json::to_string_pretty(&module)?;
let module2: ElmModule = serde_json::from_str(&json)?;
Architecture
For the full story, see the ARCHITECTURE.md doc.
The design follows syn's proven patterns:
- Enum-of-structs AST: each variant wraps a dedicated struct with named fields
- Spanned<T>: every node carries a Span (byte offset + line/column)
- Box<T> for recursive sub-expressions
- Feature-gated modules for compile-time control
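As an illustration of the first three patterns, here is a minimal sketch of what such a shape can look like. It is not the crate's actual definition: the field names on `Span` and `Spanned`, and the `MiniExpr`, `LiteralExpr`, and `IfExpr` types, are assumptions made for the example.

```rust
/// Source location: byte offset plus line/column (field names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct Span {
    pub offset: usize,
    pub line: u32,
    pub column: u32,
}

/// Pairs any node with the span it was parsed from.
#[derive(Debug, Clone, PartialEq)]
pub struct Spanned<T> {
    pub span: Span,
    pub value: T,
}

impl<T> Spanned<T> {
    /// Wraps a value with a dummy (all-zero) span, as a builder might.
    pub fn dummy(value: T) -> Self {
        Spanned { span: Span::default(), value }
    }
}

/// Enum-of-structs: each variant wraps a dedicated struct with named fields.
#[derive(Debug, Clone, PartialEq)]
pub enum MiniExpr {
    Literal(LiteralExpr),
    If(IfExpr),
}

#[derive(Debug, Clone, PartialEq)]
pub struct LiteralExpr {
    pub value: i64,
}

/// Recursive children are boxed so the enum has a finite size.
#[derive(Debug, Clone, PartialEq)]
pub struct IfExpr {
    pub condition: Box<Spanned<MiniExpr>>,
    pub then_branch: Box<Spanned<MiniExpr>>,
    pub else_branch: Box<Spanned<MiniExpr>>,
}

/// Nesting depth of an expression; shows how code walks through the boxes.
pub fn depth(e: &Spanned<MiniExpr>) -> usize {
    match &e.value {
        MiniExpr::Literal(_) => 1,
        MiniExpr::If(i) => {
            1 + depth(&i.condition)
                .max(depth(&i.then_branch))
                .max(depth(&i.else_branch))
        }
    }
}
```

Giving each variant its own struct means helper functions can accept, say, an `&IfExpr` directly instead of re-matching on the enum, which is the main benefit syn gets from the same pattern.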
The printer uses an approach inspired by elm-format: eagerly detect whether sub-expressions are multi-line, then switch containers to vertical layout when any child is multi-line.
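The layout rule can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the crate's printer: `render_list` is a hypothetical helper that receives already-rendered children and only handles list literals.

```rust
/// Renders a list literal from already-rendered children.
/// If any child spans multiple lines, the whole list switches to a
/// vertical, leading-comma layout; otherwise it stays on one line.
pub fn render_list(children: &[String]) -> String {
    if children.is_empty() {
        return "[]".to_string();
    }
    // Eager detection: a child is multi-line if it contains a newline.
    let any_multiline = children.iter().any(|c| c.contains('\n'));
    if !any_multiline {
        return format!("[ {} ]", children.join(", "));
    }
    let mut out = String::new();
    for (i, child) in children.iter().enumerate() {
        out.push_str(if i == 0 { "[ " } else { ", " });
        // Indent continuation lines by two columns so they sit under
        // the first line of the child rather than under the delimiter.
        out.push_str(&child.replace('\n', "\n  "));
        out.push('\n');
    }
    out.push(']');
    out
}
```

Because the decision is made from the children's rendered form, a single multi-line entry deep in a list is enough to turn every enclosing container vertical, which matches the behaviour the paragraph above describes.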
Fully iterative expression parser
The expression parser uses zero stack recursion. Traditional recursive-descent parsers can overflow the call stack on deeply nested input. elm-ast eliminates this entirely through three techniques:
- Iterative Pratt parsing: binary operators use an explicit Vec<PendingOp> heap-allocated operator stack instead of recursive descent through precedence levels.
- CPS (continuation-passing style): every compound expression (if/case/let/lambda/paren/tuple/list/record) that would normally call parse_expr recursively instead returns a NeedExpr(continuation) step, where the continuation is a closure capturing the partial parse state.
- Trampoline loop: a top-level loop drives execution. When a compound form needs a sub-expression, its continuation is pushed onto a heap-allocated stack and the loop restarts. When a sub-expression completes, the continuation is popped and invoked.
This guarantees O(1) call-stack depth regardless of expression nesting. The continuation stack is bounded by MAX_EXPR_DEPTH (256) as a resource guard, not a safety requirement.
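The general idea can be shown on a much smaller problem. The sketch below is not elm-ast's parser: it is an explicit-stack operator-precedence evaluator for integers, `+`, `*`, and parentheses, and it uses plain data frames where the real parser uses closures as continuations. All names (`Op`, `Frame`, `eval`) are hypothetical, and it has no depth limit comparable to MAX_EXPR_DEPTH.

```rust
#[derive(Clone, Copy)]
enum Op {
    Add,
    Mul,
}

impl Op {
    fn prec(self) -> u8 {
        match self {
            Op::Add => 1,
            Op::Mul => 2,
        }
    }
    // Overflow checks are omitted in this sketch.
    fn apply(self, l: i64, r: i64) -> i64 {
        match self {
            Op::Add => l + r,
            Op::Mul => l * r,
        }
    }
}

/// State of one (possibly suspended) expression: operands and pending operators.
#[derive(Default)]
struct Frame {
    operands: Vec<i64>,
    ops: Vec<Op>,
}

impl Frame {
    /// Apply pending operators whose precedence is at least `min_prec`.
    fn reduce(&mut self, min_prec: u8) {
        while let Some(&op) = self.ops.last() {
            if op.prec() < min_prec {
                break;
            }
            self.ops.pop();
            let r = self.operands.pop().unwrap();
            let l = self.operands.pop().unwrap();
            self.operands.push(op.apply(l, r));
        }
    }
}

/// Evaluates `src` in a single loop; nesting only grows heap-allocated Vecs.
pub fn eval(src: &str) -> Result<i64, String> {
    let mut suspended: Vec<Frame> = Vec::new(); // stands in for the call stack
    let mut cur = Frame::default();
    let mut expect_operand = true;
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            ' ' => {
                chars.next();
            }
            '0'..='9' if expect_operand => {
                let mut n = 0i64;
                while let Some(d) = chars.peek().and_then(|c| c.to_digit(10)) {
                    n = n * 10 + d as i64;
                    chars.next();
                }
                cur.operands.push(n);
                expect_operand = false;
            }
            '(' if expect_operand => {
                chars.next();
                // Suspend the enclosing expression and start a fresh one.
                suspended.push(std::mem::take(&mut cur));
            }
            ')' if !expect_operand => {
                chars.next();
                cur.reduce(0);
                let value = cur.operands.pop().unwrap();
                // Resume the enclosing expression with the finished value.
                cur = suspended.pop().ok_or("unbalanced ')'")?;
                cur.operands.push(value);
            }
            '+' | '*' if !expect_operand => {
                chars.next();
                let op = if c == '+' { Op::Add } else { Op::Mul };
                cur.reduce(op.prec()); // left-associative
                cur.ops.push(op);
                expect_operand = true;
            }
            _ => return Err(format!("unexpected '{c}'")),
        }
    }
    if expect_operand || !suspended.is_empty() {
        return Err("unexpected end of input".to_string());
    }
    cur.reduce(0);
    Ok(cur.operands.pop().unwrap())
}
```

Since `eval` never calls itself, an input with 100,000 nested parentheses is handled with the same call-stack depth as `1 + 2`. The cost moves to the `suspended` vector on the heap, which is the quantity a limit like MAX_EXPR_DEPTH would bound.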
None of this was actually necessary. A simple depth limit would have sufficed, but it was fun to build, and, most importantly, it is thoroughly tested and works.
Test coverage
381 tests across lexer, parser, printer, visitors, property-based checks, and 291 real-world files from 50 packages. See TEST_COVERAGE.md for the full breakdown.
License
Dual licensed under Apache 2.0 or MIT.