granit-parser
“YAML is hard. Much more than I had anticipated. If you are exploring dark corners of YAML ... I'm curious to know what it is.”
— Ethiraric
granit-parser is a strictly compliant YAML 1.1 and 1.2 parser in pure Rust, with no-std support and spans for parser events.
This crate started as a fork of saphyr-parser, which descends from yaml-rust, with influences from libyaml and yaml-cpp. It has since diverged significantly and is now maintained as an independent project.
Its primary goals are:
- full compliance with the yaml-test-suite, including correctness in edge cases
- compatibility with real-world YAML usage
- quick incorporation of the changes we need for the upstream dependency serde-saphyr
granit-parser’s public API is very similar to that of saphyr-parser, so it is typically an easy replacement. However, some changes are still breaking (crate rename, MSRV bump, lifetimes on events, Cow payloads, etc.).
See releases
Minimal example
Parser::new_from_str returns an iterator of (Event, Span) pairs. If you only care about parser events, you can ignore the span and match on the emitted Event values:
use ;
This prints an event stream like:
StreamStart
DocumentStart(false)
MappingStart(0, None)
Scalar("items", Plain, 0, None)
sequence tag: !shopping
SequenceStart(0, Some(Tag { handle: "!", suffix: "shopping" }))
Scalar("milk", Plain, 0, None)
scalar tag: tag:yaml.org,2002:str for "bread"
Scalar("bread", Plain, 0, Some(Tag { handle: "tag:yaml.org,2002:", suffix: "str" }))
SequenceEnd
Scalar("locations", Plain, 0, None)
MappingStart(0, None)
SequenceStart(0, None)
Scalar("47.3769", Plain, 0, None)
Scalar("8.5417", Plain, 0, None)
SequenceEnd
Scalar("local", Plain, 0, None)
SequenceStart(0, None)
Scalar("40.7128", Plain, 0, None)
Scalar("-74.0060", Plain, 0, None)
SequenceEnd
Scalar("remote", Plain, 0, None)
MappingEnd
Scalar("music", Plain, 0, None)
Scalar("𝄞🎵🎶", DoubleQuoted, 0, None)
MappingEnd
DocumentEnd
StreamEnd
Key differences from saphyr-parser
All changes are intentionally scoped around correctness, compliance, and interoperability.
YAML compliance fixes
- Invalid extra closing brackets are rejected
    ]
- Comments no longer truncate multiline scalars
    word1 # comment
    word2
  This is now correctly treated as invalid YAML instead of silently discarding content.
- Reserved directives are ignored
  Previously reported as errors; now handled according to the YAML specification.
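As a rough illustration of the reserved-directive rule, the sketch below classifies a directive line by name and treats anything other than YAML and TAG as reserved, to be skipped rather than rejected. The Directive enum and the classify function are hypothetical names invented for this example; they are not part of the granit-parser API and say nothing about how the scanner is implemented.

```rust
/// What a parser might do with one `%`-prefixed directive line.
/// Hypothetical types for illustration; not the granit-parser API.
#[derive(Debug, PartialEq)]
enum Directive<'a> {
    Yaml(&'a str),
    Tag(&'a str),
    /// Any other name: reserved for future use, skipped rather than rejected.
    Reserved(&'a str),
}

/// Split a directive line into its name and parameters and classify it.
/// Returns `None` when the line does not start with `%`.
fn classify(line: &str) -> Option<Directive<'_>> {
    let rest = line.strip_prefix('%')?;
    let (name, params) = match rest.split_once(char::is_whitespace) {
        Some((name, params)) => (name, params.trim()),
        None => (rest, ""),
    };
    Some(match name {
        "YAML" => Directive::Yaml(params),
        "TAG" => Directive::Tag(params),
        // Unknown directive names are kept only so a caller could warn.
        _ => Directive::Reserved(name),
    })
}

fn main() {
    for line in ["%YAML 1.2", "%TAG !e! tag:example.com,2000:", "%FOO bar baz"] {
        println!("{line:?} -> {:?}", classify(line));
    }
}
```

A caller following this sketch would act on the Yaml and Tag variants and continue past Reserved, optionally emitting a warning.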
Compatibility adjustment
- Relaxed indentation for closing brackets
  While not strictly YAML-compliant, a closing bracket with relaxed indentation is accepted for compatibility with other parsers and real-world inputs.
JSON-style Unicode surrogate pairs
This parser supports explicit handling for JSON-style Unicode surrogate pairs in quoted scalar escape sequences.
- \uXXXX escapes that encode a high surrogate are now required to be followed immediately by a valid low surrogate escape, and both escapes are combined into the corresponding Unicode scalar value.
- Unpaired high surrogates, unpaired low surrogates, and reversed surrogate pairs are now rejected during scanning instead of being treated as generic invalid Unicode escape codes.
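The pairing rule can be sketched with the standard UTF-16 arithmetic. The combine_surrogates function below is a hypothetical helper written for this README, not the crate's scanner code; it takes the numeric values of two consecutive \uXXXX escapes and either combines them or reports why they cannot be combined.

```rust
/// Combine a UTF-16 surrogate pair, as written by two consecutive `\uXXXX`
/// escapes, into one Unicode scalar value.
/// Hypothetical helper for illustration; not part of the granit-parser API.
fn combine_surrogates(first: u16, second: u16) -> Result<char, &'static str> {
    let is_high = |u: u16| (0xD800..=0xDBFF).contains(&u);
    let is_low = |u: u16| (0xDC00..=0xDFFF).contains(&u);

    if is_low(first) {
        // Covers both a lone low surrogate and a reversed pair.
        return Err("low surrogate without a preceding high surrogate");
    }
    if !is_high(first) {
        return Err("first escape is not a surrogate");
    }
    if !is_low(second) {
        return Err("high surrogate not followed by a low surrogate");
    }
    // Standard UTF-16 decoding: 10 bits from each half, offset by 0x10000.
    let code = 0x10000 + (((first as u32) - 0xD800) << 10) + ((second as u32) - 0xDC00);
    char::from_u32(code).ok_or("invalid code point")
}

fn main() {
    // "\uD834\uDD1E" is the JSON-style spelling of U+1D11E (musical G clef).
    assert_eq!(combine_surrogates(0xD834, 0xDD1E), Ok('\u{1D11E}'));
    // Reversed pair and unpaired high surrogate are both errors.
    assert!(combine_surrogates(0xDD1E, 0xD834).is_err());
    assert!(combine_surrogates(0xD834, 0x0041).is_err());
    println!("{:?}", combine_surrogates(0xD834, 0xDD1E));
}
```

U+1D11E is the first character of the music scalar in the minimal example's event stream.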
Parsing correctness improvements
- Plain scalar continuation fixed
  Supports cases like:
    hello:
      world: this is a string
        --- still a string
- More helpful error reporting
  Mismatched brackets and quotes now report the position of the opening token instead of the end of file.
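To show why the opening position is the more useful one to report, here is a minimal bracket checker that remembers where each opening bracket was seen. It is a sketch under simplifying assumptions (no quoting, no comments, byte offsets instead of spans), and check_brackets is a hypothetical name, not a granit-parser function.

```rust
/// Check flow brackets and, on failure, return the byte offset to report.
/// Hypothetical sketch; quoting, comments and real YAML rules are ignored.
fn check_brackets(input: &str) -> Result<(), (usize, &'static str)> {
    // Each entry remembers where an opening bracket was seen.
    let mut open: Vec<(usize, char)> = Vec::new();
    for (pos, c) in input.char_indices() {
        match c {
            '[' | '{' => open.push((pos, c)),
            ']' | '}' => {
                let expected = if c == ']' { '[' } else { '{' };
                match open.pop() {
                    Some((_, opener)) if opener == expected => {}
                    // Wrong kind of closer: point at the opening token.
                    Some((start, _)) => return Err((start, "mismatched bracket")),
                    // Extra closer: nothing was open, so point at the closer.
                    None => return Err((pos, "unexpected closing bracket")),
                }
            }
            _ => {}
        }
    }
    match open.pop() {
        // Input ended early: report the innermost unclosed opener, not EOF.
        Some((start, _)) => Err((start, "unclosed bracket")),
        None => Ok(()),
    }
}

fn main() {
    assert_eq!(check_brackets("a: [1, {b: 2}]"), Ok(()));
    // The error points at offset 3 (the `[`), not at the end of input.
    assert_eq!(check_brackets("a: [1, 2"), Err((3, "unclosed bracket")));
    assert_eq!(check_brackets("[a, b] ]"), Err((7, "unexpected closing bracket")));
}
```

In a long document, an offset at the unclosed opener leads straight to the mistake, whereas an end-of-file position only says that something, somewhere, was never closed.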
Performance improvements
- Zero-copy parsing for &str input
  Uses Cow<'input, str> to avoid unnecessary allocations when parsing from in-memory strings.
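The general pattern behind Cow payloads can be sketched as follows: borrow from the input when the text needs no rewriting, and allocate only when an escape has to be resolved. The unescape function is a hypothetical, deliberately tiny stand-in that handles just two escapes; it is not the crate's scanner.

```rust
use std::borrow::Cow;

/// Resolve the body of a double-quoted scalar, handling only `\n` and `\\`.
/// Hypothetical helper for illustration; not granit-parser's scanner.
fn unescape<'input>(raw: &'input str) -> Cow<'input, str> {
    if !raw.contains('\\') {
        // Nothing to rewrite: hand back a slice of the input, no allocation.
        return Cow::Borrowed(raw);
    }
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('\\') => out.push('\\'),
            // Unknown or dangling escapes are kept verbatim in this sketch.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            None => out.push('\\'),
        }
    }
    Cow::Owned(out)
}

fn main() {
    let plain = unescape("milk");
    let escaped = unescape("a\\nb");
    // The plain scalar borrows from the input; the escaped one had to allocate.
    assert!(matches!(plain, Cow::Borrowed("milk")));
    assert!(matches!(escaped, Cow::Owned(_)));
    assert_eq!(escaped, "a\nb");
    println!("{plain:?} {escaped:?}");
}
```

Callers that need an owned String can still call into_owned() on the result, paying for the allocation only where they actually need it.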
Internal extensions
- Parser stack support
  Enables features such as !include by exposing additional internal capabilities.
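The general idea of stacking sources for an include feature can be sketched with plain iterators: the consumer always reads from the top of a stack, and an include pushes a new source that is drained before the outer one resumes. Everything below (the Stacked type, the string items standing in for events) is hypothetical and uses no granit-parser types; how the crate exposes its own parser stack is not shown here.

```rust
/// Pull items from a stack of sources; the top of the stack is read first.
/// Hypothetical sketch of the idea only; it does not use granit-parser types.
struct Stacked<'a> {
    stack: Vec<std::vec::IntoIter<&'a str>>,
}

impl<'a> Stacked<'a> {
    /// Start reading from `source` until it is exhausted.
    fn push(&mut self, source: Vec<&'a str>) {
        self.stack.push(source.into_iter());
    }
}

impl<'a> Iterator for Stacked<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        loop {
            // No sources left means the whole stream is finished.
            let top = self.stack.last_mut()?;
            match top.next() {
                Some(item) => return Some(item),
                // The included source is done: resume the one below it.
                None => {
                    self.stack.pop();
                }
            }
        }
    }
}

fn main() {
    let mut events = Stacked { stack: Vec::new() };
    events.push(vec!["a", "!include", "c"]);
    let mut seen = Vec::new();
    while let Some(item) = events.next() {
        if item == "!include" {
            // Stand-in for opening the included document.
            events.push(vec!["b1", "b2"]);
        } else {
            seen.push(item);
        }
    }
    assert_eq!(seen, ["a", "b1", "b2", "c"]);
}
```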
Security
This crate includes fixes to improve resilience against:
- denial-of-service inputs
- parser hangs
- panic conditions
Like the upstream parser, it does not interpret application-level types, so parsing YAML does not trigger external side effects.
Tools
The repository includes a few developer tools for inspecting parser output and measuring performance.
Root package binaries:
- dump_events prints the parser event stream for a YAML file.
- time_parser measures one full parse and discards the events.
- run_parser runs repeated parses and reports aggregate timings.
Standalone helper crates:
- walk opens a small REPL for navigating parsed YAML spans.
- bench_compare compares benchmark output from multiple parsers.
- gen_large_yaml generates large YAML inputs for benchmark work.
See tools/README.md and tools/bench_compare/README.md for the longer tool notes.
License
Licensed under either of the following, at your option:
- Apache License, Version 2.0
- MIT license
This project inherits licensing terms from its upstream origins. See the LICENSE file and .licenses/ directory for details.