granit-parser
“YAML is hard. Much more than I had anticipated. If you are exploring dark corners of YAML ... I'm curious to know what it is.”
granit-parser is a pure-Rust YAML parser that complies with both YAML 1.1 and YAML 1.2. It offers strict compliance, comment and style support, no-std support, and spans for parser events. “Granit” is the word for granite in many European languages.
This crate started as a fork of saphyr-parser, which descends from yaml-rust, with influences from libyaml and yaml-cpp. It has since diverged significantly and is now maintained as an independent project.
Its primary goals are:
- Comment support and StructureStyle information, for linting, formatting, and analysis
- compliance with the yaml-test-suite, including correctness in edge cases
- compatibility with real-world YAML usage
- quick incorporation of the changes we need for the upstream dependency serde-saphyr
The public API of granit-parser 0.0.1 and 0.0.2 is very similar to that of saphyr-parser, so it is typically an easy replacement. Later versions emit style and comment information, so you need to adjust your code to handle or discard it.
See releases
Minimal example
Parser::new_from_str returns an iterator of (Event, Span) pairs. The event helpers expose common node metadata, and spans provide byte ranges plus source slices:
Comments are emitted as Event::Comment(text, placement). They are presentation metadata for tools such as linters and formatters, not YAML data nodes, so consumers that build YAML values should filter them out. The companion Span for a comment covers the whole source comment, including # and excluding the line break; when parsing from Parser::new_from_str, span.slice(yaml) returns that source comment text.
use Parser;
This prints an event stream like:
StreamStart bytes=Some(0..0) source=Some("")
DocumentStart(false) bytes=Some(1..1) source=Some("")
MappingStart(Block, 0, None) bytes=Some(1..1) source=Some("")
Scalar("items", Plain, 0, None) bytes=Some(1..6) source=Some("items")
node tag: !shopping custom=true
SequenceStart(Block, 0, Some(Tag { handle: "!", suffix: "shopping" })) bytes=Some(20..20) source=Some("")
Scalar("milk", Plain, 0, None) bytes=Some(22..26) source=Some("milk")
scalar tag: tag:yaml.org,2002:str core-str=true for "bread"
Scalar("bread", Plain, 0, Some(Tag { handle: "tag:yaml.org,2002:", suffix: "str" })) bytes=Some(37..42) source=Some("bread")
SequenceEnd bytes=Some(43..43) source=Some("")
Scalar("locations", Plain, 0, None) bytes=Some(43..52) source=Some("locations")
Comment(" Example with composite keys", Right) bytes=Some(54..83) source=Some("# Example with composite keys")
MappingStart(Block, 0, None) bytes=Some(86..86) source=Some("")
SequenceStart(Flow, 0, None) bytes=Some(86..87) source=Some("[")
Scalar("47.3769", Plain, 0, None) bytes=Some(87..94) source=Some("47.3769")
Scalar("8.5417", Plain, 0, None) bytes=Some(96..102) source=Some("8.5417")
SequenceEnd bytes=Some(102..103) source=Some("]")
Scalar("local", Plain, 0, None) bytes=Some(105..110) source=Some("local")
SequenceStart(Flow, 0, None) bytes=Some(113..114) source=Some("[")
Scalar("40.7128", Plain, 0, None) bytes=Some(114..121) source=Some("40.7128")
Scalar("-74.0060", Plain, 0, None) bytes=Some(123..131) source=Some("-74.0060")
SequenceEnd bytes=Some(131..132) source=Some("]")
Scalar("remote", Plain, 0, None) bytes=Some(134..140) source=Some("remote")
Comment(" JSON-style \\uXXXX surrogate pairs:", Above) bytes=Some(142..178) source=Some("# JSON-style \\uXXXX surrogate pairs:")
MappingEnd bytes=Some(179..179) source=Some("")
Scalar("music", Plain, 0, None) bytes=Some(179..184) source=Some("music")
Scalar("𝄞🎵🎶", DoubleQuoted, 0, None) bytes=Some(186..224) source=Some("\"\\uD834\\uDD1E\\uD83C\\uDFB5\\uD83C\\uDFB6\"")
MappingEnd bytes=Some(225..225) source=Some("")
DocumentEnd bytes=Some(225..225) source=Some("")
StreamEnd bytes=Some(225..225) source=Some("")
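The comment handling described above can be sketched without the crate itself. The `Event` and `Placement` types below are simplified stand-ins written for this illustration (they are not the crate's definitions, and only the `Right` and `Above` placements visible in the listing are modelled); spans are reduced to plain byte ranges. The sketch shows two things: a comment's byte range slices back to the source text including `#`, and a consumer that builds values can drop comment events.

```rust
use std::ops::Range;

/// Stand-in for comment placement (hypothetical, for illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Placement {
    Right,
    Above,
}

/// Simplified stand-in event type; NOT the crate's `Event`.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Scalar(String),
    Comment(String, Placement),
}

/// Drops comment events so that only data-carrying events remain.
fn without_comments(events: Vec<(Event, Range<usize>)>) -> Vec<(Event, Range<usize>)> {
    events
        .into_iter()
        .filter(|(event, _)| !matches!(event, Event::Comment(..)))
        .collect()
}

/// Returns the source text covered by a byte range.
fn slice_source<'a>(yaml: &'a str, bytes: &Range<usize>) -> &'a str {
    &yaml[bytes.clone()]
}

fn main() {
    let yaml = "key: value # trailing note\n";
    // Hand-written events with byte ranges, as a parser might report them.
    let events = vec![
        (Event::Scalar("key".into()), 0..3),
        (Event::Scalar("value".into()), 5..10),
        (Event::Comment(" trailing note".into(), Placement::Right), 11..26),
    ];

    // The comment range covers `#` but not the line break.
    assert_eq!(slice_source(yaml, &events[2].1), "# trailing note");

    // A value-building consumer sees only the two scalars.
    let data = without_comments(events);
    assert_eq!(data.len(), 2);
}
```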
Event API choices
Use try_load when a receiver may return a validation or application error and parsing should stop immediately. It accepts TryEventReceiver or TrySpannedEventReceiver and returns TryLoadError to distinguish parser errors from receiver errors.
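The idea of separating parser errors from receiver errors can be illustrated with a small standalone sketch. Everything here is hypothetical: `LoadError`, `try_feed`, and the string-based events are stand-ins invented for this example and do not reflect the actual signatures of try_load, TryEventReceiver, or TryLoadError.

```rust
/// Hypothetical error type that keeps the two failure sources apart.
#[derive(Debug, PartialEq)]
enum LoadError<E> {
    /// The input itself was invalid.
    Parser(String),
    /// The receiver rejected an otherwise valid event.
    Receiver(E),
}

/// Feeds events to `on_event` and stops at the first error of either kind.
/// Each item is `Ok(event)` or `Err(parser error message)`.
fn try_feed<E>(
    events: impl IntoIterator<Item = Result<String, String>>,
    mut on_event: impl FnMut(&str) -> Result<(), E>,
) -> Result<(), LoadError<E>> {
    for item in events {
        let event = match item {
            Ok(event) => event,
            Err(message) => return Err(LoadError::Parser(message)),
        };
        if let Err(err) = on_event(&event) {
            return Err(LoadError::Receiver(err));
        }
    }
    Ok(())
}

fn main() {
    let events = vec![
        Ok("a".to_string()),
        Ok("forbidden".to_string()),
        Ok("b".to_string()),
    ];
    let mut seen = Vec::new();
    let result = try_feed(events, |event| {
        if event == "forbidden" {
            return Err("validation failed");
        }
        seen.push(event.to_string());
        Ok(())
    });
    // The receiver error is reported as such, and "b" is never delivered.
    assert_eq!(result, Err(LoadError::Receiver("validation failed")));
    assert_eq!(seen, vec!["a"]);
}
```

Keeping the receiver's error type generic lets the caller match on the failure source without converting application errors into parser errors.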
Event-only receivers receive comment events as Event::Comment(text, placement). Spanned receivers receive the same event plus the comment span in on_event.
When using resolve or push_include on ParserStack, comment events from included documents are forwarded through the normal event stream. Their spans remain local to the included source, matching the existing span behavior for other included-document events.
Use the iterator API when the caller should pull events and decide when to stop parsing. load is infallible.
Key differences from saphyr-parser
All changes are intentionally scoped around correctness, compliance, and interoperability.
YAML compliance fixes
- Invalid extra closing brackets are rejected
  ]
- Comments no longer truncate multiline scalars
  word1 # comment
  word2
  This is now correctly treated as invalid YAML instead of silently discarding content.
- Reserved directives are ignored
  Previously reported as errors; now handled according to the YAML specification.
Compatibility adjustment
- Relaxed indentation for closing brackets
  key:
  While not strictly YAML-compliant, this form is accepted for compatibility with other parsers and real-world inputs.
JSON-style Unicode surrogate pairs
This parser supports explicit handling for JSON-style Unicode surrogate pairs in quoted scalar escape sequences.
- \uXXXX escapes that encode a high surrogate are now required to be followed immediately by a valid low surrogate escape, and both escapes are combined into the corresponding Unicode scalar value.
- Unpaired high surrogates, unpaired low surrogates, and reversed surrogate pairs are now rejected during scanning instead of being treated as generic invalid Unicode escape codes.
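The pairing rule above follows the standard UTF-16 surrogate arithmetic. The function below is a minimal sketch of that arithmetic, not the crate's scanner code; `combine_surrogates` is a name chosen for this example.

```rust
/// Combines a UTF-16 surrogate pair into one Unicode scalar value.
/// Returns `None` for unpaired or reversed surrogates.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    let is_high = (0xD800..=0xDBFF).contains(&high);
    let is_low = (0xDC00..=0xDFFF).contains(&low);
    if !(is_high && is_low) {
        return None;
    }
    // Each surrogate contributes 10 bits; the result is offset by 0x10000.
    let code = 0x10000 + (((high as u32) - 0xD800) << 10) + ((low as u32) - 0xDC00);
    char::from_u32(code)
}

fn main() {
    // "\uD834\uDD1E" encodes U+1D11E (musical symbol G clef).
    assert_eq!(combine_surrogates(0xD834, 0xDD1E), Some('\u{1D11E}'));
    // Reversed pair: low surrogate first.
    assert_eq!(combine_surrogates(0xDD1E, 0xD834), None);
    // Unpaired high surrogate followed by an ordinary code unit.
    assert_eq!(combine_surrogates(0xD834, 0x0041), None);
}
```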
Parsing correctness improvements
- Plain scalar continuation fixed
Supports cases like:
hello:
world: this is a string
--- still a string
- More helpful error reporting
  Mismatched brackets and quotes now report the position of the opening token instead of the end of file.
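One common way to report the opening token's position is to keep a stack of open brackets together with their offsets. The checker below is a simplified sketch of that technique under strong assumptions: it ignores quotes, comments, and all other YAML syntax, and `check_brackets` is a hypothetical name, not part of the crate.

```rust
/// Checks flow brackets in `src`.
/// On failure, returns the byte offset to report and a short message.
fn check_brackets(src: &str) -> Result<(), (usize, &'static str)> {
    // Stack of (byte offset, opening character) for brackets still open.
    let mut open: Vec<(usize, char)> = Vec::new();
    for (pos, ch) in src.char_indices() {
        match ch {
            '[' | '{' => open.push((pos, ch)),
            ']' | '}' => {
                let expected = if ch == ']' { '[' } else { '{' };
                match open.pop() {
                    Some((_, opener)) if opener == expected => {}
                    // Wrong kind of closer: point at the opener it failed to match.
                    Some((open_pos, _)) => return Err((open_pos, "mismatched bracket")),
                    // Nothing is open: the closer itself is the problem.
                    None => return Err((pos, "extra closing bracket")),
                }
            }
            _ => {}
        }
    }
    // Input ended with brackets still open: report the innermost opener,
    // not the end of the input.
    match open.last() {
        Some(&(open_pos, _)) => Err((open_pos, "unclosed bracket")),
        None => Ok(()),
    }
}

fn main() {
    assert_eq!(check_brackets("{a: [1, 2]}"), Ok(()));
    assert_eq!(check_brackets("[a, b"), Err((0, "unclosed bracket")));
    assert_eq!(check_brackets("[a]]"), Err((3, "extra closing bracket")));
}
```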
Performance improvements
- Zero-copy parsing for &str input
  Uses Cow<'input, str> to avoid unnecessary allocations when parsing from in-memory strings.
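The general Cow pattern looks roughly like this: borrow from the input when the text can be used verbatim, and allocate only when it has to be transformed. The `unescape` helper is hypothetical and handles only a tiny subset of escapes; it illustrates the pattern and is not the crate's scanner.

```rust
use std::borrow::Cow;

/// Decodes backslash escapes in `input`, borrowing when there are none.
/// Only `\n` is decoded specially; any other escaped character is kept as-is.
fn unescape(input: &str) -> Cow<'_, str> {
    if !input.contains('\\') {
        // Nothing to decode: hand back a view into the original input.
        return Cow::Borrowed(input);
    }
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n') => out.push('\n'),
                Some(other) => out.push(other),
                None => out.push('\\'),
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}

fn main() {
    // No escapes: no allocation, the result borrows from the input.
    assert!(matches!(unescape("plain text"), Cow::Borrowed("plain text")));
    // An escape forces an owned, decoded copy.
    let decoded = unescape("a\\nb");
    assert!(matches!(decoded, Cow::Owned(_)));
    assert_eq!(&*decoded, "a\nb");
}
```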
Internal extensions
- Parser stack support
  Enables features such as !include by exposing additional internal capabilities.
Security
This crate includes fixes to improve resilience against:
- denial-of-service inputs
- parser hangs
- panic conditions
Like the upstream parser, it does not interpret application-level types, so parsing YAML does not trigger external side effects.
Improved ergonomics
Release 0.0.3 includes ergonomic helpers such as Event::tag, Event::scalar,
Event::anchor_id, Event::alias_id, Event::is_node, Tag::parts,
Tag::is_custom, Tag::is_yaml_core_schema_tag, Span::slice, and
ParserStack::push_include. See CHANGELOG.md for details.
Tools
The repository includes a few developer tools for inspecting parser output and measuring performance.
Root package binaries:
- dump_events prints the parser event stream for a YAML file.
- time_parser measures one full parse and discards the events.
- run_parser runs repeated parses and reports aggregate timings.
Standalone helper crates:
- walk opens a small REPL for navigating parsed YAML spans.
- bench_compare compares benchmark output from multiple parsers.
- gen_large_yaml generates large YAML inputs for benchmark work.
See tools/README.md and tools/bench_compare/README.md for the longer tool notes.
License
Licensed under either of:
- Apache License, Version 2.0
- MIT license
at your option.
This project inherits licensing terms from its upstream origins. See the LICENSE file and .licenses/ directory for details.