toml-spanner
High-performance, fast-compiling, span-preserving, conformant TOML parsing for Rust.
Originally forked from toml-span to add TOML 1.1.0 support, toml-spanner
has received significant performance improvements and reductions in compile time.
Unlike the original, toml-spanner aims to be a fully compliant TOML v1.1.0 parser, including
full date-time support, with conformance verified by extensive fuzzing against the toml crate
and passing the official TOML decoding test suite.
Example
Suppose you have some TOML document declared in TOML_DOCUMENT as a &str:
= false
= 37
[[]]
= 43
[[]]
= true
= 12
Then you can parse the TOML document into an Item tree, with the following:
use ;
let arena = new;
let mut root = parse.unwrap;
You can traverse the tree and inspect values:
assert_eq!;
match root.value
When the deserialize feature is enabled, toml-spanner provides a set of
helpers and a trait to aid in deserializing Item trees into user-defined types.
use ;
if let Ok = root. else
Please consult the API documentation for more details.
Benchmarks
Measured on AMD Ryzen 9 5950X, 64 GB RAM, Linux 6.18, rustc 1.93.0. Parse time and hardware counters across real-world TOML files (lower is better):
Crate Versions: toml-spanner = 0.4.0, toml = 1.0.3+spec-1.1.0, toml-span = 0.7.0
                  time(μs)  cycles(K)  instr(K)  branch(K)
zed/Cargo.toml
  toml-spanner        24.6        116       439         91
  toml               255.4       1212      3088        608
  toml-span          389.3       1821      5049       1047
extask.toml
  toml-spanner         8.8         41       149         28
  toml                78.8        372      1005        192
  toml-span          105.1        492      1337        264
devsm.toml
  toml-spanner         3.7         17        69         14
  toml                32.6        155       424         81
  toml-span           56.8        266       711        141
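The per-file speedups implied by the table can be derived by dividing each crate's parse time by toml-spanner's. A minimal sketch using the time(μs) column above; the `speedup` helper is hypothetical and not part of any of the benchmarked crates:

```rust
/// Hypothetical helper: how many times faster `candidate_us` is than `baseline_us`.
fn speedup(baseline_us: f64, candidate_us: f64) -> f64 {
    baseline_us / candidate_us
}

fn main() {
    // (file, toml-spanner, toml, toml-span) parse times in microseconds,
    // copied from the benchmark table.
    let rows = [
        ("zed/Cargo.toml", 24.6, 255.4, 389.3),
        ("extask.toml", 8.8, 78.8, 105.1),
        ("devsm.toml", 3.7, 32.6, 56.8),
    ];
    for (file, spanner, toml, toml_span) in rows {
        println!(
            "{file}: {:.1}x vs toml, {:.1}x vs toml-span",
            speedup(toml, spanner),
            speedup(toml_span, spanner),
        );
    }
}
```

These ratios cover parsing only; they say nothing about deserialization cost.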
This runtime benchmark is simple and measures only the parsing step. In practice,
if you also deserialize into your own data types (where toml-spanner has made only marginal
improvements), the total runtime improvement is smaller and depends heavily on the content
and target data types. Switching devsm from toml-span to toml-spanner produced a total 8x reduction
in runtime, measured in the actual application and including both parsing and deserialization.
Compile Time
Extra cargo build --release time for binaries using the respective crates (lower is better):
              median(ms)  added(ms)
null                 108
toml-spanner         739       +631
toml-span           1378      +1270
toml                3060      +2952
toml+serde          5156      +5048
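The added(ms) column is each median minus the null baseline. A small sketch of that arithmetic, assuming the null row is a binary built without any of the TOML crates; the `added_ms` helper is hypothetical:

```rust
/// Median build time (ms) of the "null" row in the table.
/// Assumption: this is a binary with no TOML dependency at all.
const NULL_BASELINE_MS: u32 = 108;

/// Hypothetical helper: build time attributable to the dependency.
fn added_ms(median_ms: u32) -> u32 {
    median_ms.saturating_sub(NULL_BASELINE_MS)
}

fn main() {
    // (crate, median build time in ms), copied from the table.
    let medians = [
        ("toml-spanner", 739),
        ("toml-span", 1378),
        ("toml", 3060),
        ("toml+serde", 5156),
    ];
    for (name, median) in medians {
        println!("{name}: +{} ms", added_ms(median));
    }
}
```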
Check out ./benchmark for more details. The numbers should approximate the additional
time users would experience during source-based installs, such as via cargo install.
Divergence from toml-span
While toml-spanner started as a fork of toml-span, it has since undergone
extensive changes:
- 10x faster than toml-span, and 5-8x faster than toml across real-world workloads.
- Preserved index order: tables retain their insertion order by default, unlike
  toml_span and the default mode of toml.
- Compact Value type (on 64-bit platforms):

  Crate                  Value/Item  TableEntry
  toml-spanner           24 bytes    48 bytes
  toml-span              48 bytes    88 bytes
  toml                   32 bytes    56 bytes
  toml (preserve_order)  80 bytes    104 bytes

  Note that the toml crate Value type doesn't contain any span information and that
  toml-span doesn't support table entry order preservation.
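To illustrate what insertion-order preservation means in practice, here is a std-only sketch contrasting a Vec-backed table with a key-sorted map. `OrderedTable` is a hypothetical type written for this example; it is not toml-spanner's table implementation and makes no claim about how the crate stores entries.

```rust
use std::collections::BTreeMap;

/// Hypothetical insertion-ordered table: entries are kept in a Vec
/// in the order their keys were first inserted.
struct OrderedTable<'a> {
    entries: Vec<(&'a str, i64)>,
}

impl<'a> OrderedTable<'a> {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Inserts a key, or overwrites its value while keeping its original position.
    fn insert(&mut self, key: &'a str, value: i64) {
        if let Some(i) = self.entries.iter().position(|(k, _)| *k == key) {
            self.entries[i].1 = value;
        } else {
            self.entries.push((key, value));
        }
    }

    /// Keys in insertion order.
    fn keys(&self) -> Vec<&'a str> {
        self.entries.iter().map(|(k, _)| *k).collect()
    }
}

fn main() {
    let mut ordered = OrderedTable::new();
    let mut sorted = BTreeMap::new();
    for (key, value) in [("zebra", 1), ("apple", 2), ("mango", 3)] {
        ordered.insert(key, value);
        sorted.insert(key, value);
    }
    // The Vec-backed table reports keys as written; the BTreeMap reports them sorted.
    println!("insertion order: {:?}", ordered.keys());
    println!("sorted order:    {:?}", sorted.keys().collect::<Vec<_>>());
}
```

The linear scan in `insert` keeps the sketch short; it is not meant as a performance recommendation.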
Trade-offs
toml-spanner makes extensive use of unsafe code to achieve its performance
and size goals. This is mitigated by fuzzing and running the test suite under
Miri.
Testing
The unsafe in this crate demands thorough testing. The full suite includes
Miri for detecting undefined behavior,
fuzzing against the reference toml crate, and snapshot-based integration
tests.
# Test 32bit support under MIRI
Integration tests use insta for snapshot assertions.
Run cargo insta test -p snapshot-tests and cargo insta review to review
changes.
Differences from toml
First off, I want to be up front about the differences and limitations of this crate versus toml:
- No serde support for deserialization. There is a serde feature, but it only enables serialization of the Value and Spanned types.
- No TOML serialization. This crate is only intended to be a span-preserving deserializer; there is no intention to provide serialization to TOML, especially the advanced format-preserving kind provided by toml-edit.
License
This contribution is dual licensed under EITHER OF
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.