toml-spanner
A high-performance, fast-compiling TOML serialization and deserialization library for Rust, fully compliant with the TOML 1.1 spec.
toml-spanner is a complete TOML library featuring:
- High Performance: See Benchmarks
- Fast (Incremental & Clean) Compilation: See Compile Time Benchmarks
- Compact Span Preserving Tree
- Derive macros: optional, powerful, zero-dependency: See Derive Documentation
- Format Preserving Serialization, even through mutation on your own data types.
- Full TOML 1.1, including date-time support, passing 100% of the official TOML test suite
- Tiny Binary Size: See Binary Size Benchmarks
- Extensively tested with miri and fuzzing under memory sanitizers and debug assertions.
- High quality error messages: See Error Examples
Example
Suppose you have some TOML document declared in TOML_DOCUMENT as a &str:
= false
= 37
[[]]
= 43
[[]]
= true
= 12
Parse the TOML document into an Item tree:
let arena = new;
let doc = parse.unwrap;
Traverse the tree and inspect values:
assert_eq!;
match doc.value
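To illustrate what "span preserving" means in general, here is a minimal, std-only sketch. It is not toml-spanner's API: `Span`, `SpannedInt`, and `parse_ints` are hypothetical names invented for this example, and the toy parser only understands `key = <integer>` lines. The point is that each parsed value remembers the byte range it came from, so it can be mapped back to the source text later.

```rust
/// Byte range of a value inside the source text (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// An integer together with the place it was parsed from (hypothetical type).
#[derive(Debug)]
struct SpannedInt {
    value: i64,
    span: Span,
}

/// Toy parser: handles only `key = <integer>` lines and records the span
/// of every integer it finds. Lines it cannot parse are skipped.
fn parse_ints(source: &str) -> Vec<(&str, SpannedInt)> {
    let mut out = Vec::new();
    let mut offset = 0; // byte offset of the current line within `source`
    for line in source.split_inclusive('\n') {
        if let Some(eq) = line.find('=') {
            let key = line[..eq].trim();
            let raw = &line[eq + 1..];
            let trimmed = raw.trim();
            if let Ok(value) = trimmed.parse::<i64>() {
                // Skip the whitespace between `=` and the first digit.
                let lead = raw.len() - raw.trim_start().len();
                let start = offset + eq + 1 + lead;
                let span = Span { start, end: start + trimmed.len() };
                out.push((key, SpannedInt { value, span }));
            }
        }
        offset += line.len();
    }
    out
}

fn main() {
    let source = "a = 37\nb = 43\n";
    for (key, item) in parse_ints(source) {
        // The span slices the original text back out of the source.
        let text = &source[item.span.start..item.span.end];
        println!("{key} = {} (source text {text:?} at {:?})", item.value, item.span);
    }
}
```

A real span-preserving tree attaches this kind of range to every table, array, and scalar, which is what lets error messages point at the exact location of a problem.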
Derive Macros
The Toml derive macro generates FromToml and/or ToToml implementations.
use ;
// By default only `FromToml` is derived.
let arena = new;
let mut doc = parse.unwrap;
let config = doc..unwrap;
See the Toml derive docs
for the full set of attributes (rename, default, flatten, skip, tagged enums, etc.).
Manual FromToml
Implement FromToml directly using TableHelper for type-safe field extraction.
use ;
let arena = new;
let mut doc = parse.unwrap;
match doc.
Serialization
Any type implementing ToToml (including via derive) can be written to TOML
with to_string or the Formatting builder for format-preserving output.
// Use default formatting.
let output = to_string.unwrap;
// Preserve formatting from a parsed document
let output = preserved_from.format.unwrap;
See Formatting docs
for indentation, format preservation, and other options.
Please consult the API documentation for more details.
Benchmarks
Measured on AMD Ryzen 9 5950X, 64GB RAM, Linux 6.18, rustc 1.94.1. Relative parse time across real-world TOML files (lower is better):
Crate versions: toml-spanner 1.0.0, toml 1.0.7+spec-1.1.0, toml_edit 0.25.5+spec-1.1.0, toml-span 0.7.1
                 time(μs)  cycles(K)  instr(K)  branch(K)
zed/Cargo.toml
  toml-spanner       24.5        115       440         93
  toml              228.5       1088      2912        523
  toml_edit         306.6       1460      4252        861
  toml-span         393.8       1866      5024       1045
extask.toml
  toml-spanner        8.9         43       149         29
  toml               78.5        374      1031        177
  toml_edit         106.7        505      1470        290
  toml-span         105.8        500      1331        263
devsm.toml
  toml-spanner        3.7         17        70         15
  toml               35.8        171       459         79
  toml_edit          48.7        232       650        127
  toml-span          56.4        269       708        140
This runtime benchmark is deliberately simple and measures only the parsing step. In practice,
if you also deserialize into your own data types (where toml-spanner has made only marginal
improvements), the total runtime improvement is smaller, and it depends heavily on the content
and the target data types. Switching devsm from toml-span to toml-spanner gave a total 8x reduction
in runtime, measured in the actual application and including both parsing and deserialization.
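To read the parse-only table as relative numbers, the times can be divided by the toml-spanner time for the same file. The snippet below is only a convenience sketch over the figures in the table; the helper name `slowdown` is made up for this example.

```rust
/// How many times longer `other_us` is than `baseline_us` (both in microseconds).
fn slowdown(other_us: f64, baseline_us: f64) -> f64 {
    other_us / baseline_us
}

fn main() {
    // (file, toml-spanner time, [(crate, time)]), all in μs, copied from the table above.
    let files = [
        ("zed/Cargo.toml", 24.5, [("toml", 228.5), ("toml_edit", 306.6), ("toml-span", 393.8)]),
        ("extask.toml", 8.9, [("toml", 78.5), ("toml_edit", 106.7), ("toml-span", 105.8)]),
        ("devsm.toml", 3.7, [("toml", 35.8), ("toml_edit", 48.7), ("toml-span", 56.4)]),
    ];

    for (file, spanner_us, others) in files {
        println!("{file}");
        for (name, us) in others {
            println!("  {name}: {:.1}x the parse time of toml-spanner", slowdown(us, spanner_us));
        }
    }
}
```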
Deserialization and Parsing
Usually, you don't just parse TOML; toml-spanner provides derive macros for full deserialization.
The following benchmarks have taken the exact data structures and deserialization code (originally
using toml and serde), and added support for toml-spanner and toml-span based parsing and
deserialization. (I haven't added toml-span support for Cargo.toml due to its complexity.)
Crate versions: toml-spanner = 1.0.0, toml = 1.0.7+spec-1.1.0, toml-span = 0.7.1
Commit 3ca292befbc3585084922c1592ea3d17e423f035 was used from rust-lang/cargo as reference.
                 time(μs)  cycles(K)  instr(K)  branch(K)
zed/Cargo.lock (parse + deserialize)
  toml-spanner     1023.5       4803     16135       3514
  toml             2977.4      14248     37270       7296
  toml-span        5643.2      26831     74584      15460
zed/Cargo.toml (parse + deserialize)
  toml-spanner       92.5        439      1405        283
  toml              309.4       1475      3622        662
Compile Time
The benchmark compiles a crate that serializes and deserializes a simplified Cargo manifest, using each crate's own derive macro. With unrestricted parallelism we get the following:
See Compile Time Benchmarks for more details.
Divergence from toml-span
While toml-spanner started as a fork of toml-span, it's diverged a lot:
- 7x faster than toml-span, and 2-3x faster than toml on end-to-end benchmarks.
- Preserved index order: tables retain their insertion order by default, unlike toml-span and the default mode of toml.
- Compact Value type (on 64-bit platforms):

    Crate                   Value/Item   TableEntry
    toml-spanner            24 bytes     48 bytes
    toml-span               48 bytes     88 bytes
    toml                    32 bytes     56 bytes
    toml (preserve_order)   80 bytes     104 bytes

  Note that the toml crate Value type doesn't contain any span information and that toml-span doesn't support table entry order preservation.
- Full TOML v1.1.0 compliance including date-time support
- Native derive macro
- Format-preserving serialization
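For a sense of how a span-carrying value can fit in 24 bytes, here is a hypothetical layout sketch. It is not toml-spanner's actual representation: `CompactItem`, the 3-bit tag packed into the start offset, and `NaiveItem` are all assumptions made for illustration. The idea is that packing the kind tag into spare bits of the span leaves two full machine words for the payload (for example a pointer and a length).

```rust
use std::mem::size_of;

const TAG_BITS: u32 = 3;
const TAG_MASK: u32 = (1 << TAG_BITS) - 1;

/// Hypothetical compact item (NOT the crate's real layout):
/// 4 + 4 + 16 = 24 bytes with `repr(C)`.
#[allow(dead_code)]
#[repr(C)]
struct CompactItem {
    /// Low 3 bits: kind tag. Upper 29 bits: start byte offset.
    start_and_tag: u32,
    /// End byte offset.
    end: u32,
    /// Two words of payload, e.g. pointer + length, or an inline scalar.
    payload: [u64; 2],
}

impl CompactItem {
    fn new(tag: u8, start: u32, end: u32, payload: [u64; 2]) -> Self {
        assert!(u32::from(tag) <= TAG_MASK, "tag must fit in 3 bits");
        assert!(start < (1 << (32 - TAG_BITS)), "start must fit in 29 bits");
        CompactItem {
            start_and_tag: (start << TAG_BITS) | u32::from(tag),
            end,
            payload,
        }
    }

    fn tag(&self) -> u8 {
        (self.start_and_tag & TAG_MASK) as u8
    }

    fn start(&self) -> u32 {
        self.start_and_tag >> TAG_BITS
    }
}

/// A straightforward alternative for comparison: an ordinary enum plus a
/// separate `Range<usize>` span.
#[allow(dead_code)]
enum NaiveValue {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
}

#[allow(dead_code)]
struct NaiveItem {
    value: NaiveValue,
    span: std::ops::Range<usize>,
}

fn main() {
    let item = CompactItem::new(5, 1234, 1240, [0, 0]);
    println!("tag = {}, start = {}", item.tag(), item.start());
    println!("CompactItem: {} bytes", size_of::<CompactItem>());
    println!("NaiveItem:   {} bytes", size_of::<NaiveItem>());
}
```

The trade-off in this sketch is a smaller maximum document size (29 bits of start offset) in exchange for keeping the tag out of its own padded field.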
Error Examples
toml-spanner provides specific errors with spans and paths pointing directly to the problem, multi-error accumulation, and methods for easy use with annotate-snippets and codespan-reporting.
Here are some parsing examples using the annotated-snippets feature:
Here are some conversion errors. Note how multiple errors are reported instead of bailing out after the first error.
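The general pattern behind multi-error accumulation can be sketched with the standard library alone. The types below (`Span`, `FieldError`, `RawField`) and the function `convert_ports` are hypothetical and exist only for this example; they are not toml-spanner's error types. Instead of returning at the first failure, the conversion keeps going and collects every error with its path and span.

```rust
/// Byte range in the source document (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// One conversion failure, with enough context to point at the problem.
#[derive(Debug, PartialEq)]
struct FieldError {
    path: String,
    span: Span,
    message: String,
}

/// An unconverted field as it might come out of a parser (hypothetical type).
struct RawField<'a> {
    key: &'a str,
    text: &'a str,
    span: Span,
}

/// Converts every field to a `u16`, accumulating all failures instead of
/// stopping at the first one.
fn convert_ports(fields: &[RawField]) -> Result<Vec<u16>, Vec<FieldError>> {
    let mut ports = Vec::new();
    let mut errors = Vec::new();
    for field in fields {
        match field.text.parse::<u16>() {
            Ok(port) => ports.push(port),
            Err(err) => errors.push(FieldError {
                path: field.key.to_string(),
                span: field.span,
                message: err.to_string(),
            }),
        }
    }
    if errors.is_empty() { Ok(ports) } else { Err(errors) }
}

fn main() {
    let fields = [
        RawField { key: "http.port", text: "8080", span: Span { start: 7, end: 11 } },
        RawField { key: "admin.port", text: "70000", span: Span { start: 25, end: 30 } },
        RawField { key: "debug.port", text: "abc", span: Span { start: 44, end: 47 } },
    ];
    match convert_ports(&fields) {
        Ok(ports) => println!("ports: {ports:?}"),
        Err(errors) => {
            // Both bad fields are reported in a single pass.
            for e in &errors {
                println!("{} at {}..{}: {}", e.path, e.span.start, e.span.end, e.message);
            }
        }
    }
}
```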
Trade-offs
toml-spanner makes extensive use of unsafe code to achieve its performance
and size goals. This is mitigated by fuzzing and running the test suite under
Miri.
Testing
The unsafe in this crate demands thorough testing. The full suite includes
Miri for detecting undefined behavior,
fuzzing against the reference toml crate, and snapshot-based integration
tests.
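Fuzzing against a reference implementation is a form of differential testing: feed the same input to both implementations and flag any disagreement. The sketch below shows the idea on a toy problem using only the standard library. It assumes nothing about this crate's real fuzz harness; `fast_parse_u32`, `reference_parse_u32`, and the small xorshift generator are illustrative stand-ins.

```rust
/// Hand-written implementation under test (toy stand-in for a fast parser).
fn fast_parse_u32(s: &str) -> Option<u32> {
    if s.is_empty() {
        return None;
    }
    let mut acc: u32 = 0;
    for b in s.bytes() {
        if !b.is_ascii_digit() {
            return None;
        }
        // Checked arithmetic so overflow is rejected rather than wrapped.
        acc = acc.checked_mul(10)?.checked_add(u32::from(b - b'0'))?;
    }
    Some(acc)
}

/// Reference implementation: the standard library's parser.
fn reference_parse_u32(s: &str) -> Option<u32> {
    s.parse().ok()
}

/// Tiny deterministic xorshift generator, so the example needs no extra crates.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Builds a short string of digits mixed with a few invalid characters.
fn random_input(rng: &mut XorShift) -> String {
    const ALPHABET: &[u8] = b"0123456789ax";
    let len = (rng.next_u64() % 13) as usize;
    (0..len)
        .map(|_| ALPHABET[(rng.next_u64() % ALPHABET.len() as u64) as usize] as char)
        .collect()
}

/// Returns the first input on which the two implementations disagree, if any.
fn first_disagreement(iterations: usize, seed: u64) -> Option<String> {
    let mut rng = XorShift(seed); // seed must be non-zero for xorshift
    for _ in 0..iterations {
        let input = random_input(&mut rng);
        if fast_parse_u32(&input) != reference_parse_u32(&input) {
            return Some(input);
        }
    }
    None
}

fn main() {
    match first_disagreement(10_000, 0x9E37_79B9_7F4A_7C15) {
        Some(input) => println!("implementations disagree on {input:?}"),
        None => println!("no disagreement found"),
    }
}
```

A coverage-guided fuzzer replaces the random generator here with inputs that are mutated toward new code paths, but the comparison step stays the same.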
# Test 32bit support under MIRI
Integration tests use insta for snapshot assertions.
Run cargo insta test -p snapshot-tests and cargo insta review to review
changes.
Code coverage:
Note: See the devsm.toml file in the root for typical commands that are run during development.
Acknowledgements
toml-spanner started off as a fork of toml-span and though it's been pretty much completely rewritten at this point, the original test suite and some of the API patterns remain.
Thanks to both the toml and toml-edit crates, which inspired the API and the error messages and served as targets to fuzz against.
License
This contribution is dual licensed under EITHER OF
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.