Regular subset of TOML
A fast, streaming TOML parser for the regular subset of TOML v1.0.0. Available in F# and Rust.
- Streaming — calls your callback during parsing, no intermediate tree
- Zero-copy — values are byte spans into the original input
- Stackless — no recursion, no stack-allocated collections
- Automata-based — DFA-driven lexer, optimal and predictable (F# benchmarks, Rust benchmarks)
- Inlined — lambdas inline at the call site, no vtables
- Single file, no dependencies — drop it in and go
- Raw UTF-8 — runs on bytes directly, no char conversion
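To make the streaming and zero-copy points concrete, here is a minimal Rust sketch of the idea. It is not r-toml's actual API: `Span`, `stream_pairs`, and `trim` are hypothetical names, and the scanner only understands bare `key = value` lines. Values are reported as byte ranges into the input, and the callback is a generic parameter, so the compiler can inline it rather than dispatch through a vtable.

```rust
/// Hypothetical span type: a half-open byte range into the original input.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub begin: usize,
    pub end: usize,
}

impl Span {
    /// Borrow the bytes this span covers; nothing is copied.
    pub fn bytes<'a>(&self, input: &'a [u8]) -> &'a [u8] {
        &input[self.begin..self.end]
    }
}

/// Toy streaming scanner (assumption: only `key = value` lines matter).
/// `on_pair` is called as soon as each pair is found; no tree is built.
pub fn stream_pairs<F: FnMut(Span, Span)>(input: &[u8], mut on_pair: F) {
    let mut line_start = 0;
    while line_start < input.len() {
        // Find the end of the current line (or the end of the input).
        let line_end = input[line_start..]
            .iter()
            .position(|&b| b == b'\n')
            .map_or(input.len(), |i| line_start + i);
        // Lines without '=' (headers, blanks) are skipped in this sketch.
        if let Some(eq) = input[line_start..line_end].iter().position(|&b| b == b'=') {
            let key = trim(input, line_start, line_start + eq);
            let value = trim(input, line_start + eq + 1, line_end);
            on_pair(key, value);
        }
        line_start = line_end + 1;
    }
}

/// Shrink a byte range so it excludes surrounding ASCII whitespace.
fn trim(input: &[u8], mut begin: usize, mut end: usize) -> Span {
    while begin < end && input[begin].is_ascii_whitespace() {
        begin += 1;
    }
    while end > begin && input[end - 1].is_ascii_whitespace() {
        end -= 1;
    }
    Span { begin, end }
}
```

A caller would pass a closure such as `|k, v| println!("{:?}", (k, v))` and resolve a span with `span.bytes(input)` only when the bytes are actually needed.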
F# benchmarks

Rust benchmarks

What is this?
TOML's nested types ([table], [[array]], dotted keys) happen to be expressible as a regular grammar. r-toml exploits this to parse TOML with a flat DFA instead of a recursive descent parser. The trade-off: some rarely used TOML features aren't supported (see below). For typical config files and data storage, which rarely need those features, it reads standard TOML as-is and is much faster than general-purpose parsers.
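As an illustration of why no recursion is needed, the following Rust sketch recognizes `[table]` and `[[array]]` headers with dotted keys using a small hand-written DFA. This is a simplified stand-in, not r-toml's generated automaton: the `State`, `Header`, `step`, and `classify_header` names are hypothetical, and the sketch assumes bare keys with no whitespace inside the brackets.

```rust
/// DFA states for a header line (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Start,
    Open1,      // saw "["
    Seg1,       // inside a key segment of a [table] header
    Dot1,       // saw "." in a [table] header, a segment must follow
    Table,      // accepting: complete [table]
    Open2,      // saw "[["
    Seg2,       // inside a key segment of a [[array]] header
    Dot2,       // saw "." in a [[array]] header
    Close2,     // saw the first "]" of "]]"
    ArrayTable, // accepting: complete [[array]]
    Error,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Header {
    Table,
    ArrayOfTables,
}

/// One DFA transition. The next state depends only on the current
/// state and the current byte, so no stack is ever needed.
fn step(state: State, byte: u8) -> State {
    use State::*;
    let key_char = byte.is_ascii_alphanumeric() || byte == b'_' || byte == b'-';
    match (state, byte) {
        (Start, b'[') => Open1,
        (Open1, b'[') => Open2,
        (Open1 | Seg1 | Dot1, _) if key_char => Seg1,
        (Seg1, b'.') => Dot1,
        (Seg1, b']') => Table,
        (Open2 | Seg2 | Dot2, _) if key_char => Seg2,
        (Seg2, b'.') => Dot2,
        (Seg2, b']') => Close2,
        (Close2, b']') => ArrayTable,
        _ => Error,
    }
}

/// Run the DFA over a header line in a single flat loop.
pub fn classify_header(line: &[u8]) -> Option<Header> {
    let mut state = State::Start;
    for &b in line {
        state = step(state, b);
        if state == State::Error {
            return None;
        }
    }
    match state {
        State::Table => Some(Header::Table),
        State::ArrayTable => Some(Header::ArrayOfTables),
        _ => None,
    }
}
```

However many dotted segments a header has, the loop only ever holds one `State` value, which is the property the paragraph above relies on.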
Basic usage (F#)
let toml : byte[] = "
[server]
port = 8080
hostname = 'abc'
"B
let dictionary = RToml.toDictionary(toml)
dictionary["server.port"].kind // INT
dictionary["server.port"].ToInt(toml) // 8080
// or any of the other formats
let array = RToml.toArray(toml)
let array2 = RToml.toStructArray(toml)
let valuelist =
    use vlist = RToml.toValueList(toml)
    for v in vlist do () //.. do something
// or iterate over the key-value pairs
RToml.stream (
    toml,
    (fun key value ->
        if value.kind = Token.TRUE then
            let keystr = key.ToString toml // struct to string
            printfn $"{keystr} at pos:{key.key_begin} is set to true"
    )
)
Basic usage (Rust)
Supported types
- keys and basic primitives: true/false, 10, 0.005, 'string'
- multiline strings: '''content''', """content"""
- datetime: 1979-05-27T07:32:00Z
- tables: [entry], [entry.inner]
- arrays of tables: [[products]]
- typed arrays: [1, 2, 3], ['a', 'b'], [true, false], [1.0, 2.0]
- comments: # comment (standalone or inline after a value)
Unsupported
Inline tables:
= { = { = 123, = "abc" } }
Use the equivalent flat forms instead:
[]
= 123
= "abc"
Mixed/nested arrays:
= [[[0,1,3],"abc"],{ = 1, = 2}]
Quoted keys:
= "value"
Parsing quoted keys requires collecting or transforming the key contents, which breaks the zero-copy/stackless property.
String types
Escape sequences are detected and flagged but not resolved — if your strings contain escapes, you handle the unescaping. String kinds after parsing:
- VALID_STR: no escape sequences, use as-is (multiline leading newline already trimmed)
- ESC_STR: contains escape sequences, needs post-processing
- EMPTY_STR: empty string
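Since unescaping is left to the caller, a post-processing step for ESC_STR values might look like the Rust sketch below. The `StrKind`, `classify`, and `unescape` names are hypothetical and not part of r-toml. The sketch assumes the input is the body of a basic (double-quoted) string and resolves only the single-character escapes; `\uXXXX` and `\UXXXXXXXX` are deliberately left out and reported as `None`.

```rust
/// Hypothetical mirror of the three string kinds described above.
#[derive(Debug, PartialEq, Eq)]
pub enum StrKind {
    ValidStr,
    EscStr,
    EmptyStr,
}

/// Classify the body of a basic string (the bytes between the quotes).
pub fn classify(body: &[u8]) -> StrKind {
    if body.is_empty() {
        StrKind::EmptyStr
    } else if body.contains(&b'\\') {
        StrKind::EscStr
    } else {
        StrKind::ValidStr
    }
}

/// Resolve single-character escapes in an ESC_STR body.
/// Returns None for escapes this sketch does not handle
/// (including \u and \U) and for a trailing lone backslash.
pub fn unescape(body: &str) -> Option<String> {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next()? {
            'b' => out.push('\u{0008}'),
            't' => out.push('\t'),
            'n' => out.push('\n'),
            'f' => out.push('\u{000C}'),
            'r' => out.push('\r'),
            '"' => out.push('"'),
            '\\' => out.push('\\'),
            _ => return None,
        }
    }
    Some(out)
}
```

Only ESC_STR values pay for the allocation; VALID_STR and EMPTY_STR values can be used directly from the input span.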
Array types
Only homogeneous arrays are supported. The parser returns the validated region as one of:
ARR1_INT, ARR1_FLOAT, ARR1_BOOL, ARR1_STR
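Because the array is returned as a validated region rather than as individual elements, the caller splits it. The Rust sketch below shows one way to do that for an ARR1_INT value. It assumes the region includes the surrounding brackets and contains only plain decimal integers; `parse_int_array` is a hypothetical helper, not an r-toml function.

```rust
/// Split an integer array region such as `[1, 2, 3]` into its elements.
/// Assumptions: the region includes the brackets and holds plain
/// decimal integers only (no hex, octal, binary, or `_` separators).
pub fn parse_int_array(region: &[u8]) -> Option<Vec<i64>> {
    let text = std::str::from_utf8(region).ok()?;
    let inner = text.trim().strip_prefix('[')?.strip_suffix(']')?;
    let mut out = Vec::new();
    for item in inner.split(',') {
        let item = item.trim();
        if item.is_empty() {
            // Tolerates an empty array and a trailing comma.
            continue;
        }
        out.push(item.parse::<i64>().ok()?);
    }
    Some(out)
}
```

The same shape works for the other array kinds by swapping the element parser, for example `parse::<f64>()` for ARR1_FLOAT.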
Token visualization

Future work (if there's real interest)
- proper benchmark data from real Cargo.toml / pyproject.toml files
- iterator for array values
- TOML compliance tests
- codegen for other languages
- string-to-tagged-union deserialization
- SIMD intrinsics for string scanning