temporal-cortex-toon
Pure-Rust encoder and decoder for TOON (Token-Oriented Object Notation) v3.0.
TOON is a compact, human-readable serialization format designed to reduce LLM token consumption when processing structured data. It achieves this through key folding, tabular compression, and context-dependent quoting.
Usage
use ;
// JSON → TOON
let json = r#"{"name":"Alice","scores":[95,87,92]}"#;
let toon = encode.unwrap;
assert_eq!;
// TOON → JSON (perfect roundtrip)
let back = decode.unwrap;
assert_eq!;
TOON Format Overview
Primitives and Objects
name: Alice ← unquoted string (no ambiguity)
age: 30 ← number
active: true ← boolean
id: "42" ← quoted (looks numeric, must preserve string type)
address: ← nested object (indentation, no braces)
city: Portland
state: OR
empty: ← empty object
Arrays
Three representations, chosen automatically for maximum compression:
Inline (all primitives):
tags[3]: rust,wasm,llm
Tabular (uniform objects with identical primitive-only keys):
attendees[2]{email,status}:
alice@co.com,accepted
bob@co.com,tentative
Expanded (mixed/complex content):
items[2]:
- kind: event
summary: Standup
- kind: event
summary: Sprint Planning
Quoting Rules
Strings are only quoted when they would be ambiguous:
| Condition | Example | Encoded as |
|---|---|---|
| Looks like bool | true |
"true" |
| Looks like null | null |
"null" |
| Looks like number | 42 |
"42" |
| Contains colon (in document context) | 10:30 AM |
"10:30 AM" |
| Contains comma (in inline/tabular context) | a, b |
"a, b" |
| Empty string | "" |
|
| Leading/trailing whitespace | hello |
" hello " |
| Contains brackets/braces | [1] |
"[1]" |
| Starts with hyphen | -foo |
"-foo" |
Architecture
encoder.rs ← JSON string → serde_json::Value → TOON string
decoder.rs ← TOON string → serde_json::Value → JSON string
error.rs ← ToonError enum (JsonParse, ToonParse, Encode)
types.rs ← ToonValue AST (reserved for future direct manipulation)
lib.rs ← Public API: encode(), decode(), ToonError
The encoder walks the serde_json::Value tree and selects the most compact TOON representation for each node. The decoder parses indentation-based TOON structure back into a serde_json::Value.
Key implementation detail: serde_json must use the preserve_order feature (enabled in workspace Cargo.toml) to maintain JSON key insertion order via IndexMap.
Testing
164 tests across three suites:
- 59 encoder tests — primitives, objects, arrays (inline/tabular/expanded), nesting, quoting edge cases
- 63 decoder tests — mirrors encoder tests plus string type inference, escape sequences, calendar-realistic data
- 42 roundtrip tests —
decode(encode(json)) == jsonfor all value types
License
MIT OR Apache-2.0