etoon 0.3.0

Fast TOON (Token-Oriented Object Notation) encoder. 8x faster than toons, 2.7x faster than the official TS SDK.

Fast TOON (Token-Oriented Object Notation) encoder for Python, Rust, and CLI.

8× faster than toons, 2.7× faster than the official TS SDK, byte-identical output.

Chinese documentation (中文說明)

Performance

Measured on a 50-doc payload (7480 bytes JSON → 4012 bytes TOON):

| Encoder | Time | vs etoon |
|---|---|---|
| etoon (Rust, native) | 11.9 μs | 1.00× |
| etoon (Python, PyO3) | 15.4 μs | 1.27× |
| @toon-format/toon (TS SDK) | 35.6 μs | 2.94× |
| py-rtoon | 85.9 μs | 7.10× |
| toons | 106.4 μs | 8.79× |

CLI via stdin pipe (Claude / Bash workflows):

| CLI | Per call | Relative |
|---|---|---|
| etoon | 0.43 ms | 1.00× |
| official toon | 50.7 ms | 118× slower |

Auto-detect mode (v0.2.0+) — handles JSON, mixed log, and plain text:

| Input | Size | Per call |
|---|---|---|
| Pure JSON (1000 objects) | 120 KB | 0.73 ms |
| Mixed log (5K JSON + 5K text) | 600 KB | 1.93 ms |
| Plain text pass-through | 300 KB | 0.56 ms |

Reproduce

# Encoder core benchmark (Rust native, no I/O)
cargo run --release --bin bench payload.json

# CLI stdin pipe benchmark
python3 -c "
import json
data = [{'id': i, 'name': f'item_{i}', 'price': i*1.5, 'tags': ['a','b','c']} for i in range(1000)]
print(json.dumps(data))
" > /tmp/bench.json

# Time 200 runs
start=$(date +%s%N)
for i in $(seq 1 200); do etoon < /tmp/bench.json > /dev/null; done
end=$(date +%s%N)
echo "$(echo "scale=2; ($end - $start) / 200000000" | bc)ms avg"
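
The same measurement can be sketched in Python for platforms where `date +%s%N` is unavailable. This is a minimal sketch, not part of the project: `time_cli` is a hypothetical helper, and it assumes the command reads its payload from stdin.

```python
import subprocess
import time


def time_cli(cmd, payload: bytes, runs: int = 200) -> float:
    """Return the mean wall-clock time per call in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        # Feed the payload on stdin and discard stdout,
        # like `cmd < file > /dev/null` in the shell loop.
        subprocess.run(cmd, input=payload, stdout=subprocess.DEVNULL, check=True)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / runs
```

With an `etoon` binary on PATH, `time_cli(["etoon"], open("/tmp/bench.json", "rb").read())` would report the average per call. Process-spawn overhead differs between Python and a shell loop, so absolute numbers are not directly comparable with the table above.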

Install

CLI binary (recommended for LLM workflows)

Pre-built — no Rust required:

Download from GitHub Releases (Linux/macOS/Windows, x86_64/aarch64):

Linux:

# x86_64
curl -L https://github.com/coseto6125/etoon/releases/latest/download/etoon-linux-x86_64 -o etoon

# aarch64 (ARM servers, or Linux running on Apple Silicon)
curl -L https://github.com/coseto6125/etoon/releases/latest/download/etoon-linux-aarch64 -o etoon

chmod +x etoon
sudo mv etoon /usr/local/bin/   # or ~/.local/bin/

macOS:

# Apple Silicon (M1/M2/M3/M4)
curl -L https://github.com/coseto6125/etoon/releases/latest/download/etoon-macos-aarch64 -o etoon

# Intel Mac
curl -L https://github.com/coseto6125/etoon/releases/latest/download/etoon-macos-x86_64 -o etoon

chmod +x etoon
sudo mv etoon /usr/local/bin/

Windows:

# PowerShell
Invoke-WebRequest -Uri "https://github.com/coseto6125/etoon/releases/latest/download/etoon-windows-x86_64.exe" -OutFile "etoon.exe"

# Move to a directory in your PATH, e.g.:
Move-Item etoon.exe "$env:USERPROFILE\.local\bin\etoon.exe"

Each release includes SHA256 checksums, SLSA provenance attestation, and VirusTotal scan reports.

# 1. Verify checksum
curl -L https://github.com/coseto6125/etoon/releases/latest/download/SHA256SUMS.txt -o SHA256SUMS.txt
sha256sum -c SHA256SUMS.txt --ignore-missing

# 2. Verify SLSA provenance (requires gh CLI)
gh attestation verify etoon-linux-x86_64 --repo coseto6125/etoon

# 3. VirusTotal — scan report links are in the release notes
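
On systems without `sha256sum` (for example Windows), the checksum step can be approximated with the Python standard library. This is a sketch under the assumption that SHA256SUMS.txt uses the usual `sha256sum` layout (hex digest, whitespace, file name); `sha256_of` and `verify` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large binaries are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(sums_file: Path, target: Path) -> bool:
    """Return True if `target` matches its entry in a sha256sum-style file."""
    for line in sums_file.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue
        expected, name = parts[0], parts[1].strip()
        # sha256sum marks binary-mode entries with a leading '*'.
        if name.lstrip("*") == target.name:
            return sha256_of(target) == expected.lower()
    raise LookupError(f"{target.name} is not listed in {sums_file}")
```

For example, `verify(Path("SHA256SUMS.txt"), Path("etoon-linux-x86_64"))` returns `True` only when the downloaded binary matches its listed digest.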

macOS note: the binary is unsigned. Run xattr -d com.apple.quarantine etoon to remove the quarantine attribute so that Gatekeeper allows it to run.

From source (requires Rust toolchain):

cargo install etoon

Python library

pip install etoon

This installs the Python binding (etoon.dumps()), not the CLI binary. For the CLI, use one of the methods above.

Rust library

cargo add etoon --no-default-features

Usage

Python

import etoon
docs = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
print(etoon.dumps(docs))
# [2]{id,name}:
#   1,Alice
#   2,Bob
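
To make the tabular layout in that output concrete, here is a pure-Python sketch that reproduces it for the simplest case. It is an illustration only, not etoon's implementation: `encode_uniform_table` is a hypothetical name, and it handles only flat rows of integers and simple strings that need no quoting.

```python
def encode_uniform_table(rows):
    """Encode a non-empty list of flat dicts with identical keys as a TOON-style table."""
    fields = list(rows[0])
    if any(list(row) != fields for row in rows):
        raise ValueError("rows must share the same keys in the same order")
    # Header: row count in brackets, then the field names declared once.
    header = f"[{len(rows)}]{{{','.join(fields)}}}:"
    # Body: one indented line per row, values only.
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


docs = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
print(encode_uniform_table(docs))
```

Field names appear once in the header rather than once per row, which is where the size reduction over JSON comes from.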

CLI (Bash pipe)

curl -s https://api.example.com/data | etoon
cat data.json | etoon -o output.toon
aws logs tail /ecs/my-service | etoon        # mixed log: JSON blocks → TOON, text → pass-through
etoon --strict < data.json                   # error on non-JSON input (the pre-v0.2.0 behavior)
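
The mixed-log behavior can be pictured as a per-line classification. The sketch below is a guess at the general idea, not etoon's actual detection logic: it assumes each JSON block sits on a single line, and `split_mixed_log` is a hypothetical name.

```python
import json


def split_mixed_log(text):
    """Yield ("json", value) or ("text", line) for each line of a mixed log."""
    for line in text.splitlines():
        stripped = line.strip()
        # Only lines that look like a JSON object or array are parsed.
        if stripped[:1] in ("{", "["):
            try:
                yield "json", json.loads(stripped)
                continue
            except json.JSONDecodeError:
                pass  # looked like JSON but was not, so fall through
        yield "text", line
```

A consumer would encode the `"json"` items as TOON and write the `"text"` items back unchanged, which matches the pass-through described above.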

LLM Agents — Setup Guide

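Append | etoon to shell commands so the AI reads TOON instead of raw JSON. The conversion is lossless, auto-detects the input format, and saves roughly 30-60% of tokens depending on payload shape (see Token savings for measurements).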

gh pr list --json number,title,author | etoon
aws logs tail /ecs/my-service | etoon
kubectl get pods -o json | etoon

Rules:

  • | etoon goes at the end of the pipeline — jq / grep / awk before it, never after
  • Auto-detects JSON / mixed log / plain text — safe to append to any command
  • Skip only for: byte-exact diff, single-value lookups (1-2 lines)

Claude Code

Add to CLAUDE.md (or .claude/CLAUDE.md) in your project root:

## Bash Output

Append `| etoon` as the final pipe segment of any Bash command whose output
reaches the conversation. etoon auto-detects JSON, mixed log, and plain text.
Skip only for byte-exact diff or single-value lookups (1-2 lines).

Cursor

Add to .cursor/rules/etoon.mdc:

---
description: Compress shell output with etoon
globs:
alwaysApply: true
---

When running shell/terminal commands, append `| etoon` as the last pipe segment.
etoon auto-detects JSON, mixed log, and plain text — safe for any command.
Skip only for byte-exact diff or single-value lookups (1-2 lines).

OpenAI Codex CLI

Add to AGENTS.md in your project root:

## Bash Output

Append `| etoon` as the final pipe segment of any shell command.
etoon auto-detects JSON, mixed log, and plain text (non-JSON passes through unchanged).
Skip only for byte-exact diff or single-value lookups (1-2 lines).

Windsurf / Other AI Editors

Add to .windsurfrules (Windsurf) or the equivalent rules file:

When executing shell commands, always append `| etoon` at the end of the pipeline.
etoon auto-detects input format: JSON → compact TOON, mixed log → JSON blocks encoded + text pass-through, plain text → unchanged.
Skip only for byte-exact diff or single-value lookups (1-2 lines).

Gemini CLI

Add to GEMINI.md in your project root:

## Bash Output

Append `| etoon` as the final pipe segment of any shell command.
etoon auto-detects JSON, mixed log, and plain text (non-JSON passes through unchanged).
Skip only for byte-exact diff or single-value lookups (1-2 lines).

ChatGPT / Custom GPTs

Add to system prompt or custom instructions:

When generating shell commands for the user, append `| etoon` as the last pipe segment.
etoon converts JSON to TOON (40-60% fewer tokens). Non-JSON passes through unchanged.

Rust

let json_bytes = serde_json::to_vec(&my_data)?;
let toon = etoon::toon::encode(&json_bytes)?;

Architecture

Python dict → orjson.dumps → JSON bytes → sonic-rs (SIMD parse) → walk → TOON string

Key optimizations:

  • sonic-rs SIMD JSON parser (~7× faster than serde_json)
  • orjson bridge — single boundary crossing (vs PyO3-based alternatives)
  • uniform-order table fast path — skips 300 key lookups per 50-row table
  • itoa specialized integer formatting
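
The uniform-order fast path can be illustrated in Python. This is a conceptual sketch, not the Rust implementation: `table_cells` is a hypothetical name, and it assumes every row has the same key set as the first row.

```python
def table_cells(rows):
    """Return (fields, cell_rows) for a list of dicts sharing one key set."""
    fields = tuple(rows[0])
    cells = []
    for row in rows:
        if tuple(row) == fields:
            # Fast path: keys arrive in header order, so take values positionally.
            cells.append(tuple(row.values()))
        else:
            # Slow path: same keys in another order, one lookup per field.
            cells.append(tuple(row[f] for f in fields))
    return fields, cells
```

When every row takes the fast path, no per-field key lookup is needed. The figure of 300 lookups for 50 rows would correspond to 6 fields per row, which is an inference from the numbers rather than something stated in the source.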

Compatibility

Output is byte-identical to the toons Python package (Apache 2.0) and the official toon-format/toon TypeScript SDK. Passes 111/111 TOON spec fixtures covering primitives, objects, arrays (primitive/tabular/nested/bulleted), and whitespace.

Sigil-prefixed keys (@, $, #)

Keys starting with @, $, or # are treated as valid identifiers — no quoting needed. This gives native support for:

| Sigil | Ecosystem | Examples |
|---|---|---|
| @ | AWS CloudWatch, Elasticsearch, Serilog, XML→JSON | @timestamp, @message, @version |
| $ | MongoDB, JSON Schema, AWS CloudFormation | $match, $ref, $schema, $type |
| # | JSON-LD, Azure Resource Manager | #comment, #id |

# AWS CloudWatch Insights output
echo '[{"@timestamp":"2026-04-06T12:00:01Z","@message":"POST /api/v1/users 504","statusCode":504}]' | etoon
# [1]{@timestamp,@message,statusCode}:
#   "2026-04-06T12:00:01Z",POST /api/v1/users 504,504
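
The key rule can be expressed as a pattern check. The regular expression below is an assumption about what counts as an identifier (letters, digits, underscore, and dot after an optional sigil); the exact grammar etoon uses is not given here, and `needs_quotes` is a hypothetical helper.

```python
import re

# Assumed pattern: optional sigil, then an identifier of letters, digits, '_' or '.'.
UNQUOTED_KEY = re.compile(r"[@$#]?[A-Za-z_][A-Za-z0-9_.]*")


def needs_quotes(key: str) -> bool:
    """Return True if the key would have to be quoted under the assumed pattern."""
    return UNQUOTED_KEY.fullmatch(key) is None


for key in ["@timestamp", "$ref", "#id", "statusCode", "my key"]:
    print(key, "->", "quoted" if needs_quotes(key) else "bare")
```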

Token savings (5 AWS CloudWatch log entries)

tiktoken (offline, BPE tokenizer):

| Tokenizer (model family) | JSON | TOON | Saved |
|---|---|---|---|
| o200k_base (GPT-4o/5/o3) | 484 | 334 | 31.0% |
| cl100k_base (GPT-4/3.5 ≈ Claude) | 479 | 332 | 30.7% |

tokencalculator.ai (online, estimated per-model cost):

| Model | JSON | TOON | Saved |
|---|---|---|---|
| Est. Tokens | 314 | 189 | 39.8% |
| OpenAI GPT-5.4 | $0.000785 | $0.000473 | 39.7% |
| Claude Opus 4.6 | $0.001570 | $0.000945 | 39.8% |
| Gemini 3.1 Pro | $0.000628 | $0.000378 | 39.8% |
| DeepSeek V3.2 | $0.000088 | $0.000053 | 39.8% |
| Grok 4.20 | $0.000063 | $0.000038 | 39.7% |

Savings increase with volume — 50 entries reach 35%+ (tiktoken) as the tabular header is amortized.
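
The "Saved" column is the relative reduction, (JSON − TOON) / JSON. A short check of the reported token counts, with `saved_percent` as a hypothetical helper name:

```python
def saved_percent(json_tokens, toon_tokens):
    """Relative token reduction, as a percentage rounded to one decimal."""
    return round(100 * (json_tokens - toon_tokens) / json_tokens, 1)


# Token counts copied from the tables above.
counts = {
    "o200k_base": (484, 334),
    "cl100k_base": (479, 332),
    "tokencalculator.ai estimate": (314, 189),
}
for name, (json_tokens, toon_tokens) in counts.items():
    print(f"{name}: {saved_percent(json_tokens, toon_tokens)}% saved")
```

This prints 31.0%, 30.7%, and 39.8%, matching the reported figures.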

Advanced options

These are TOON spec optional parameters, intended for programmatic use in your codebase (Python / Rust library calls). The CLI | etoon pipe for LLM workflows uses defaults and does not need these.

# Custom delimiter (when values contain commas)
etoon.dumps(data, delimiter="|")   # or "\t"

# Key folding: collapse {a:{b:{c:1}}} → "a.b.c: 1"
etoon.dumps(data, fold_keys=True)
etoon.dumps(data, fold_keys=True, flatten_depth=2)  # partial fold
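
What key folding does to the data can be sketched on plain dicts. This is an illustration of the idea only, not etoon's code: `fold_keys` here is a standalone function, and treating `flatten_depth` as the maximum number of segments in a folded key is an assumption.

```python
def fold_keys(obj, flatten_depth=None):
    """Collapse chains of single-key dicts into dotted keys."""
    folded = {}
    for key, value in obj.items():
        path = [key]
        # Follow the chain while the value is a dict with exactly one key
        # and the depth limit (if any) has not been reached.
        while (
            isinstance(value, dict)
            and len(value) == 1
            and (flatten_depth is None or len(path) < flatten_depth)
        ):
            next_key, value = next(iter(value.items()))
            path.append(next_key)
        if isinstance(value, dict):
            value = fold_keys(value, flatten_depth)
        folded[".".join(path)] = value
    return folded


print(fold_keys({"a": {"b": {"c": 1}}}))                   # {'a.b.c': 1}
print(fold_keys({"a": {"b": {"c": 1}}}, flatten_depth=2))  # {'a.b': {'c': 1}}
```

The sketch ignores keys that already contain dots, which a real encoder would have to treat carefully to keep the output unambiguous.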

Limitations

  • Integers > 2⁶³ are converted through f64 and may lose precision. Values that happen to be exactly representable as f64 survive unchanged; arbitrary-precision integers are not supported.
  • Indentation is fixed at 2 spaces (the TOON spec default); a custom indent is not supported.
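
The integer limitation can be demonstrated with Python's `float`, which is an IEEE 754 64-bit double like Rust's f64. The helper `through_f64` is hypothetical and only mimics the coercion; it does not call etoon.

```python
def through_f64(n: int) -> int:
    """Round-trip an integer through a 64-bit float, as an f64 coercion would."""
    return int(float(n))


print(through_f64(2**64))      # 18446744073709551616, exactly representable
print(through_f64(2**63 + 1))  # 9223372036854775808, the trailing +1 is lost
```

Near 2⁶³ adjacent f64 values are 2048 apart, so most integers in that range cannot survive the round trip.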

License

Apache 2.0. Test fixtures in tests/fixtures/ are sourced from the toons project (Apache 2.0).