Tauq - Token-Efficient Data Notation
44% fewer tokens than JSON overall. 11% more efficient than TOON. Verified with tiktoken.
What is Tauq?
Tauq (τq) is two things:
- Tauq Notation (.tqn): A schema-driven data format that achieves 44-54% fewer tokens than JSON (verified with tiktoken cl100k_base).
- Tauq Query (.tqq): A pre-processor with shell integration for data transformations.
Built for the AI era where every token counts.
Benchmark (1000 Records)
| Format | Tokens | vs JSON |
|---|---|---|
| JSON (minified) | 24,005 | baseline |
| TOON | 12,002 | -50.0% |
| Tauq | 11,012 | -54.1% |
All counts verified with tiktoken cl100k_base (the GPT-4 tokenizer).
Overall (10 datasets, 55,647 tokens): Tauq saves 44.2% vs JSON, 10.8% vs TOON. See benchmarks/ for full results.
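The percentages in the 1000-record table follow directly from the token counts. A minimal Python sketch of that arithmetic (the helper name `savings_percent` is illustrative and not part of Tauq):

```python
def savings_percent(baseline_tokens: int, format_tokens: int) -> float:
    """Percentage of tokens saved relative to a baseline, rounded to 0.1."""
    return round((1 - format_tokens / baseline_tokens) * 100, 1)


# Token counts copied from the 1000-record benchmark table.
JSON_TOKENS, TOON_TOKENS, TAUQ_TOKENS = 24_005, 12_002, 11_012

print("TOON vs JSON:", savings_percent(JSON_TOKENS, TOON_TOKENS))  # 50.0
print("Tauq vs JSON:", savings_percent(JSON_TOKENS, TAUQ_TOKENS))  # 54.1
```

The same function applied to per-dataset counts from benchmarks/ should reproduce the overall figures, assuming those are computed from summed token counts.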
Quick Example
JSON:
[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
Tauq:
!def User id name
1 Alice
2 Bob
Features
Token-Optimal
- 44-54% fewer tokens than JSON (verified benchmarks)
- 11% more efficient than TOON overall
- Space delimiters tokenize better than commas
True Streaming
- StreamingParser iterator API
- Process records one at a time
- No count required (unlike TOON's [N])
Schema-Driven
- Define data shapes with !def
- Switch schemas with !use
- Nested types and typed arrays
🔧 Programmable
- Tauq Query for data transformations
- Unix pipe model
- Polyglot support (Python, Rhai, JavaScript)
🛠️ Production-Ready CLI
- tauq build - Parse to JSON
- tauq format - JSON → Tauq
- tauq minify - Compress to one line
- tauq exec - Run Tauq Query pipelines
- tauq validate - Check syntax
Quick Start
Installation
# Install the tauq package
Language Bindings
Tauq is available for your favorite languages:
- Python: pip install tauq
- JavaScript: npm install tauq
- Go: go get github.com/epistates/tauq
- Rust: Add tauq = "0.1" to your Cargo.toml
Hello World
Create config.tqn:
app_name "MyService"
version "1.0.0"
port 8080
debug true
features [api websockets metrics]
Parse to JSON:
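{
  "app_name": "MyService",
  "version": "1.0.0",
  "port": 8080,
  "debug": true,
  "features": ["api", "websockets", "metrics"]
}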
Syntax Guide
Simple Values
name "Alice"
age 30
active true
score 99.5
missing null
role admin # Barewords don't need quotes
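To make the value rules above concrete, here is a minimal Python sketch of how a single token might be mapped to a typed value. It is an illustration under assumptions, not the Tauq parser: `parse_scalar` is a hypothetical helper, escape sequences inside quoted strings are ignored, and only plain decimal numbers are recognised.

```python
import re

INT_RE = re.compile(r"-?\d+")
FLOAT_RE = re.compile(r"-?\d+\.\d+")
KEYWORDS = {"true": True, "false": False, "null": None}


def parse_scalar(token: str):
    """Map one whitespace-delimited token to a Python value (illustrative only)."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return token[1:-1]          # quoted string, escapes not handled
    if token in KEYWORDS:
        return KEYWORDS[token]      # true / false / null
    if INT_RE.fullmatch(token):
        return int(token)
    if FLOAT_RE.fullmatch(token):
        return float(token)
    return token                    # bareword, treated as a string


for tok in ['"Alice"', "30", "true", "99.5", "null", "admin"]:
    print(tok, "->", repr(parse_scalar(tok)))
```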
Arrays
tags [web api backend]
ids [1 2 3 4 5]
mixed [1 "two" true null]
Tabular Data (The Killer Feature)
!def User id name email role
1 Alice "alice@example.com" admin
2 Bob "bob@example.com" user
3 Carol "carol@example.com" user
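As a rough illustration of how a !def header turns positional rows into records, here is a minimal Python sketch. It is not the Tauq parser: `iter_rows` is a hypothetical helper, it relies on Python's shlex for quoted strings, converts only non-negative integers, and ignores !use, nesting, and comments. It is written as a generator, so rows are produced one at a time without knowing the row count in advance.

```python
import shlex
from typing import Iterable, Iterator


def iter_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per data row, keyed by the most recent !def field list."""
    fields = []
    for line in lines:
        tokens = shlex.split(line)      # splits on whitespace, honours "quotes"
        if not tokens:
            continue                    # skip blank lines
        if tokens[0] == "!def":
            fields = tokens[2:]         # tokens[1] is the schema name
            continue
        values = [int(t) if t.isdigit() else t for t in tokens]
        yield dict(zip(fields, values))


doc = [
    "!def User id name email role",
    '1 Alice "alice@example.com" admin',
    '2 Bob "bob@example.com" user',
    '3 Carol "carol@example.com" user',
]
for row in iter_rows(doc):
    print(row)
```

The field names appear once in the header instead of once per record, which is where the saving over JSON for uniform data comes from.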
Schema Block
Define schemas upfront with --- to separate from data:
!def User id name role
---
users [
!use User
1 Alice admin
2 Bob user
]
The --- separator clears the implicit schema scope, so what follows is parsed as structured key-value data; inside an array, !use selects which schema the rows follow.
Nested Types
!def Address street city
!def User id name addr:Address
1 Alice { "123 Main" "NYC" }
2 Bob { "456 Oak" "LA" }
Lists of Objects
!def Employee name role
!def Department name budget employees:[Employee]
Engineering 1000000 [
Alice "Principal Engineer"
Bob "Senior Engineer"
]
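The nested forms above (a field typed as another schema, such as addr:Address, and a field typed as a list of a schema, such as employees:[Employee]) can be read as a recursive binding of positional values to field names. The Python sketch below shows that idea on values that are already tokenized. The `SCHEMAS` table and `bind` helper are hypothetical, and the resulting object shape is an assumption about how Tauq maps to JSON.

```python
# Field lists copied from the !def lines in the two sections above.
SCHEMAS = {
    "Address": ["street", "city"],
    "User": ["id", "name", "addr:Address"],
    "Employee": ["name", "role"],
    "Department": ["name", "budget", "employees:[Employee]"],
}


def bind(schema: str, values: list) -> dict:
    """Pair positional values with field names, recursing into typed fields."""
    record = {}
    for field, value in zip(SCHEMAS[schema], values):
        name, _, type_ref = field.partition(":")
        if type_ref.startswith("[") and type_ref.endswith("]"):
            # name:[Type] -> a list of nested records
            value = [bind(type_ref[1:-1], item) for item in value]
        elif type_ref:
            # name:Type -> a single nested record
            value = bind(type_ref, value)
        record[name] = value
    return record


print(bind("User", [1, "Alice", ["123 Main", "NYC"]]))
print(bind("Department", ["Engineering", 1000000,
                          [["Alice", "Principal Engineer"],
                           ["Bob", "Senior Engineer"]]]))
```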
Minified Syntax
!def U id name; 1 Alice; 2 Bob
All on one line for maximum compression!
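Judging from the example, a semicolon in minified form plays the role of a line break. A minimal Python sketch of splitting a minified document back into statements, under that assumption (`expand_minified` is a hypothetical helper; semicolons inside double quotes are preserved, escaped quotes are not handled):

```python
def expand_minified(text: str) -> list:
    """Split minified Tauq on ';' outside double quotes (illustrative only)."""
    statements, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == ";" and not in_quotes:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    statements.append("".join(current).strip())
    return [s for s in statements if s]   # drop empty statements


for line in expand_minified("!def U id name; 1 Alice; 2 Bob"):
    print(line)
```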
Examples
The examples/ directory covers:
- Basics: Simple configuration and primitive types.
- Schemas: Typed schemas and nested types.
- Modularity: Multi-file imports and modular configurations.
- Real World: Production configurations like Kubernetes deployments.
- Queries: ETL pipelines and data generation with Tauq Query.
- Minified: Compact single-line syntax examples.
CLI Usage
Build: Tauq → JSON
# To stdout
# To file with pretty formatting
# From stdin
|
Format: JSON → Tauq
The formatter intelligently detects arrays of uniform objects and creates schemas automatically:
# Convert JSON to Tauq (auto-generates schemas for nested arrays)
# From stdin
|
# Output:
# !def User id name
# ---
# users [
# !use User
# 1 Alice
# 2 Bob
# ]
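The schema detection described above can be approximated as follows: an array qualifies when it is non-empty and every element is an object with the same keys in the same order. This Python sketch is not the tauq formatter. The helper names, the schema-naming rule, and the unindented layout are all assumptions, and nested objects are not handled.

```python
import json
import re

BAREWORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render(value) -> str:
    """Render a scalar or flat list; simple strings become barewords."""
    if isinstance(value, list):
        return "[" + " ".join(render(v) for v in value) + "]"
    if (isinstance(value, str) and BAREWORD.fullmatch(value)
            and value not in ("true", "false", "null")):
        return value
    return json.dumps(value)  # quotes strings, renders true/false/null


def is_uniform(items: list) -> bool:
    """True when every item is a dict with identical keys in identical order."""
    return (bool(items) and all(isinstance(i, dict) for i in items)
            and all(list(i) == list(items[0]) for i in items))


def to_tauq(doc: dict) -> str:
    defs, body = [], []
    for key, value in doc.items():
        if isinstance(value, list) and is_uniform(value):
            schema = key.rstrip("s").capitalize()  # naive: "users" -> "User"
            defs.append(f"!def {schema} " + " ".join(value[0]))
            body += [f"{key} [", f"!use {schema}"]
            body += [" ".join(render(v) for v in item.values()) for item in value]
            body.append("]")
        else:
            body.append(f"{key} {render(value)}")
    return "\n".join(defs + ["---"] + body) if defs else "\n".join(body)


print(to_tauq({"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}))
```

For the two-user input, this prints the same lines as the commented output above.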
Execute Tauq Query
# Run data transformations
# Run in SAFE MODE (disable shell execution)
Minify
# Compress to single line
Contributing
Tauq is in active development. Contributions welcome!
Areas of interest:
- Parser optimizations
- Error message improvements
- Language bindings (Python, JS, Go)
- Documentation
- Real-world use cases
License
MIT
Tauq (τq) - Stop wasting tokens on JSON. Start using the future. 🚀