tinyjuice 0.2.1

Pluggable token compression for OpenHuman.

TinyJuice is token compression for terminal-heavy agents. It sits between tool output and model context, turning noisy logs, diffs, JSON, search results, HTML, and source files into compact views that keep the signal visible.

Agents waste context on the same junk over and over: passing test chatter, duplicated JSON keys, huge Docker logs, repetitive grep hits, lockfile diffs, and markup nobody needs to reason about. TinyJuice cuts that noise before it hits the model.

The important part: compacted output stays recoverable. When TinyJuice shows a partial view, it stores the exact original behind a retrieval token instead of silently throwing data away.
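The recoverable-view idea can be sketched as follows. This is a minimal illustration, not TinyJuice's actual store: `RecoveryStore`, `compact_head`, and the `tj-N` token format are hypothetical names invented for this example, and the truncation rule (keep the first few lines) stands in for whatever reducer really runs.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory recovery store. TinyJuice's real store and token
/// format are not described in this README.
pub struct RecoveryStore {
    next_id: u64,
    originals: HashMap<String, String>,
}

impl RecoveryStore {
    pub fn new() -> Self {
        Self { next_id: 0, originals: HashMap::new() }
    }

    /// Keep the exact original and hand back a token that can retrieve it.
    pub fn stash(&mut self, original: &str) -> String {
        let token = format!("tj-{}", self.next_id); // assumed token shape
        self.next_id += 1;
        self.originals.insert(token.clone(), original.to_string());
        token
    }

    /// Return the exact original for a token, if one was stored.
    pub fn retrieve(&self, token: &str) -> Option<&str> {
        self.originals.get(token).map(String::as_str)
    }
}

/// Show the first `keep` lines. If anything was cut, store the original
/// and append a marker line carrying the retrieval token.
pub fn compact_head(store: &mut RecoveryStore, text: &str, keep: usize) -> String {
    let total = text.lines().count();
    if total <= keep {
        return text.to_string(); // nothing dropped, so no token is needed
    }
    let token = store.stash(text);
    let mut view: Vec<String> = text.lines().take(keep).map(str::to_string).collect();
    view.push(format!("[{} lines omitted, retrieve with {}]", total - keep, token));
    view.join("\n")
}
```

The point of the design is that a partial view is never the only copy: the marker line tells the agent both that data was withheld and how to ask for it.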

Quick Setup

Install the CLI:

cargo install tinyjuice --locked

Run one hook installer:

Client       Command
Codex CLI    tinyjuice install codex
Claude Code  tinyjuice install claude-code

Custom paths, development installs, recovery, and tuning live in docs/agent-hooks/README.md.

Why It Helps

  • More useful context - failures, summaries, changed hunks, matching lines, signatures, and anomalies stay visible.
  • Less transcript waste - repeated structure, boilerplate, setup chatter, and markup get collapsed.
  • Recoverable partial views - exact originals can be pulled back when a compact view is not enough.
  • Agent-ready defaults - command-aware reducers understand common shell, git, cargo, npm, Docker, kubectl, database, cloud, lint, and test output.
  • Host-owned policy - OpenHuman and other runtimes decide when compression is full, light, off, or profile-driven.
  • Privacy-aware by design - analytics can use metadata, byte counts, latency, status, and strategy labels without requiring raw prompt text.

What It Compresses

Surface         What stays visible
JSON            Tables, schema shape, anomaly rows
Logs            Errors, warnings, stack traces, summaries
Search results  Top matches, file grouping, match counts
Diffs           File headers, hunk headers, changed lines
Code            Imports, signatures, top-level structure
HTML            Readable page text without script and markup noise
Plain text      Pass-through unless a host enables an ML callback
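As one illustration of the table above, the diff row (file headers, hunk headers, changed lines) could be approximated with a simple line filter. This is a sketch under assumptions: `compact_diff` is a hypothetical function, not TinyJuice's native diff compressor, and it relies only on the standard unified diff line prefixes.

```rust
/// Keep only file headers, hunk headers, and changed lines from a unified
/// diff. Returns the compact view and the number of lines dropped
/// (unchanged context lines and any other metadata).
pub fn compact_diff(diff: &str) -> (String, usize) {
    let mut kept = Vec::new();
    let mut dropped = 0;
    for line in diff.lines() {
        let keep = line.starts_with("diff ") // git-style file header
            || line.starts_with("--- ")      // old file path
            || line.starts_with("+++ ")      // new file path
            || line.starts_with("@@")        // hunk header
            || line.starts_with('+')         // added line
            || line.starts_with('-');        // removed line
        if keep {
            kept.push(line);
        } else {
            dropped += 1;
        }
    }
    (kept.join("\n"), dropped)
}
```

A real reducer would also need to pair a lossy view like this with a retrieval token so the dropped context lines stay recoverable.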

Benchmark Snapshot

The checked-in benchmark corpus uses 10 real snapshots per category and verifies inline accuracy plus CCR recovery for lossy compactions.

Category                       Cases  Avg est. token reduction  Avg latency
Service and Docker logs        10     86.3%                     0.140 ms
HTML, RSS, and page snapshots  10     75.3%                     0.164 ms
Unified diffs                  10     71.2%                     0.143 ms
JSON SmartCrusher              10     58.0%                     0.429 ms
Rust source                    10     51.9%                     0.698 ms
Search results                 10     44.8%                     0.320 ms
Test failure logs              10     14.1%                     0.034 ms
Plain text with ML off         10     0.0%                      0.000 ms

These are local real-snapshot corpus measurements, not production-wide claims. See docs/benchmark and docs/benchmarking.md for the reproducible reports.
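For readers who want to sanity-check a figure like "avg est. token reduction" on their own inputs, the arithmetic can be sketched as below. The four-bytes-per-token estimate is an assumption made for this example; the README does not say which estimator the benchmark harness uses, and all three function names are hypothetical.

```rust
/// Rough token estimate: about four bytes per token, rounded up.
/// This heuristic is an assumption, not the benchmark's documented estimator.
pub fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Percentage of estimated tokens removed by compaction.
pub fn reduction_percent(original: &str, compact: &str) -> f64 {
    let before = estimate_tokens(original);
    if before == 0 {
        return 0.0; // empty input: nothing to reduce
    }
    let after = estimate_tokens(compact);
    100.0 * (before as f64 - after as f64) / before as f64
}

/// Mean of per-case reductions, as a per-category average would be computed.
pub fn average(values: &[f64]) -> f64 {
    if values.is_empty() {
        0.0
    } else {
        values.iter().sum::<f64>() / values.len() as f64
    }
}
```

Under this definition a pass-through case, where the compact view equals the original, scores 0.0%, which is consistent with the "Plain text with ML off" row.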

For Developers

The technical docs live in the wiki.

TinyJuice is pre-1.0. The CLI, router, command-rule engine, CCR recovery store, content detectors, native compressors, and OpenHuman-style adapter are in place; public API names may still move as host integration hardens.