tinyjuice 0.2.0

Pluggable token compression for OpenHuman.
Documentation

TinyJuice is a Rust token-compression engine for agent context. It gives OpenHuman and other Rust hosts a small, inspectable boundary for shrinking large tool outputs before they enter a model context, while keeping exact originals recoverable when a lossy view is shown.

Most agent systems pay the same context tax over and over: 5,000-line logs, huge JSON lists, repeated grep output, lockfile diffs, rendered HTML, and full source files all land in the model as raw text. TinyJuice routes those blobs by content kind, applies a deterministic compressor tuned to the signal in that kind, and reports what changed.

The result is not a magic "make prompts smaller" black box. It is a pluggable, auditable compression layer with conservative pass-through behavior, recovery markers for partial views, and host-owned policy for cost attribution.

Why TinyJuice

  • Content-aware by default - JSON, code, logs, search results, diffs, HTML, and plain text take different paths instead of one generic truncation rule.
  • Recoverable lossy views - the CCR cache stores exact originals and appends a tokenjuice_retrieve footer whenever data is dropped.
  • Agent-profile policy - hosts can run full, light, off, or runtime auto profiles per agent instead of using one global behavior.
  • Command-aware reduction - built-in rules compact common shell, git, cargo, npm, docker, kubectl, database, cloud, lint, and test outputs.
  • OpenHuman-ready boundary - the core crate avoids OpenHuman runtime dependencies; adapters install configuration, ML callbacks, and savings recorders from the host side.
  • No raw-content analytics requirement - the dashboard consumes metadata, token and byte counts, latency, status, and strategy labels, not prompt text.

TinyJuice is designed for the work agents actually do: reading too much, searching broadly, running noisy commands, and needing a compact but reversible view that keeps failures, anomalies, changed hunks, signatures, and matching lines visible.

How It Works

tool output / file / web payload
        |
        v
ContentHint + structural detection
        |
        v
JSON | Code | Log | Search | Diff | HTML | PlainText
        |
        v
specialized compressor or command-rule reducer
        |
        v
pass-through if unsafe, too small, disabled, or not smaller
        |
        v
CCR offload + retrieval footer when the view is lossy

The router is intentionally fail-soft. If it cannot shrink safely, it returns the original bytes unchanged.

Compression Surfaces

  • JSON SmartCrusher - renders repeated object arrays as compact tables and keeps anomaly rows when large arrays are row-dropped.
  • Code compressor - keeps imports, signatures, shallow structure, and important markers while collapsing deep bodies.
  • Log compressor - preserves failures, warnings, summaries, stack traces, and command-rule outputs while dropping passing noise.
  • Search compressor - groups grep/ripgrep output by file, ranks matches, and keeps top hits with per-file tallies.
  • Diff compressor - keeps patch structure and changed lines, collapses long context and noisy lockfile/bundle hunks.
  • HTML compressor - extracts readable text from rendered markup.
  • Plain-text ML slot - optional host-provided callback for learned text compression; disabled by default.
  • Generic command fallback - line-oriented head/tail reduction for command output when no specialized rule wins.

TinyJuice does not publish compression percentage claims yet. Throughput benchmarks exist for hot paths, but ratio and quality claims require benchmark fixtures that prove retained facts, latency, reversibility, and regression safety.

Quick Start

Add TinyJuice to a Rust project once published:

[dependencies]
tinyjuice = "0.2"

Use the small public trait scaffold when you want a simple strategy boundary:

use tinyjuice::{CompressionConfig, CompressionInput, Compressor, PassthroughCompressor};

fn main() -> Result<(), tinyjuice::TinyJuiceError> {
    let compressor = PassthroughCompressor;
    let output = compressor.compress(
        CompressionInput::new("Keep this text unchanged for now."),
        &CompressionConfig::default(),
    )?;

    assert_eq!(output.report.strategy, "passthrough");
    Ok(())
}

Use the content router for real tool-output compaction:

use tinyjuice::{CompressOptions, ContentHint, compress_content};

async fn compact_payload(big_payload: &str) {
    let hint = ContentHint {
        source_tool: Some("read_file".to_string()),
        extension: Some("json".to_string()),
        ..Default::default()
    };

    let result = compress_content(big_payload, Some(hint), &CompressOptions::default()).await;
    if result.applied {
        println!("{} -> {} bytes", result.original_bytes, result.compacted_bytes);
    }
}

OpenHuman-style tool output integration goes through:

use tinyjuice::{AgentTokenjuiceCompression, compact_tool_output_with_policy};

async fn compact_command_output(command_output: &str) {
    let (_text, _stats) = compact_tool_output_with_policy(
        "shell",
        Some(&serde_json::json!({ "command": "cargo test" })),
        command_output,
        Some(101),
        AgentTokenjuiceCompression::Full,
    ).await;
}

SDK and Plugin Integration

TinyJuice now exposes two integration paths:

  • Rust hosts use the crate SDK directly.
  • Non-Rust plugins and harnesses call the tinyjuice reduce-json protocol.

The SDK accepts a host-neutral ToolExecutionInput with tool name, command, argv, stdout/stderr or combined text, exit code, cwd, and metadata. The response contains the inline text plus metadata about the applied content kind, compressor, token estimate, byte counts, and CCR recovery token when one was created. Do not log the request body from adapters; tool output may contain prompts, credentials, or private context.

Rust SDK

use tinyjuice::{
    AgentTokenjuiceCompression, TinyJuiceHost, TinyJuiceSdk, ToolExecutionInput,
};

async fn compact_for_harness(tool_output: String, exit_code: i32) {
    let sdk = TinyJuiceSdk::new(TinyJuiceHost::RustHarness)
        .with_profile(AgentTokenjuiceCompression::Full);

    let response = sdk
        .compress_tool_output(ToolExecutionInput {
            tool_name: "shell".to_string(),
            command: Some("cargo test".to_string()),
            argv: Some(vec!["cargo".to_string(), "test".to_string()]),
            combined_text: Some(tool_output),
            exit_code: Some(exit_code),
            ..Default::default()
        })
        .await;

    println!("{}", response.inline_text);
}

Use TinyJuiceHost::OpenHuman for OpenHuman adapters and TinyJuiceHost::RustHarness for standalone Rust harnesses. Hosts should map their own config into CompressOptions, choose the per-agent profile, and expose CCR retrieval before enabling lossy compaction in production.

JSON Protocol

Build the binary locally:

cargo build --release --bin tinyjuice

Send a full SDK request:

{
  "host": "generic-json",
  "profile": "full",
  "input": {
    "toolName": "shell",
    "command": "cargo test",
    "argv": ["cargo", "test"],
    "combinedText": "large tool output...",
    "exitCode": 0,
    "metadata": {
      "source": "custom-harness"
    }
  },
  "options": {
    "minBytesToCompress": 512,
    "maxInlineChars": 1200,
    "ccrEnabled": true
  }
}

Run it through the protocol:

tinyjuice reduce-json payload.json
cat payload.json | tinyjuice reduce-json --host generic-json -

A bare ToolExecutionInput object is also accepted when the host, profile, and options can stay at defaults.

Codex and Claude Code Hooks

Build or install a tinyjuice binary, then merge a hook into the host config:

tinyjuice install codex
tinyjuice install claude-code

The Codex installer updates ~/.codex/hooks.json with a PostToolUse hook for Bash tool output. When TinyJuice compacts a large result, it emits hookSpecificOutput.additionalContext, matching Codex's hook output model.

The Claude Code installer updates ~/.claude/settings.json with a PostToolUse hook for Bash tool output. When TinyJuice compacts a large result, it emits hookSpecificOutput.updatedToolOutput, so Claude Code sees the compacted tool result rather than the noisy original.

Both installers:

  • preserve existing hooks and settings
  • replace an older TinyJuice hook for the same host
  • write a .bak file next to the edited JSON file
  • expect tinyjuice to be on PATH unless --binary is supplied

Examples:

tinyjuice install codex --binary /usr/local/bin/tinyjuice
tinyjuice install claude-code --path ~/.claude/settings.json

The raw hook entrypoints are also available for custom installers:

tinyjuice codex-post-tool-use
tinyjuice claude-code-post-tool-use

Hook invocations use a disk-backed CCR store so recovery tokens survive the short-lived hook process. By default the store lives under the user's cache directory at tinyjuice/ccr; override it with:

export TINYJUICE_CCR_DIR=/path/to/tinyjuice-ccr

Recover a full original from a hook footer:

tinyjuice retrieve <token>

Useful hook tuning variables:

export TINYJUICE_MIN_BYTES_TO_COMPRESS=2048
export TINYJUICE_MAX_INLINE_CHARS=1200
export TINYJUICE_CCR_MIN_TOKENS=500
export TINYJUICE_CCR_ENABLED=true

Templates remain available for inspection or custom packaging:

tinyjuice hosts
tinyjuice host-template codex
tinyjuice host-template claude-code
tinyjuice host-template generic-json

Local Development

cargo fmt --check
cargo clippy --all-targets -- -D warnings
cargo test
cargo run --example passthrough
cargo run --bin tinyjuice -- hosts
cargo run --bin tinyjuice -- host-template codex

Run hot-path benchmarks:

cargo bench

Run fixture-driven compression benchmarks:

cargo run --release --example compression_benchmark -- --iterations 20
cargo run --release --example compression_benchmark -- --iterations 20 --format json

Fixture benchmark snapshot from cargo run --release --example compression_benchmark -- --iterations 20:

Use case Compressor Est. token reduction Avg latency CCR recovery
JSON service inventory SmartCrusher 94.9% 0.397 ms yes
Cargo test failure log Log 93.6% 0.667 ms yes
Docker service log Log 99.8% 1.110 ms yes
Ripgrep search results Search 75.3% 0.034 ms yes
Unified diff Diff 84.3% 0.008 ms yes
HTML status report HTML 61.2% 0.063 ms yes
Rust source file Code 88.6% 0.199 ms yes
Plain text with ML off None 0.0% 0.000 ms n/a

CCR recovery byte-compares the retrieved original for every lossy compaction. These numbers are generated-fixture measurements, not production corpus claims.

See docs/benchmarking.md for benchmark scope, comparison targets, and reporting cautions. See docs/benchmark for human-readable before/after sample reports and accuracy-check details.

Run the local analytics interface:

cd interface
npm install
npm run dev

The interface accepts metadata-oriented compression records. Do not feed raw prompt, context, tool output, or credentials into analytics datasets.

Crate Layout

src/
  compress.rs        Universal content router
  compressors/       JSON, code, log, search, diff, HTML, ML, generic paths
  detect/            Content-kind hints and structural detection
  cache/             CCR offload, retrieval markers, memory/disk store
  rules/             Built-in + user + project command reduction rules
  reduce.rs          Rule-engine reduction pipeline
  sdk.rs             Host-neutral SDK and reduce-json request/response types
  tool_integration.rs OpenHuman-style tool-output adapter
  compressor/        Small public Compressor trait scaffold
  config/            Small public CompressionConfig scaffold
  openhuman/         Runtime-neutral OpenHuman adapter types
  savings.rs         Host-installed savings attribution hook
interface/           Self-hostable analytics UI
wiki/                Technical GitHub wiki source
docs/references/     Design references and candidate strategy specs

Documentation

Status

TinyJuice is pre-1.0. The router, command-rule engine, CCR recovery store, content detectors, several native compressors, the OpenHuman-style tool adapter, and the analytics interface are implemented. Public API names may still move as OpenHuman integration hardens.

The project boundary is deliberate: keep the core crate small, do not add OpenHuman runtime dependencies without a feature or adapter boundary, and do not claim compression percentages until benchmark fixtures exist.