ainl-compression 0.1.0-alpha

Embedding-free input/output compression primitives for AINL hosts
Standalone prompt compression primitives for AINL hosts and external Rust agents.

Why this crate exists

  • Reusable outside ArmaraOS / OpenFang (cargo add ainl-compression)
  • Minimal dependency surface
  • Clear AINL ownership and attribution

Current scope

  • Input prompt compression (PromptCompressor)
  • Eco modes:
    • Off
    • Balanced
    • Aggressive
  • Natural-language mode parsing (EfficientMode::parse_natural_language)
  • Structured telemetry (CompressionMetrics)

Output/dense response compression is intentionally out of scope for now.
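To illustrate the natural-language mode parsing listed above, here is a hypothetical, self-contained sketch. The enum variants (Off, Balanced, Aggressive) and the method name come from this README; the matching rules are invented for illustration and are not the crate's actual implementation:

```rust
// Illustrative stand-in for ainl_compression's EfficientMode.
// The keyword matching below is an assumption, not the real parser.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EfficientMode {
    Off,
    Balanced,
    Aggressive,
}

impl EfficientMode {
    /// Map a free-form phrase onto an eco mode, if one is recognizable.
    fn parse_natural_language(input: &str) -> Option<EfficientMode> {
        let s = input.to_lowercase();
        if s.contains("off") || s.contains("disable") {
            Some(EfficientMode::Off)
        } else if s.contains("aggressive") || s.contains("max") {
            Some(EfficientMode::Aggressive)
        } else if s.contains("balanced") || s.contains("eco") {
            Some(EfficientMode::Balanced)
        } else {
            None
        }
    }
}

fn main() {
    assert_eq!(
        EfficientMode::parse_natural_language("use aggressive compression"),
        Some(EfficientMode::Aggressive)
    );
    assert_eq!(
        EfficientMode::parse_natural_language("turn it off"),
        Some(EfficientMode::Off)
    );
    println!("ok");
}
```

Returning an Option keeps unrecognized phrases explicit, so a host can fall back to a default mode rather than guessing.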

Basic usage

use ainl_compression::{EfficientMode, PromptCompressor};

let compressor = PromptCompressor::new(EfficientMode::Balanced);
let compressed = compressor.compress("Please summarize this long message...");
println!("compressed text: {}", compressed.text);

Telemetry callback

use ainl_compression::{EfficientMode, PromptCompressor};

let compressor = PromptCompressor::with_telemetry_callback(
    EfficientMode::Balanced,
    Some(Box::new(|m| {
        println!(
            "mode={:?} saved={} ({:.1}%)",
            m.mode, m.tokens_saved, m.savings_ratio_pct
        );
    })),
);

let _ = compressor.compress("Long prompt...");
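The callback above reads mode, tokens_saved, and savings_ratio_pct off the metrics value. A self-contained sketch of how such a struct and its savings ratio might be derived (the field names come from the example above; the naive whitespace token counting is an assumption, not the crate's tokenizer):

```rust
// Hypothetical CompressionMetrics-style struct; fields mirror the callback
// example, but this is not the crate's actual definition.
#[derive(Debug, Clone)]
struct CompressionMetrics {
    mode: String,
    tokens_saved: usize,
    savings_ratio_pct: f64,
}

/// Build metrics from original and compressed text.
/// Token counting here is a naive whitespace split, for illustration only.
fn metrics_for(mode: &str, original: &str, compressed: &str) -> CompressionMetrics {
    let before = original.split_whitespace().count();
    let after = compressed.split_whitespace().count();
    let saved = before.saturating_sub(after);
    let pct = if before == 0 {
        0.0
    } else {
        saved as f64 / before as f64 * 100.0
    };
    CompressionMetrics {
        mode: mode.to_string(),
        tokens_saved: saved,
        savings_ratio_pct: pct,
    }
}

fn main() {
    let m = metrics_for("Balanced", "please kindly summarize this very long message", "summarize long message");
    // Same format string as the telemetry callback example above.
    println!("mode={} saved={} ({:.1}%)", m.mode, m.tokens_saved, m.savings_ratio_pct);
}
```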

Optional feature: graph-telemetry

Enable graph-telemetry when your host wants to serialize telemetry structures for graph/event pipelines:

ainl-compression = { version = "0.1.0-alpha", features = ["graph-telemetry"] }

This adds serde derives for shared telemetry structs without coupling this crate to any specific graph/memory runtime implementation.
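The usual way to express this pattern is a cfg_attr-gated derive: serde is only referenced when the feature is enabled, so the default build carries no serde dependency. A sketch under that assumption (the struct and field names here are hypothetical, not the crate's shared telemetry types):

```rust
// Hypothetical feature-gated telemetry struct. With the graph-telemetry
// feature off, the cfg_attr line evaluates away and serde is never needed.
#[cfg_attr(feature = "graph-telemetry", derive(serde::Serialize))]
#[derive(Debug, Clone)]
pub struct TelemetryEvent {
    pub tokens_saved: usize,
    pub savings_ratio_pct: f64,
}

fn main() {
    // Compiles and debug-prints normally even without the feature enabled.
    let e = TelemetryEvent {
        tokens_saved: 12,
        savings_ratio_pct: 8.5,
    };
    println!("{:?}", e);
}
```

This is what lets the crate stay decoupled from any graph/memory runtime: downstream hosts opt in to serialization, and everyone else pays nothing for it.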

ArmaraOS integration model

  • This crate stays runtime-agnostic.
  • ArmaraOS/OpenFang can:
    • persist aggregate metrics to openfang-memory
    • attach turn-level telemetry into episodic trace metadata in graph memory

That keeps the crate externally reusable while still advancing unified graph execution tracing inside ArmaraOS.