chatpack 0.2.4

Compress chat exports from Telegram, WhatsApp, and Instagram into token-efficient CSV for LLMs

📦 chatpack

Feed your chat history to LLMs. Compress exports 13x with CSV format.


Platforms: Windows • macOS • Linux

The Problem

You want to ask Claude/ChatGPT about your conversations, but:

  • Raw exports are 80% metadata noise
  • JSON structure wastes tokens on brackets and keys
  • Context windows are expensive
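To make the bracket-and-key overhead concrete, here is a minimal sketch that renders one message both ways and compares sizes. The `Msg` struct, the JSON field names, and the comma delimiter are illustrative assumptions, not chatpack's actual schema, and byte length is only a rough proxy for token count.

```rust
// Hypothetical message record; field names are illustrative, not chatpack's.
struct Msg<'a> {
    sender: &'a str,
    text: &'a str,
}

/// Render one message the way a verbose JSON export might (simplified, no escaping).
fn as_json(m: &Msg) -> String {
    format!(
        "{{\"id\": 1, \"type\": \"message\", \"from\": \"{}\", \"text\": \"{}\"}}",
        m.sender, m.text
    )
}

/// Render the same message as a two-column CSV row (simplified, no quoting).
fn as_csv(m: &Msg) -> String {
    format!("{},{}", m.sender, m.text)
}

fn main() {
    let m = Msg { sender: "Alice", text: "See you at 5" };
    let (json, csv) = (as_json(&m), as_csv(&m));
    // The keys and punctuation repeat for every message in a JSON export.
    println!("JSON: {} bytes, CSV: {} bytes", json.len(), csv.len());
}
```

The keys repeat on every message, so the overhead grows linearly with chat length, while a CSV header is paid once.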

The Solution

┌─────────────────┐     ┌──────────┐     ┌──────────────────┐
│ Telegram JSON   │     │          │     │ Clean CSV        │
│ WhatsApp TXT    │ ──▶ │ chatpack │ ──▶ │ Ready for LLM    │
│ Instagram JSON  │     │          │     │ 13x fewer tokens │
└─────────────────┘     └──────────┘     └──────────────────┘

Real Numbers

Format   Input (Telegram JSON)   Output         Savings
CSV      11.2M tokens            850K tokens    92% (13x) 🔥
JSONL    11.2M tokens            1.0M tokens    91% (11x)
JSON     11.2M tokens            1.3M tokens    88% (8x)

💡 Use CSV for maximum token savings. JSONL is good for RAG pipelines. JSON keeps full structure but wastes tokens.
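The savings column follows from the token counts: savings is (1 - output / input) and the compression factor is input / output. A short sketch of that arithmetic, using only the figures from the table (the function names are illustrative):

```rust
/// Percentage of tokens removed: (1 - output / input) * 100.
fn savings_percent(input: f64, output: f64) -> f64 {
    (1.0 - output / input) * 100.0
}

/// Compression factor: input / output.
fn compression_factor(input: f64, output: f64) -> f64 {
    input / output
}

fn main() {
    let input = 11_200_000.0; // Telegram JSON input, from the table
    let outputs = [("CSV", 850_000.0), ("JSONL", 1_000_000.0), ("JSON", 1_300_000.0)];
    for (format, output) in outputs {
        println!(
            "{}: {:.1}% saved, {:.1}x",
            format,
            savings_percent(input, output),
            compression_factor(input, output)
        );
    }
}
```

For CSV this gives roughly 92.4% and 13.2x, in line with the rounded 92% (13x) reported above.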

Use Cases

💬 Chat with your chat history

chatpack tg telegram_export.json -o context.txt

# Paste into ChatGPT: "Based on this conversation, what did we decide about...?"

🔍 Build RAG pipeline

chatpack tg chat.json -f jsonl -t -o dataset.jsonl

# Each line = one document with timestamp for vector DB
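JSONL means one self-contained JSON object per line, so embedded newlines must be escaped rather than written literally. The sketch below shows that idea with the standard library only; the field names `timestamp`, `sender`, and `content` are assumptions for illustration and may not match chatpack's actual output keys.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one JSONL record; the key names are hypothetical.
fn to_jsonl_line(timestamp: &str, sender: &str, content: &str) -> String {
    format!(
        "{{\"timestamp\":\"{}\",\"sender\":\"{}\",\"content\":\"{}\"}}",
        json_escape(timestamp),
        json_escape(sender),
        json_escape(content)
    )
}

fn main() {
    // A multiline message still occupies exactly one output line.
    let line = to_jsonl_line("2024-01-01T10:00:00", "Alice", "line one\nline two");
    println!("{}", line);
}
```

Because each record stays on a single line, a vector-DB loader can split the file on newlines and index every line as one document.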

📊 Analyze conversations

chatpack wa chat.txt --from "Alice" --after 2024-01-01 -f json

# Filter and export specific messages

Features

  • 🚀 Fast: 20K+ messages/sec
  • 📱 Multi-platform: Telegram, WhatsApp, Instagram
  • 🔀 Smart merge: consecutive messages from the same sender → one entry
  • 🎯 Filters: by date, by sender
  • 📄 Formats: CSV (13x compression), JSON, JSONL (for RAG)
  • 📚 Library: use as a Rust crate in your projects
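The smart-merge idea can be sketched in a few lines: walk the messages in order and append to the previous entry whenever the sender repeats. This is an illustration under assumptions, not chatpack's implementation; the `Msg` struct, the `merge_runs` name, and the newline separator are all hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    sender: String,
    text: String,
}

/// Merge runs of consecutive messages from the same sender into one entry,
/// joining their texts with a newline (the separator is an assumption).
fn merge_runs(messages: Vec<Msg>) -> Vec<Msg> {
    let mut merged: Vec<Msg> = Vec::new();
    for msg in messages {
        if let Some(prev) = merged.last_mut() {
            if prev.sender == msg.sender {
                prev.text.push('\n');
                prev.text.push_str(&msg.text);
                continue;
            }
        }
        merged.push(msg);
    }
    merged
}

fn msg(sender: &str, text: &str) -> Msg {
    Msg { sender: sender.to_string(), text: text.to_string() }
}

fn main() {
    let chat = vec![msg("Alice", "hi"), msg("Alice", "are you there?"), msg("Bob", "yes")];
    let merged = merge_runs(chat);
    // Three messages collapse to two entries: one for Alice, one for Bob.
    println!("{} entries: {:?}", merged.len(), merged);
}
```

Merging removes the repeated sender field for each run, which is one reason the output is smaller than the raw export.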

Installation

Pre-built binaries

Platform                Download
Windows                 chatpack-windows-x64.exe
macOS (Intel)           chatpack-macos-x64
macOS (Apple Silicon)   chatpack-macos-arm64
Linux                   chatpack-linux-x64

Via Cargo

cargo install chatpack

As a library

[dependencies]
chatpack = "0.2"

Quick Start (CLI)

# Telegram
chatpack tg result.json

# WhatsApp
chatpack wa chat.txt

# Instagram
chatpack ig message_1.json

Output: optimized_chat.csv, ready to paste into ChatGPT/Claude.

Library Usage

Basic example

use chatpack::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a Telegram export
    let parser = create_parser(Source::Telegram);
    let messages = parser.parse("telegram_export.json")?;

    // Merge consecutive messages from the same sender
    let merged = merge_consecutive(messages);

    // Write to JSON
    write_json(&merged, "output.json", &OutputConfig::new())?;

    Ok(())
}

Auto-detect format

use chatpack::parsers::parse_auto;

// Automatically detects Telegram, WhatsApp, or Instagram
let messages = parse_auto("unknown_chat.json")?;

Filter messages

use chatpack::prelude::*;

let parser = create_parser(Source::Telegram);
let messages = parser.parse("chat.json")?;

// Filter by sender
let config = FilterConfig::new()
    .with_user("Alice".to_string());
let alice_only = apply_filters(messages.clone(), &config);

// Filter by date range
let config = FilterConfig::new()
    .after_date("2024-01-01")?
    .before_date("2024-06-01")?;
let filtered = apply_filters(messages, &config);
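Conceptually, sender and date filters are plain predicates over each message. The standalone sketch below shows one way to express them without the crate. It assumes dates are stored as ISO `YYYY-MM-DD` strings (which compare chronologically as text), that both bounds are inclusive, and that sender matching is case-sensitive; none of these details are confirmed for chatpack, and `Msg` and `filter_msgs` are hypothetical names.

```rust
struct Msg {
    sender: String,
    date: String, // ISO YYYY-MM-DD, so string order equals date order
    text: String,
}

/// Keep messages matching every filter that is set; `None` disables a filter.
fn filter_msgs<'a>(
    msgs: &'a [Msg],
    from: Option<&str>,
    after: Option<&str>,
    before: Option<&str>,
) -> Vec<&'a Msg> {
    msgs.iter()
        .filter(|m| from.map_or(true, |s| m.sender == s))
        .filter(|m| after.map_or(true, |d| m.date.as_str() >= d))
        .filter(|m| before.map_or(true, |d| m.date.as_str() <= d))
        .collect()
}

fn msg(sender: &str, date: &str, text: &str) -> Msg {
    Msg { sender: sender.into(), date: date.into(), text: text.into() }
}

fn main() {
    let chat = vec![
        msg("Alice", "2023-12-31", "happy new year soon"),
        msg("Alice", "2024-02-10", "lunch?"),
        msg("Bob", "2024-02-10", "sure"),
    ];
    for m in filter_msgs(&chat, Some("Alice"), Some("2024-01-01"), None) {
        println!("{} [{}]: {}", m.sender, m.date, m.text);
    }
}
```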

Output formats

use chatpack::prelude::*;

let messages = vec![
    InternalMessage::new("Alice", "Hello!"),
    InternalMessage::new("Bob", "Hi there!"),
];

// Minimal output (sender + content only)
let config = OutputConfig::new();

// Full metadata (timestamps, IDs, replies, edits)
let config = OutputConfig::all();

// Custom selection
let config = OutputConfig::new()
    .with_timestamps()
    .with_ids();

// Write to different formats
write_json(&messages, "output.json", &config)?;
write_jsonl(&messages, "output.jsonl", &config)?;
write_csv(&messages, "output.csv", &config)?;

Processing statistics

use chatpack::prelude::*;

// `messages` comes from an earlier parse step, e.g. parser.parse("chat.json")?
let original_count = messages.len();
let merged = merge_consecutive(messages);

// Compare message counts before and after merging
let stats = ProcessingStats::new(original_count, merged.len());
println!("Compression: {:.1}%", stats.compression_ratio());
println!("Messages saved: {}", stats.messages_saved());

📚 Full API documentation: docs.rs/chatpack

CLI Reference

# Output formats
chatpack tg chat.json -f csv      # 13x compression (default)
chatpack tg chat.json -f json     # Structured array
chatpack tg chat.json -f jsonl    # One JSON per line (for RAG)

# Filters
chatpack tg chat.json --after 2024-01-01
chatpack tg chat.json --before 2024-06-01
chatpack tg chat.json --from "Alice"

# Metadata
chatpack tg chat.json -t          # Add timestamps
chatpack tg chat.json -r          # Add reply references
chatpack tg chat.json -e          # Add edit timestamps
chatpack tg chat.json --ids       # Add message IDs
chatpack tg chat.json -t -r -e --ids  # All metadata

# Other options
chatpack tg chat.json --no-merge  # Don't merge consecutive messages
chatpack tg chat.json -o out.csv  # Custom output path

Documentation

Guide               Description
📤 Export Guide     How to export from Telegram, WhatsApp, Instagram
📖 Usage Guide      All commands, flags, filters, formats
📊 Benchmarks       Performance stats and compression metrics
🧪 Stress Testing   Generate toxic data and run stress tests
📚 API Docs         Full library documentation

Supported Platforms

Source      Format   Features
Telegram    JSON     IDs, timestamps, replies, edits
WhatsApp    TXT      Auto-detect locale (US/EU/RU), multiline
Instagram   JSON     Mojibake fix, empty message filter
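Mojibake in exports typically arises when UTF-8 bytes are each stored as a separate Latin-1 code point, so "é" (bytes C3 A9) shows up as "Ã©". A common repair, sketched here with the standard library only, maps every character back to a byte and decodes the result as UTF-8. Whether chatpack uses exactly this approach is an assumption, and `fix_mojibake` is a hypothetical name.

```rust
/// Reinterpret a string whose UTF-8 bytes were each stored as one Latin-1
/// code point. Returns None if any character is above U+00FF or if the
/// recovered bytes are not valid UTF-8, so callers can keep the original.
fn fix_mojibake(s: &str) -> Option<String> {
    let bytes: Option<Vec<u8>> = s
        .chars()
        .map(|c| u8::try_from(c as u32).ok())
        .collect();
    String::from_utf8(bytes?).ok()
}

fn main() {
    // U+00C3 U+00A9 are the two bytes of UTF-8 "é" read as Latin-1.
    let broken = "caf\u{00C3}\u{00A9}";
    match fix_mojibake(broken) {
        Some(fixed) => println!("{} -> {}", broken, fixed),
        None => println!("{} left unchanged", broken),
    }
}
```

Returning `None` for text that is already correct matters: a properly encoded "café" does not survive the round trip, so it should be passed through untouched rather than "repaired".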

Performance

Metric             Value
Speed              20-50K messages/sec
CSV compression    13x (92% token reduction)
Tested file size   500MB+

License

MIT © Mukhammedali Berektassuly