# chatpack
> Feed your chat history to LLMs. Compress exports **13x** with CSV format.
**Platforms:** Windows • macOS • Linux
## The Problem
You want to ask Claude/ChatGPT about your conversations, but:
- Raw exports are **80% metadata noise**
- JSON structure wastes tokens on brackets and keys
- Context windows are expensive
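To make the overhead concrete, here is a rough sketch that compares the size of one message in a JSON-style export with the same message reduced to sender and text. The export entry and the `overhead_ratio` helper are hypothetical, made up for illustration, and the comparison counts bytes rather than tokens, so treat the result as an indication only.

```rust
/// Ratio of raw size to compact size, measured in bytes (not tokens).
fn overhead_ratio(raw: &str, compact: &str) -> f64 {
    raw.len() as f64 / compact.len() as f64
}

fn main() {
    // Hypothetical export entry, loosely modelled on JSON chat exports.
    let raw_json = r#"{"id": 101, "type": "message", "date": "2024-01-01T10:00:00", "from": "Alice", "from_id": "user1", "text": "Hello!"}"#;
    // The same message reduced to sender and content.
    let compact_row = "Alice,Hello!";

    let ratio = overhead_ratio(raw_json, compact_row);
    println!("raw entry is {:.1}x larger than the compact row", ratio);
}
```

Short messages show the largest gap, because the fixed cost of keys and brackets is paid on every entry.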
## The Solution
```
┌─────────────────┐    ┌──────────┐    ┌─────────────────┐
│ Telegram JSON   │    │          │    │ Clean CSV       │
│ WhatsApp TXT    │───▶│ chatpack │───▶│ Ready for LLM   │
│ Instagram JSON  │    │          │    │ 13x fewer tokens│
└─────────────────┘    └──────────┘    └─────────────────┘
```
## Real Numbers
| Format | Input | Output | Savings |
|--------|-------|--------|---------|
| **CSV** | 11.2M tokens | 850K tokens | **92% (13x)** |
| JSONL | 11.2M tokens | 1.0M tokens | 91% (11x) |
| JSON | 11.2M tokens | 1.3M tokens | 88% (8x) |
> **Use CSV for maximum token savings.** JSONL is good for RAG pipelines. JSON keeps full structure but wastes tokens.
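The savings figures follow directly from the token counts: the factor is input divided by output, and the percentage is the share of tokens removed. The sketch below recomputes both from the numbers in the table. The `savings` helper is a name introduced here for illustration, not part of chatpack.

```rust
/// Returns (compression factor, percent reduction) for token counts
/// before and after conversion.
fn savings(input_tokens: f64, output_tokens: f64) -> (f64, f64) {
    let factor = input_tokens / output_tokens;
    let reduction_pct = (1.0 - output_tokens / input_tokens) * 100.0;
    (factor, reduction_pct)
}

fn main() {
    // Token counts taken from the table above.
    let input = 11_200_000.0;
    let rows = [("CSV", 850_000.0), ("JSONL", 1_000_000.0), ("JSON", 1_300_000.0)];

    for (format, output) in rows {
        let (factor, pct) = savings(input, output);
        println!("{}: {:.1}x, {:.1}% fewer tokens", format, factor, pct);
    }
}
```

The unrounded factors are slightly higher than the table's whole numbers, which appear to be rounded down.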
## Use Cases
### Chat with your chat history
```bash
chatpack tg telegram_export.json -o context.txt
# Paste into ChatGPT: "Based on this conversation, what did we decide about...?"
```
### Build a RAG pipeline
```bash
chatpack tg chat.json -f jsonl -t -o dataset.jsonl
# Each line = one document with timestamp for vector DB
```
### Analyze conversations
```bash
chatpack wa chat.txt --from "Alice" --after 2024-01-01 -f json
# Filter and export specific messages
```
## Features
- **Fast:** 20K+ messages/sec
- **Multi-platform:** Telegram, WhatsApp, Instagram
- **Smart merge:** consecutive messages from the same sender become one entry
- **Filters:** by date, by sender
- **Formats:** CSV (13x compression), JSON, JSONL (for RAG)
- **Library:** use as a Rust crate in your projects
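The smart merge idea can be sketched in a few lines. This is an illustration only, not chatpack's implementation: the `Msg` type and `merge_runs` function are hypothetical, and joining merged texts with a newline is an assumption.

```rust
/// Hypothetical message type for this sketch; not chatpack's `InternalMessage`.
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    sender: String,
    content: String,
}

/// Collapses each run of consecutive messages from the same sender into
/// one entry, joining the contents with a newline (an assumed separator).
fn merge_runs(messages: Vec<Msg>) -> Vec<Msg> {
    let mut merged: Vec<Msg> = Vec::new();
    for msg in messages {
        if let Some(last) = merged.last_mut() {
            if last.sender == msg.sender {
                // Same sender as the previous entry: append to it.
                last.content.push('\n');
                last.content.push_str(&msg.content);
                continue;
            }
        }
        // First message, or a different sender: start a new entry.
        merged.push(msg);
    }
    merged
}

fn main() {
    let msg = |sender: &str, content: &str| Msg {
        sender: sender.to_string(),
        content: content.to_string(),
    };
    let chat = vec![msg("Alice", "hi"), msg("Alice", "how are you?"), msg("Bob", "good")];

    for entry in merge_runs(chat) {
        println!("{}: {:?}", entry.sender, entry.content);
    }
}
```

Merging this way drops the repeated sender field for every follow-up message, which is where part of the token saving comes from.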
## Installation
### Pre-built binaries
| Platform | Download |
|----------|----------|
| Windows | [chatpack-windows-x64.exe](https://github.com/berektassuly/chatpack/releases/latest/download/chatpack-windows-x64.exe) |
| macOS (Intel) | [chatpack-macos-x64](https://github.com/berektassuly/chatpack/releases/latest/download/chatpack-macos-x64) |
| macOS (Apple Silicon) | [chatpack-macos-arm64](https://github.com/berektassuly/chatpack/releases/latest/download/chatpack-macos-arm64) |
| Linux | [chatpack-linux-x64](https://github.com/berektassuly/chatpack/releases/latest/download/chatpack-linux-x64) |
### Via Cargo
```bash
cargo install chatpack
```
### As a library
```toml
[dependencies]
chatpack = "0.2"
```
## Quick Start (CLI)
```bash
# Telegram
chatpack tg result.json
# WhatsApp
chatpack wa chat.txt
# Instagram
chatpack ig message_1.json
```
**Output:** `optimized_chat.csv`, ready to paste into ChatGPT/Claude.
## Library Usage
### Basic example
```rust
use chatpack::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse a Telegram export
    let parser = create_parser(Source::Telegram);
    let messages = parser.parse("telegram_export.json")?;

    // Merge consecutive messages from the same sender
    let merged = merge_consecutive(messages);

    // Write to JSON
    write_json(&merged, "output.json", &OutputConfig::new())?;

    Ok(())
}
```
### Auto-detect format
```rust
use chatpack::parsers::parse_auto;
// Automatically detects Telegram, WhatsApp, or Instagram
let messages = parse_auto("unknown_chat.json")?;
```
### Filter messages
```rust
use chatpack::prelude::*;
let parser = create_parser(Source::Telegram);
let messages = parser.parse("chat.json")?;
// Filter by sender
let config = FilterConfig::new()
    .with_user("Alice".to_string());
let alice_only = apply_filters(messages.clone(), &config);

// Filter by date range
let config = FilterConfig::new()
    .after_date("2024-01-01")?
    .before_date("2024-06-01")?;
let filtered = apply_filters(messages, &config);
```
### Output formats
```rust
use chatpack::prelude::*;
let messages = vec![
    InternalMessage::new("Alice", "Hello!"),
    InternalMessage::new("Bob", "Hi there!"),
];

// Minimal output (sender + content only)
let config = OutputConfig::new();

// Full metadata (timestamps, IDs, replies, edits)
let config = OutputConfig::all();

// Custom selection
let config = OutputConfig::new()
    .with_timestamps()
    .with_ids();

// Write to different formats
write_json(&messages, "output.json", &config)?;
write_jsonl(&messages, "output.jsonl", &config)?;
write_csv(&messages, "output.csv", &config)?;
```
### Processing statistics
```rust
use chatpack::prelude::*;
// `messages` comes from a parser, as in the examples above
let original_count = messages.len();
let merged = merge_consecutive(messages);
let stats = ProcessingStats::new(original_count, merged.len());
println!("Compression: {:.1}%", stats.compression_ratio());
println!("Messages saved: {}", stats.messages_saved());
```
**Full API documentation:** [docs.rs/chatpack](https://docs.rs/chatpack)
## CLI Reference
```bash
# Output formats
chatpack tg chat.json -f csv # 13x compression (default)
chatpack tg chat.json -f json # Structured array
chatpack tg chat.json -f jsonl # One JSON per line (for RAG)
# Filters
chatpack tg chat.json --after 2024-01-01
chatpack tg chat.json --before 2024-06-01
chatpack tg chat.json --from "Alice"
# Metadata
chatpack tg chat.json -t # Add timestamps
chatpack tg chat.json -r # Add reply references
chatpack tg chat.json -e # Add edit timestamps
chatpack tg chat.json --ids # Add message IDs
chatpack tg chat.json -t -r -e --ids # All metadata
# Other options
chatpack tg chat.json --no-merge # Don't merge consecutive messages
chatpack tg chat.json -o out.csv # Custom output path
```
## Documentation
| Guide | Description |
|-------|-------------|
| [Export Guide](docs/EXPORT_GUIDE.md) | How to export from Telegram, WhatsApp, Instagram |
| [Usage Guide](docs/USAGE.md) | All commands, flags, filters, formats |
| [Benchmarks](docs/BENCHMARKS.md) | Performance stats and compression metrics |
| [Stress Testing](docs/STRESS_TEST.md) | Generate toxic data and run stress tests |
| [API Docs](https://docs.rs/chatpack) | Full library documentation |
## Supported Platforms
| Platform | Format | Features |
|----------|--------|----------|
| Telegram | JSON | IDs, timestamps, replies, edits |
| WhatsApp | TXT | Auto-detect locale (US/EU/RU), multiline |
| Instagram | JSON | Mojibake fix, empty message filter |
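For context on the mojibake fix: text can end up garbled when its UTF-8 bytes are stored as if each byte were a Latin-1 character, and Instagram exports are commonly reported to have this problem. A minimal sketch of one way to reverse it is below. The `fix_mojibake` function is hypothetical and is not taken from chatpack's source.

```rust
use std::convert::TryFrom;

/// Reinterprets a string whose UTF-8 bytes were mis-decoded as Latin-1.
/// Returns `None` if any char is above U+00FF or the recovered bytes
/// are not valid UTF-8, so the caller can keep the original text.
fn fix_mojibake(garbled: &str) -> Option<String> {
    let bytes: Option<Vec<u8>> = garbled
        .chars()
        .map(|c| u8::try_from(u32::from(c)).ok())
        .collect();
    String::from_utf8(bytes?).ok()
}

fn main() {
    // Simulate the damage: treat each UTF-8 byte of the text as one char.
    let original = "Привет";
    let garbled: String = original.bytes().map(char::from).collect();

    // Fall back to the input when the text cannot be repaired.
    let fixed = fix_mojibake(&garbled).unwrap_or_else(|| garbled.clone());
    println!("{}", fixed);
}
```

Returning `Option` keeps the fix safe to apply blindly: text that was never damaged either round-trips unchanged (plain ASCII) or yields `None` and is left as-is.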
## Performance
| Metric | Value |
|--------|-------|
| Speed | 20-50K messages/sec |
| CSV compression | 13x (92% token reduction) |
| Tested file size | 500MB+ |
## License
[MIT](LICENSE) © [Mukhammedali Berektassuly](https://berektassuly.com)