📦 chatpack
Compress chat exports from Telegram, WhatsApp, and Instagram into token-efficient formats for LLMs.
Why?
LLM context windows are expensive. A typical Telegram export is 80% metadata noise. chatpack strips it down to what matters: sender and content.
Before: 34,478 tokens (raw JSON)
After: 26,169 tokens (chatpack CSV)
━━━━━━━━━━━━━━━━━━━━━━━━
24% reduction ✨
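The 24% figure follows directly from the two token counts above. A quick check of the arithmetic, using only the numbers quoted here (the variable names are illustrative):

```python
# Token counts quoted in the comparison above.
before_tokens = 34_478  # raw JSON
after_tokens = 26_169   # chatpack CSV

saved = before_tokens - after_tokens
reduction = saved / before_tokens

print(f"{saved} tokens saved, {reduction:.1%} reduction")
```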
Features
- 🚀 Fast — 20K+ messages/sec
- 📱 Multi-platform — Telegram, WhatsApp, Instagram
- 🔀 Smart merge — Consecutive messages from same sender → one entry
- 🎯 Filters — By date, by sender
- 📄 Formats — CSV, JSON, JSONL
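To make the smart merge idea concrete, here is a minimal Python sketch of collapsing consecutive messages from the same sender into one entry. It is not chatpack's actual code: the `merge_consecutive` name, the dict layout, and the newline separator are all assumptions made for illustration.

```python
from itertools import groupby


def merge_consecutive(messages, separator="\n"):
    """Collapse runs of messages from the same sender into one entry.

    `messages` is a list of {"sender": ..., "content": ...} dicts in
    chronological order. The separator is an assumption, not chatpack's.
    """
    merged = []
    # groupby only groups adjacent items, which is exactly "consecutive".
    for sender, run in groupby(messages, key=lambda m: m["sender"]):
        content = separator.join(m["content"] for m in run)
        merged.append({"sender": sender, "content": content})
    return merged


chat = [
    {"sender": "Alice", "content": "Hey!"},
    {"sender": "Alice", "content": "How are you?"},
    {"sender": "Bob", "content": "Good thanks!"},
    {"sender": "Alice", "content": "Nice!"},
]
merged_chat = merge_consecutive(chat)
print(len(chat), "->", len(merged_chat), "entries")
```

Each merged entry repeats the sender name once instead of once per message, which is where the token savings come from.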
Installation
cargo install chatpack
Or build from source, from a checkout of the repository:
cargo build --release
Usage
Basic
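# <INPUT> is the path to your export file
# Telegram JSON export
chatpack tg <INPUT>
# WhatsApp TXT export
chatpack wa <INPUT>
# Instagram JSON export
chatpack ig <INPUT>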
Output Formats
# CSV (default): best for token efficiency
chatpack tg <INPUT> -f csv
# JSON: structured array
chatpack tg <INPUT> -f json
# JSONL: one JSON object per line, streaming-friendly
chatpack tg <INPUT> -f jsonl
Filters
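# Messages after date (YYYY-MM-DD)
chatpack tg <INPUT> --after <DATE>
# Messages before date (YYYY-MM-DD)
chatpack tg <INPUT> --before <DATE>
# Messages from specific user
chatpack tg <INPUT> --from <USER>
# Combine filters
chatpack tg <INPUT> --after <DATE> --before <DATE> --from <USER>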
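Combined filters act as a logical AND: a message is kept only if it passes every filter that is set. The Python sketch below illustrates that behaviour. The `filter_messages` helper and the `timestamp` field are hypothetical, and treating both date bounds as inclusive is an assumption, not documented chatpack behaviour.

```python
from datetime import date, datetime


def filter_messages(messages, after=None, before=None, sender=None):
    """Keep messages that pass every filter that is set (None = no filter)."""
    kept = []
    for msg in messages:
        day = msg["timestamp"].date()
        if after is not None and day < after:      # assumed inclusive bound
            continue
        if before is not None and day > before:    # assumed inclusive bound
            continue
        if sender is not None and msg["sender"] != sender:
            continue
        kept.append(msg)
    return kept


history = [
    {"sender": "Alice", "content": "Happy new year", "timestamp": datetime(2024, 1, 1, 0, 5)},
    {"sender": "Bob", "content": "You too", "timestamp": datetime(2024, 1, 1, 0, 6)},
    {"sender": "Alice", "content": "Spring already", "timestamp": datetime(2024, 3, 20, 9, 0)},
]
january_alice = filter_messages(
    history, after=date(2024, 1, 1), before=date(2024, 1, 31), sender="Alice"
)
print([m["content"] for m in january_alice])
```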
Metadata Options
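# Include timestamps
chatpack tg <INPUT> -t
# Include message IDs
chatpack tg <INPUT> --ids
# Include reply references
chatpack tg <INPUT> -r
# Include edit timestamps
chatpack tg <INPUT> -e
# All metadata
chatpack tg <INPUT> -t -r -e --ids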
Other Options
# Custom output file
chatpack tg <INPUT> -o <FILE>
# Disable message merging
chatpack tg <INPUT> --no-merge
Output Examples
CSV (default)
Sender;Content
Alice;Hey! How are you?
Bob;Good thanks! Just finished the project.
Alice;Nice! Let's celebrate 🎉
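JSON
[
  {"sender": "Alice", "content": "Hey! How are you?"},
  {"sender": "Bob", "content": "Good thanks! Just finished the project."},
  {"sender": "Alice", "content": "Nice! Let's celebrate 🎉"}
]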
JSONL
{"sender":"Alice","content":"Hey! How are you?"}
{"sender":"Bob","content":"Good thanks! Just finished the project."}
{"sender":"Alice","content":"Nice! Let's celebrate 🎉"}
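The CSV and JSONL shapes shown above can be reproduced with Python's standard library, which helps when checking chatpack output or generating test fixtures. This is only a sketch of the output shapes: `to_csv` and `to_jsonl` are hypothetical helpers, and chatpack's own quoting and escaping rules may differ.

```python
import csv
import io
import json


def to_csv(messages):
    """Semicolon-delimited CSV with a Sender;Content header."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=";", lineterminator="\n")
    writer.writerow(["Sender", "Content"])
    for m in messages:
        writer.writerow([m["sender"], m["content"]])
    return buf.getvalue()


def to_jsonl(messages):
    """One compact JSON object per line, non-ASCII kept as-is."""
    return "\n".join(
        json.dumps(m, ensure_ascii=False, separators=(",", ":")) for m in messages
    )


sample = [
    {"sender": "Alice", "content": "Hey! How are you?"},
    {"sender": "Bob", "content": "Good thanks! Just finished the project."},
    {"sender": "Alice", "content": "Nice! Let's celebrate 🎉"},
]
csv_text = to_csv(sample)
jsonl_text = to_jsonl(sample)
print(csv_text)
print(jsonl_text)
```

CSV names each field once in the header, while JSON and JSONL repeat the keys for every message, which is why CSV is the most token-efficient of the three.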
Supported Export Formats
Telegram
Export via: Settings → Advanced → Export Telegram Data
- ✅ JSON format
- ✅ Message IDs, timestamps, replies, edits
- ✅ Nested text objects (bold, links, etc.)
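In Telegram's JSON export, a message's `text` field is either a plain string or a list mixing strings with formatting objects (bold, links, and so on), each carrying its own `text`. A minimal Python sketch of flattening that structure to plain content follows. The `flatten_text` helper is hypothetical and is not chatpack's implementation.

```python
def flatten_text(text):
    """Flatten a Telegram `text` field to a plain string.

    Accepts either a string or a list of strings and
    {"type": ..., "text": ...} objects.
    """
    if isinstance(text, str):
        return text
    parts = []
    for piece in text:
        if isinstance(piece, str):
            parts.append(piece)
        else:
            # Formatting object: keep only its visible text.
            parts.append(piece.get("text", ""))
    return "".join(parts)


nested = [
    "Check ",
    {"type": "bold", "text": "this"},
    " out: ",
    {"type": "link", "text": "https://example.com"},
]
print(flatten_text(nested))
```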
WhatsApp
Export via: Chat → ⋮ → More → Export chat → Without media
- ✅ TXT format (all locales)
- ✅ Auto-detects date format (US, EU, RU)
- ✅ Multiline messages
- ✅ Filters system messages
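The points above (date detection, multiline messages, system messages) can be illustrated with a small Python sketch. It assumes a simplified header shape of `<date>, <time> - <sender>: <text>` (optionally bracketed), and its day-first heuristic is an illustration only. Real exports vary more than this, and chatpack's actual detection logic is not shown here.

```python
import re

# Simplified header: "15/01/2024, 10:30 - Alice: Hello" or
# "[15.01.2024, 10:30:45] Alice: Hello". Real exports vary more.
HEADER = re.compile(
    r"^\[?(?P<date>\d{1,2}[./]\d{1,2}[./]\d{2,4}),? "
    r"(?P<time>\d{1,2}:\d{2}(?::\d{2})?(?: ?[AP]M)?)\]?(?: -)? "
    r"(?P<rest>.*)$"
)


def parse_whatsapp(lines):
    """Parse lines into messages, joining continuations and skipping system lines."""
    messages = []
    current = None  # message that continuation lines attach to
    for line in lines:
        line = line.rstrip("\n")
        match = HEADER.match(line)
        if match is None:
            if current is not None:
                current["content"] += "\n" + line  # multiline message
            continue
        sender, sep, content = match["rest"].partition(": ")
        if not sep:
            current = None  # no "sender: " part, treated as a system message
            continue
        current = {"date": match["date"], "sender": sender, "content": content}
        messages.append(current)
    return messages


def detect_day_first(dates):
    """Heuristic: True for day-first dates, False for month-first, None if ambiguous."""
    for d in dates:
        if "." in d:
            return True  # dotted dates assumed day-first (e.g. RU)
        first, second = (int(x) for x in re.split(r"[./]", d)[:2])
        if first > 12:
            return True
        if second > 12:
            return False
    return None


export = [
    "15/01/2024, 10:30 - Messages and calls are end-to-end encrypted.",
    "15/01/2024, 10:31 - Alice: Hello",
    "second line",
    "15/01/2024, 10:32 - Bob: Hi",
]
parsed = parse_whatsapp(export)
print(parsed, detect_day_first(m["date"] for m in parsed))
```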
Instagram
Export via: Settings → Your activity → Download your information
- ✅ JSON format
- ✅ Fixes Mojibake encoding (Cyrillic, etc.)
- ✅ Filters empty shares/reactions
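Mojibake in these exports typically arises when UTF-8 bytes are written out as if each byte were a Latin-1 character, so Cyrillic text shows up as sequences like `Ð¿Ñ`. A common repair is to reverse that mis-decoding. The Python sketch below shows the idea; the `fix_mojibake` helper is hypothetical and chatpack's own repair may be implemented differently.

```python
def fix_mojibake(text):
    """Undo UTF-8 bytes that were mis-decoded as Latin-1.

    Returns the input unchanged when it does not look like mojibake.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Already proper Unicode, or not valid UTF-8 underneath.
        return text


# Simulate the damage: UTF-8 bytes read back as Latin-1.
broken = "Привет".encode("utf-8").decode("latin-1")
print(fix_mojibake(broken))
```

Plain ASCII text passes through unchanged, and text that is already correct Unicode fails the Latin-1 encoding step and is returned as-is.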
Performance
Tested on files of about 500 MB containing adversarial data (Zalgo text, emoji spam, 100 KB strings):
| Metric | Value |
|---|---|
| Throughput | 17-24K msg/s |
| Memory | ~2x file size |
| Max tested | 516 MB, 100K messages |
CLI Reference
chatpack <SOURCE> <INPUT> [OPTIONS]

Sources:
  tg, telegram     Telegram JSON export
  wa, whatsapp     WhatsApp TXT export
  ig, instagram    Instagram JSON export

Options:
  -o, --output <FILE>      Output file [default: optimized_chat.csv]
  -f, --format <FORMAT>    Output format: csv, json, jsonl [default: csv]
  -t, --timestamps         Include timestamps
  -r, --replies            Include reply references
  -e, --edited             Include edit timestamps
      --ids                Include message IDs
      --no-merge           Don't merge consecutive messages
      --after <DATE>       Filter: after date (YYYY-MM-DD)
      --before <DATE>      Filter: before date (YYYY-MM-DD)
      --from <USER>        Filter: from specific sender
  -h, --help               Print help
  -V, --version            Print version
Use Cases
Feed chat to LLM
chatpack tg <INPUT> -o context.csv
# Then paste context.csv into ChatGPT/Claude
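If you would rather build the prompt in a script than paste the file by hand, the default `Sender;Content` CSV is easy to read with Python's standard library. The sketch below is one possible approach. The `csv_to_prompt` helper and the prompt wording are assumptions, not part of chatpack.

```python
import csv
import io


def csv_to_prompt(csv_text, question):
    """Turn chatpack-style CSV text into a single prompt string."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    transcript = "\n".join(f'{row["Sender"]}: {row["Content"]}' for row in reader)
    return f"Chat transcript:\n{transcript}\n\nQuestion: {question}"


# In practice csv_text would come from open("context.csv", encoding="utf-8").read()
context = "Sender;Content\nAlice;Hey! How are you?\nBob;Good thanks! Just finished the project.\n"
prompt = csv_to_prompt(context, "What did Bob just finish?")
print(prompt)
```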
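Build RAG dataset
chatpack tg <INPUT> -f jsonl -t -o <FILE>
# Each line is a document with timestamp
Analyze specific period
chatpack tg <INPUT> --after <DATE> --before <DATE>
Export single person's messages
chatpack tg <INPUT> --from <USER>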