# excelstream
🦀 High-performance streaming Excel, CSV & Parquet library for Rust with constant memory usage
## ✨ Highlights
- 📊 XLSX, CSV & Parquet Support - Read/write Excel, CSV, and Parquet files
- 📉 Constant Memory - ~3-35 MB regardless of file size
- ☁️ Cloud Streaming - Direct S3/GCS uploads with ZERO temp files
- ⚡ High Performance - 94K rows/sec (S3), 1.2M rows/sec (CSV)
- 🔄 True Streaming - Process files row-by-row, no buffering
- 🗜️ Parquet Conversion - Stream Excel ↔ Parquet with constant memory
- 🐳 Production Ready - Works in 256 MB containers
## 🔥 What's New in v0.20.0
Writer Performance Optimizations - 3-8% faster with fewer memory allocations!
- 🚀 Eliminated Double Allocation - Removed the unnecessary `Vec<String>` buffer in `write_row()`
- ⚡ Fast Integer Formatting - Uses the `itoa` crate for 2-3x faster integer-to-string conversion (see the sketch after this list)
- 📝 Optimized Column Letters - Direct buffer writing for column addressing (A, B, AA, etc.)
- 💾 Fewer Heap Allocations - Zero temporary strings during cell writing
- 🎯 Scales with Width - Wider tables (20+ columns) see larger improvements (up to 8.5%)
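The two allocation tricks are easy to picture. Below is a minimal sketch of both: `itoa` is the real crate named above, while the column-letter routine is illustrative rather than excelstream's exact code. Both write into an existing buffer, so formatting a cell reference allocates nothing.

```rust
/// Append a column's letters ("A", "B", ..., "Z", "AA", ...) to an existing buffer.
/// Illustrative implementation of the "direct buffer writing" idea.
fn push_column_letters(buf: &mut String, mut col: u32) {
    // 0 -> "A", 25 -> "Z", 26 -> "AA", ... built in a small stack array
    let mut tmp = [0u8; 8];
    let mut i = tmp.len();
    loop {
        i -= 1;
        tmp[i] = b'A' + (col % 26) as u8;
        if col < 26 {
            break;
        }
        col = col / 26 - 1;
    }
    buf.push_str(std::str::from_utf8(&tmp[i..]).unwrap()); // ASCII, cannot fail
}

/// Append a full cell reference like "AA1048576" with zero heap allocation.
fn push_cell_ref(buf: &mut String, col: u32, row: u64) {
    push_column_letters(buf, col);
    let mut digits = itoa::Buffer::new(); // stack-allocated digit buffer
    buf.push_str(digits.format(row));     // no temporary String
}
```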
Performance Gains (Verified with 1M rows):
- 10 columns: +6.1% faster (29,455 → 31,263 rows/sec)
- 20 columns: +8.5% faster (17,367 → 18,842 rows/sec)
- Memory usage: Virtually identical (+0.4%)
```rust
// Same API, now faster!
// (file name and cell values are illustrative)
let mut writer = ExcelWriter::new("output.xlsx")?;
writer.write_row(&["ID", "Value"])?;
for i in 1..=1_000_000 {
    writer.write_row(&[i.to_string(), (i * 2).to_string()])?;
}
writer.save()?;
```
See: Performance Report | PR #7 Details
## Previous Release: v0.19.0
Performance & Memory Optimizations - Enhanced streaming reader and CSV parser!
- 🚀 Optimized Streaming Reader - Simplified buffer management with single-scan approach
- 💾 Reduced Memory Allocations - One fewer String buffer per iterator (lower heap usage)
- 📝 Smarter CSV Parsing - Pre-allocated buffers for typical row sizes (see the sketch after the code below)
- 🎯 Cleaner Codebase - 36% code reduction in streaming reader (64 lines removed)
- 🔧 Better Maintainability - Simpler logic for easier debugging and contributions
```rust
// Streaming reader now uses optimized single-pass buffer scanning
let mut reader = ExcelReader::open("data.xlsx")?;
for row in reader.rows_by_index(0)? { // sheet selected by index; argument is illustrative
    // process each row as it streams in...
}
```
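The "pre-allocated buffers" bullet amounts to the standard reserve-once, reuse-across-rows pattern. A minimal sketch with illustrative sizes, not excelstream's tuned values:

```rust
// Reserve capacity once, then clear and reuse the buffers for every row,
// instead of allocating fresh Strings per field.
let mut field = String::with_capacity(64);            // typical field length
let mut record: Vec<String> = Vec::with_capacity(16); // typical column count
```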
## Previous Release: v0.18.0
Cloud Replication & Transfer - Replicate Excel files between different cloud storage services!
```rust
// `CloudSource`, `CloudDestination`, `with_chunk_size`, `with_clients`, and
// `execute` come from the release notes; the module path, `ReplicationConfig`/
// `Replicator` names, struct fields, and stats fields are illustrative.
use excelstream::replication::{CloudSource, CloudDestination, ReplicationConfig, Replicator};

let source = CloudSource { /* source bucket, key, and client */ };
let destination = CloudDestination { /* destination bucket, key, and client */ };

let config = ReplicationConfig::new()
    .with_chunk_size(10 * 1024 * 1024); // 10MB chunks

let replicate = Replicator::with_clients(source, destination, config);
let stats = replicate.execute().await?;
println!("Replicated {} bytes in {:?}", stats.bytes_transferred, stats.duration);
```
Features:
- 🔄 Cloud-to-Cloud Transfer - Replicate between S3, MinIO, R2, DO Spaces
- ⚡ True Streaming - Constant memory usage (~5-10MB), no memory peaks
- 🚀 Server-side Copy - Same-region transfers use the native S3 copy API (instant; see the sketch after this list)
- 🔑 Different Credentials - Each cloud can have different API keys
- 📊 Transfer Stats - Speed (MB/s), duration, bytes transferred
- 🏗️ Builder Pattern - Flexible configuration with custom clients
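For a sense of what "server-side copy" buys you: with the copy API, S3 moves the bytes internally, so nothing streams through your process at all. A minimal sketch calling `aws_sdk_s3` directly (bucket and key names are illustrative):

```rust
/// Same-region, same-provider transfer via S3's native CopyObject call.
async fn same_region_copy(client: &aws_sdk_s3::Client) -> Result<(), aws_sdk_s3::Error> {
    client
        .copy_object()
        .copy_source("source-bucket/reports/q1.xlsx") // "bucket/key" of the source object
        .bucket("dest-bucket")
        .key("reports/q1.xlsx")
        .send()
        .await?;
    Ok(())
}
```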
Also includes: v0.17.0 Multi-Cloud Explicit Credentials + v0.16.0 Parquet Support
See full changelog | Multi-cloud guide → | Cloud Replication →
## 📦 Quick Start

### Installation
```toml
[dependencies]
excelstream = "0.20"

# Optional features
excelstream = { version = "0.20", features = ["cloud-s3"] }        # S3 support
excelstream = { version = "0.20", features = ["cloud-gcs"] }       # GCS support
excelstream = { version = "0.20", features = ["parquet-support"] } # Parquet conversion
```
### Write Excel (Local)

```rust
use excelstream::ExcelWriter;

let mut writer = ExcelWriter::new("report.xlsx")?; // file name is illustrative

// Write 1M rows with only 3 MB of memory!
writer.write_header_bold(&["ID", "Value"])?;
for i in 1..=1_000_000 {
    writer.write_row(&[i.to_string(), (i * 2).to_string()])?;
}
writer.save()?;
```
### Read Excel (Streaming)

```rust
use excelstream::ExcelReader;

let mut reader = ExcelReader::open("large.xlsx")?;

// Process a 1 GB file with only 12 MB of memory!
for row in reader.rows()? {
    // handle each row as it streams in...
}
```
### S3 Streaming (v0.14+)

```rust
use excelstream::S3ExcelWriter; // import path is illustrative

async fn export() -> Result<(), Box<dyn std::error::Error>> {
    // Builder arguments are illustrative
    let mut writer = S3ExcelWriter::builder()
        .bucket("my-bucket")
        .key("reports/output.xlsx")
        .build()
        .await?;
    writer.write_row(&["streamed", "straight", "to", "S3"]).await?;
    writer.save().await?; // multipart upload completes here - zero temp files
    Ok(())
}
```
## 🎯 Why ExcelStream?

**The Problem:** Traditional libraries load entire files into memory.

```rust
// ❌ Traditional: 1 GB file = 1+ GB RAM (OOM in containers!)
let workbook = Workbook::open("huge.xlsx")?; // placeholder for a load-everything API
```

**The Solution:** True streaming with constant memory.

```rust
// ✅ ExcelStream: 1 GB file = 12 MB RAM
let mut reader = ExcelReader::open("huge.xlsx")?;
for row in reader.rows()? {
    // rows arrive one at a time
}
```
### Performance Comparison
| Operation | Traditional | ExcelStream | Improvement |
|---|---|---|---|
| Write 1M rows | 100+ MB | 2.7 MB | 97% less memory |
| Read 1GB file | ❌ Crash | 12 MB | Works! |
| S3 upload 500K rows | Temp file | 34 MB | Zero disk |
| K8s pod (256MB) | ❌ OOMKilled | ✅ Works | Production ready |
## ☁️ Cloud Features

### S3 Direct Streaming (v0.14)
Upload Excel files directly to S3 with ZERO temp files (see the S3 Streaming example in the Quick Start above).
Performance (Real AWS S3):
| Dataset | Memory | Throughput | Temp Files |
|---|---|---|---|
| 10K rows | 15 MB | 11K rows/s | ZERO ✅ |
| 100K rows | 23 MB | 45K rows/s | ZERO ✅ |
| 500K rows | 34 MB | 94K rows/s | ZERO ✅ |
Perfect for:
- ✅ AWS Lambda (read-only filesystem)
- ✅ Docker containers (no disk space)
- ✅ Kubernetes CronJobs (limited memory)
### S3-Compatible Services (v0.17+)
Stream to AWS S3, MinIO, Cloudflare R2, DigitalOcean Spaces, and other S3-compatible services with explicit credentials - no environment variables needed!
```rust
use aws_sdk_s3::config::{Credentials, Region};
use aws_sdk_s3::{Client, Config};
use excelstream::{ExcelReader, S3ExcelWriter, S3ZipWriter}; // import path is illustrative

// Example 1: AWS S3 with explicit credentials
// (key values here and below are placeholders)
let aws_creds = Credentials::new("AWS_ACCESS_KEY", "AWS_SECRET_KEY", None, None, "static");
let aws_config = Config::builder()
    .behavior_version_latest()
    .credentials_provider(aws_creds)
    .region(Region::new("us-east-1"))
    .build();
let aws_client = Client::from_conf(aws_config);

// Example 2: MinIO with explicit credentials
let minio_creds = Credentials::new("minioadmin", "minioadmin", None, None, "static");
let minio_config = Config::builder()
    .behavior_version_latest()
    .credentials_provider(minio_creds)
    .endpoint_url("http://localhost:9000")
    .region(Region::new("us-east-1"))
    .force_path_style(true) // Required for MinIO
    .build();
let minio_client = Client::from_conf(minio_config);

// Example 3: Cloudflare R2 with explicit credentials
let r2_creds = Credentials::new("R2_ACCESS_KEY", "R2_SECRET_KEY", None, None, "static");
let r2_config = Config::builder()
    .behavior_version_latest()
    .credentials_provider(r2_creds)
    .endpoint_url("https://<account>.r2.cloudflarestorage.com")
    .region(Region::new("auto"))
    .build();
let r2_client = Client::from_conf(r2_config);

// Write an Excel file to ANY S3-compatible service
// (constructor arguments below are illustrative)
let s3_writer = S3ZipWriter::new(minio_client, "my-bucket", "report.xlsx").await?;
let mut writer = S3ExcelWriter::from_s3_writer(s3_writer);
writer.write_header_bold(&["ID", "Name"]).await?;
writer.write_row(&["1", "Alice"]).await?;
writer.save().await?;

// Read an Excel file from ANY S3-compatible service
let mut reader = ExcelReader::from_s3_client(aws_client, "my-bucket", "report.xlsx").await?;
for row in reader.rows()? {
    // process each row...
}
```
Supported Services:

| Service | Endpoint Example | Region |
|---|---|---|
| AWS S3 | (default) | `us-east-1`, `ap-southeast-1`, etc. |
| MinIO | `http://localhost:9000` | `us-east-1` |
| Cloudflare R2 | `https://<account>.r2.cloudflarestorage.com` | `auto` |
| DigitalOcean Spaces | `https://nyc3.digitaloceanspaces.com` | `us-east-1` |
| Backblaze B2 | `https://s3.us-west-000.backblazeb2.com` | `us-west-000` |
| Linode | `https://us-east-1.linodeobjects.com` | `us-east-1` |
✨ Key Features:
- 🔑 Explicit credentials - no environment variables needed
- 🌍 Multi-cloud support - use different credentials for each cloud
- 🚀 True streaming - only 19-20 MB memory for 100K rows
- ⚡ Concurrent uploads - upload to multiple clouds simultaneously (see the sketch after this list)
- 🔒 Type-safe - full compile-time checking
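Concurrent uploads fall out of the async API naturally: drive two writers at once with `tokio::join!`. A minimal sketch, assuming the writer type from the example above and that its errors implement `std::error::Error`:

```rust
use excelstream::S3ExcelWriter; // import path is illustrative

async fn upload_to_two_clouds(
    mut aws_writer: S3ExcelWriter,
    mut r2_writer: S3ExcelWriter,
) -> Result<(), Box<dyn std::error::Error>> {
    // Both uploads make progress concurrently on a single task
    let (aws_result, r2_result) = tokio::join!(
        async {
            aws_writer.write_row(&["hello", "aws"]).await?;
            aws_writer.save().await
        },
        async {
            r2_writer.write_row(&["hello", "r2"]).await?;
            r2_writer.save().await
        },
    );
    aws_result?;
    r2_result?;
    Ok(())
}
```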
🔑 Full Multi-Cloud Guide → MULTI_CLOUD_CONFIG.md - Complete examples for AWS, MinIO, R2, Spaces, and B2!
### GCS Direct Streaming (v0.14)

Upload Excel files directly to Google Cloud Storage with ZERO temp files:

```rust
use excelstream::GCSExcelWriter; // import path is illustrative

async fn export() -> Result<(), Box<dyn std::error::Error>> {
    // Constructor arguments are illustrative
    let mut writer = GCSExcelWriter::new("my-bucket", "reports/output.xlsx").await?;
    writer.write_row(&["streamed", "to", "GCS"]).await?;
    writer.save().await?; // resumable upload completes here - zero temp files
    Ok(())
}
```
Perfect for:
- ✅ Cloud Run (read-only filesystem)
- ✅ Cloud Functions (no disk space)
- ✅ GKE workloads (limited memory)
### HTTP Streaming

Stream Excel files directly to web responses:

```rust
use excelstream::HttpExcelWriter; // import path is illustrative

async fn download() -> Result<(), Box<dyn std::error::Error>> {
    // Construction and response wiring are framework-specific; this sketch is illustrative.
    let mut writer = HttpExcelWriter::new()?;
    writer.write_row(&["streamed", "over", "HTTP"])?;
    writer.save()?; // bytes flush to the response body as they are produced
    Ok(())
}
```
## 📊 CSV Support
13.5x faster than Excel for CSV workloads:
```rust
use excelstream::CsvWriter;

let mut writer = CsvWriter::new("data.csv")?; // file name is illustrative
writer.write_row(&["id", "name", "value"])?;  // 1.2M rows/sec!
writer.save()?;
```
Features:
- ✅ Zstd compression (`.csv.zst` - 2.9x smaller)
- ✅ Auto-detection (`.csv`, `.csv.gz`, `.csv.zst`) - see the sketch below
- ✅ Streaming (< 5 MB memory)
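Assuming detection keys off the file extension (per the list above), choosing a compression format is just a matter of naming the output file; file names here are illustrative:

```rust
use excelstream::CsvWriter;

let plain = CsvWriter::new("data.csv")?;     // no compression
let gzip  = CsvWriter::new("data.csv.gz")?;  // gzip
let zstd  = CsvWriter::new("data.csv.zst")?; // Zstd, ~2.9x smaller output
```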
## 🗜️ Parquet Support (v0.16+)
Convert between Excel and Parquet with constant memory streaming:
### Excel → Parquet

```rust
use excelstream::ExcelToParquetConverter;

// Constructor arguments are illustrative
let converter = ExcelToParquetConverter::new("input.xlsx", "output.parquet")?;
let rows = converter.convert_to_parquet()?;
println!("Converted {rows} rows");
```
### Parquet → Excel

```rust
use excelstream::ParquetToExcelConverter;

// Constructor arguments are illustrative
let converter = ParquetToExcelConverter::new("input.parquet", "output.xlsx")?;
let rows = converter.convert_to_excel()?;
println!("Converted {rows} rows");
```
### Streaming with Progress

```rust
let converter = ExcelToParquetConverter::new("input.xlsx", "output.parquet")?;
converter.convert_with_progress(|rows_done| {
    println!("{rows_done} rows converted"); // callback shape is illustrative
})?;
```
Features:
- ✅ Constant memory - Processes in 10K row batches
- ✅ All data types - Strings, numbers, booleans, dates, timestamps
- ✅ Progress tracking - Monitor large conversions
- ✅ High performance - Efficient columnar format handling
Use Cases:
- Convert Excel reports to Parquet for data lakes
- Export Parquet data to Excel for analysis
- Integrate with Apache Arrow/Spark workflows
## 🚀 Use Cases

### 1. Large File Processing

```rust
// Process a 500 MB Excel file with only 25 MB of RAM
let mut reader = ExcelReader::open("huge.xlsx")?;
for row in reader.rows()? {
    // handle each row as it streams in...
}
```
### 2. Database Exports

```rust
// Export 1M database rows to Excel
let mut writer = ExcelWriter::new("export.xlsx")?;
let rows = db.query("SELECT * FROM orders")?; // `db` is an illustrative database handle
for row in rows {
    writer.write_row(&row)?;
}
writer.save()?; // Only 3 MB of memory used!
```
### 3. Cloud Pipelines

```rust
// Lambda function: DB → Excel → S3
let mut writer = S3ExcelWriter::builder()
    .bucket("exports")
    .key("orders.xlsx")
    .build()
    .await?;

let mut rows = db.query_stream("SELECT * FROM orders").await?; // `db` is illustrative
while let Some(row) = rows.next().await {
    writer.write_row(&row?).await?;
}
writer.save().await?; // No temp files, no disk!
```
## 📚 Documentation
- API Docs - Full API reference
- Examples - Code examples for all features
- CHANGELOG - Version history
- Performance - Detailed benchmarks
- Multi-Cloud Config - AWS, MinIO, R2, Spaces, B2 setup
### Key Topics
- Excel Writing - Basic & advanced writing
- Excel Reading - Streaming read
- S3 Streaming - AWS S3 uploads
- Multi-Cloud Config - Multi-cloud credentials
- GCS Streaming - Google Cloud Storage uploads
- CSV Support - CSV operations
- Parquet Conversion - Excel ↔ Parquet
- Styling - Cell formatting & colors
## 🔧 Features

| Feature | Description |
|---|---|
| `default` | Core Excel/CSV with Zstd compression |
| `cloud-s3` | S3 direct streaming (async) |
| `cloud-gcs` | GCS direct streaming (async) |
| `cloud-http` | HTTP response streaming |
| `parquet-support` | Parquet ↔ Excel conversion |
| `serde` | Serde serialization support |
| `parallel` | Parallel processing with Rayon |
## ⚡ Performance
Memory Usage (Constant):
- Excel write: 2.7 MB (any size)
- Excel read: 10-12 MB (any size)
- S3 streaming: 30-35 MB (any size)
- CSV write: < 5 MB (any size)
Throughput:
- Excel write: 42K rows/sec
- Excel read: 50K rows/sec
- S3 streaming: 94K rows/sec
- CSV write: 1.2M rows/sec
## 🛠️ Migration from v0.13

`S3ExcelWriter` is now async:

```rust
// OLD (v0.13 - sync)
writer.write_row(&row)?;

// NEW (v0.14+ - async)
writer.write_row(&row).await?;
```
All other APIs unchanged!
## 📋 Requirements
- Rust 1.70+
- Optional: AWS credentials for S3 features
## 🤝 Contributing
Contributions welcome! Please read CONTRIBUTING.md.
## 📄 License
MIT License - See LICENSE for details
## 🙏 Credits
- Built with s-zip for streaming ZIP
- AWS SDK for Rust
- All contributors and users!
Need help? Open an issue | Questions? Discussions