# s-zip

```text
███████╗      ███████╗██╗██████╗ 
██╔════╝      ╚══███╔╝██║██╔══██╗
███████╗█████╗  ███╔╝ ██║██████╔╝
╚════██║╚════╝ ███╔╝  ██║██╔═══╝ 
███████║      ███████╗██║██║     
╚══════╝      ╚══════╝╚═╝╚═╝     
```
s-zip is a streaming ZIP reader and writer designed for backend systems that need
to process large archives with minimal memory usage.
The focus is not on end-user tooling, but on providing a reliable ZIP building block for servers, batch jobs, and data pipelines.
## Why s-zip?
Most ZIP libraries assume small files or in-memory buffers.
s-zip is built around streaming from day one.
- Constant memory usage
- Suitable for very large files
- Works well in containers and memory-constrained environments
- Designed for backend and data-processing workloads
## Key Features
- Streaming ZIP writer (no full buffering)
- AES-256 encryption 🔐 NEW! Password-protect files with WinZip-compatible encryption
- Async/await support ⚡ Compatible with Tokio runtime
- Async ZIP reader 📖 Stream ZIPs from any source (S3, HTTP, files)
- Cloud storage adapters 🌩️ Stream directly to/from AWS S3, Google Cloud Storage, MinIO, and S3-compatible services
- Arbitrary writer support (File, Vec, network streams, etc.)
- Streaming ZIP reader with minimal memory footprint
- ZIP64 support for files >4GB
- Multiple compression methods: DEFLATE, Zstd (optional)
- Predictable memory usage: ~2-5 MB constant with 1MB buffer threshold
- High performance: Zstd 3x faster than DEFLATE with 11-27x better compression
- Concurrent operations: Create multiple ZIPs simultaneously with async
- Rust safety guarantees
- Backend-friendly API
## Non-goals
- Not a CLI replacement for zip/unzip
- Not focused on desktop or interactive usage
- Not optimized for small-file convenience
## Typical Use Cases
- Web applications (Axum, Actix, Rocket) - Generate ZIPs on-demand
- Cloud storage - Stream ZIPs directly to AWS S3, Google Cloud Storage without local disk usage
- Data exports - Generate large ZIP exports for reports, datasets, backups
- Data pipelines - ETL jobs, batch processing, log aggregation
- Infrastructure tools - ZIP as intermediate format for deployments, artifacts
- Real-time streaming - WebSocket, SSE, HTTP chunked responses
## Performance Highlights
Based on comprehensive benchmarks (see BENCHMARK_RESULTS.md):
| Metric | DEFLATE level 6 | Zstd level 3 | Improvement |
|---|---|---|---|
| Speed (1MB) | 610 MiB/s | 2.0 GiB/s | 3.3x faster ⚡ |
| File Size (1MB compressible) | 3.16 KB | 281 bytes | 11x smaller 🗜️ |
| File Size (10MB compressible) | 29.97 KB | 1.12 KB | 27x smaller 🗜️ |
| Memory Usage | 2-5 MB constant | 2-5 MB constant | Same ✓ |
| CPU Usage | Moderate | Low-Moderate | Better ✓ |
Key Benefits:
- ✅ No temp files - Direct streaming saves disk I/O
- ✅ ZIP64 support for files >4GB
- ✅ Zstd compression: faster + smaller than DEFLATE
- ✅ Constant memory usage regardless of archive size
## Quick Start
Add this to your Cargo.toml:
```toml
[dependencies]
s-zip = "0.7"

# With AES-256 encryption support
s-zip = { version = "0.7", features = ["encryption"] }

# With async support (Tokio runtime)
s-zip = { version = "0.7", features = ["async"] }

# With AWS S3 cloud storage support
s-zip = { version = "0.7", features = ["cloud-s3"] }

# With Google Cloud Storage support
s-zip = { version = "0.7", features = ["cloud-gcs"] }

# With all cloud storage providers
s-zip = { version = "0.7", features = ["cloud-all"] }

# With async + Zstd compression + encryption
s-zip = { version = "0.7", features = ["async", "async-zstd", "encryption"] }
```
### Optional Features
| Feature | Description | Dependencies |
|---|---|---|
| `encryption` | AES-256 encryption support (NEW!) | `aes`, `ctr`, `hmac`, `sha1`, `pbkdf2` |
| `async` | Enables async/await support with Tokio runtime | `tokio`, `async-compression` |
| `async-zstd` | Async + Zstd compression support | `async`, `zstd-support` |
| `zstd-support` | Zstd compression for sync API | `zstd` |
| `cloud-s3` | AWS S3 + MinIO + S3-compatible services | `async`, `aws-sdk-s3` |
| `cloud-gcs` | Google Cloud Storage adapter | `async`, `google-cloud-storage` |
| `cloud-all` | All cloud storage providers | `cloud-s3`, `cloud-gcs` |
Note: `async-zstd` includes both the `async` and `zstd-support` features. Cloud features require `async`.
### Reading a ZIP file
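The original example was truncated here; below is a minimal sketch assembled from the method names that appear elsewhere in this README (`open`, `read_entry_by_name`). Arguments and exact signatures are assumptions, so check docs.rs/s-zip for the real API.

```rust
use s_zip::StreamingZipReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open an archive; only small internal buffers are held in memory.
    let mut reader = StreamingZipReader::open("archive.zip")?; // constructor assumed
    // Pull a single entry out by name (method name borrowed from the async
    // examples below; the sync signature is assumed).
    let data = reader.read_entry_by_name("report.csv")?;
    println!("read {} bytes", data.len());
    Ok(())
}
```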
### Writing a ZIP file
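This example was likewise truncated; the sketch below uses the `start_entry`/`write_data`/`finish` calls shown later in the encryption examples, with illustrative arguments:

```rust
use s_zip::StreamingZipWriter;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut writer = StreamingZipWriter::new("output.zip")?;
    writer.start_entry("hello.txt")?;        // entry name assumed
    writer.write_data(b"hello from s-zip")?; // chunked writes keep memory flat
    writer.finish()?;                        // writes the central directory
    Ok(())
}
```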
### Custom compression level

```rust
use s_zip::StreamingZipWriter;

// Compression-level argument assumed; see docs.rs/s-zip for the exact type.
let mut writer = StreamingZipWriter::with_compression("output.zip", 9)?; // max compression
// ... add files ...
writer.finish()?;
```
### Using Zstd compression (requires `zstd-support` feature)
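The original snippet was lost from this section, so the sketch below is only a plausible shape: the per-entry method selector and the `CompressionMethod` name are placeholders, not confirmed s-zip API. Consult docs.rs/s-zip for the real entry point.

```rust
use s_zip::StreamingZipWriter;

let mut writer = StreamingZipWriter::new("logs.zip")?;
// Placeholder call: some way to select Zstd per entry or per archive is
// implied by this README, but the exact name is not shown.
writer.start_entry_with_method("app.log", s_zip::CompressionMethod::Zstd)?;
writer.write_data(b"2024-01-01T00:00:00Z INFO service started")?;
writer.finish()?;
```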
Note: Zstd compression provides better compression ratios than DEFLATE but may have slower decompression on some systems. The reader will automatically detect and decompress Zstd-compressed entries when the zstd-support feature is enabled.
## Password Protection / AES-256 Encryption
s-zip supports WinZip-compatible AES-256 encryption to password-protect sensitive files in your ZIP archives. This feature is perfect for securing confidential data, credentials, or any sensitive information.
### Encryption Features
- 🔐 AES-256-CTR encryption - Industry-standard strong encryption
- 🔑 PBKDF2-HMAC-SHA1 key derivation (1000 iterations)
- ✅ HMAC-SHA1 authentication - Detects tampering and incorrect passwords
- 🌐 WinZip AE-2 format - Compatible with 7-Zip, WinZip, WinRAR, etc.
- 📁 Per-file passwords - Different passwords for different files in same archive
- 🚀 Streaming encryption - Encrypt on-the-fly with constant memory usage
### Basic Encryption Example
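A minimal sketch assembled from the `set_password`/`start_entry`/`write_data` calls in the multi-password example below; the string arguments are illustrative:

```rust
use s_zip::StreamingZipWriter;

let mut writer = StreamingZipWriter::new("secure.zip")?;
writer.set_password("correct-horse-battery-staple"); // choose a strong password
writer.start_entry("credentials.txt")?;              // this entry is encrypted
writer.write_data(b"api_key=...")?;
writer.finish()?;
```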
### Multiple Passwords in One Archive
You can use different passwords for different files in the same ZIP:
```rust
let mut writer = StreamingZipWriter::new("mixed.zip")?;

// Financial files with one password (arguments below are illustrative)
writer.set_password("finance-pass");
writer.start_entry("q3_report.xlsx")?;
writer.write_data(&report_bytes)?;

// Legal files with a different password
writer.set_password("legal-pass");
writer.start_entry("contract.pdf")?;
writer.write_data(&contract_bytes)?;

// Public files without a password
writer.clear_password();
writer.start_entry("readme.txt")?;
writer.write_data(b"public info")?;

writer.finish()?;
```
### Security Specifications
- Encryption: AES-256-CTR (Counter mode)
- Key Derivation: PBKDF2-HMAC-SHA1 with 1000 iterations
- Salt: 16 bytes (randomly generated per file)
- Authentication: HMAC-SHA1 (10-byte authentication code)
- Format: WinZip AE-2 (no CRC for better security)
- Compatibility: Works with 7-Zip, WinZip, WinRAR, Info-ZIP (with AES support)
### Security Best Practices
- Use strong passwords: Minimum 12 characters with mixed case, numbers, symbols
- Different passwords for different security levels: Don't reuse passwords across files
- Store passwords securely: Use environment variables or secret management systems
- Verify integrity: The HMAC authentication ensures files haven't been tampered with
### Performance Impact
Encryption adds overhead but maintains constant memory usage:
| File Size | Overhead | Throughput | Notes |
|---|---|---|---|
| 1 KB | ~80x slower | 8-10 MiB/s | Dominated by key derivation (~950µs) |
| 100 KB | ~23x slower | 20-23 MiB/s | Stable encryption overhead |
| 1 MB+ | ~24-31x slower | 17-23 MiB/s | Network/disk I/O becomes bottleneck |
Memory usage: ✅ No impact - maintains constant 2-5 MB streaming architecture
Best for: Backend services, large files, cloud storage (where network is the bottleneck)
Less suitable for: Real-time applications with <100ms latency requirements (PBKDF2 key derivation alone takes ~950µs per file)
📊 See ENCRYPTION_PERFORMANCE.md for detailed benchmarks
### Decryption Support
Currently, decryption is not yet implemented in the reader. This is planned for future releases. For now, you can extract encrypted ZIPs using:
- 7-Zip: `7z x encrypted.zip`
- WinZip, WinRAR, or other tools that support the WinZip AE-2 format
## Async/Await Support
s-zip supports async/await with Tokio runtime, enabling non-blocking I/O for web servers and cloud applications.
### When to Use Async?
✅ Use Async for:
- Web frameworks (Axum, Actix, Rocket)
- Cloud storage uploads (S3, GCS, Azure)
- Network streams (HTTP, WebSocket)
- Concurrent operations (multiple ZIPs simultaneously)
- Real-time applications
✅ Use Sync for:
- CLI tools and scripts
- Batch processing (single-threaded)
- Maximum throughput (CPU-bound tasks)
### Async Writer Example
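The example body was lost in this README; the sketch below mirrors the sync writer with `.await`, run on Tokio (signatures assumed):

```rust
use s_zip::AsyncStreamingZipWriter;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut writer = AsyncStreamingZipWriter::new("output.zip").await?;
    writer.start_entry("data.json").await?;
    writer.write_data(br#"{"ok":true}"#).await?;
    writer.finish().await?;
    Ok(())
}
```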
### Async with In-Memory (Cloud Upload)

Perfect for HTTP responses or cloud storage:
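A sketch, assuming `from_writer` (named in the migration guide below) accepts an in-memory writer and that `finish` hands the buffer back:

```rust
use s_zip::AsyncStreamingZipWriter;
use std::io::Cursor;

// Build the ZIP entirely in memory, then hand the bytes to an HTTP
// response body or a cloud SDK upload call.
let buffer = Cursor::new(Vec::new());
let mut writer = AsyncStreamingZipWriter::from_writer(buffer);
writer.start_entry("export.csv").await?;
writer.write_data(b"id,name\n1,alice\n").await?;
let buffer = writer.finish().await?;  // assumed to return the inner writer
let zip_bytes = buffer.into_inner();  // full archive, ready to upload
```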
### Streaming from Async Sources

Stream files directly without blocking:
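A sketch of chunked copying from a Tokio file into the writer, so only one buffer's worth of data is resident at a time (writer methods assumed as above):

```rust
use s_zip::AsyncStreamingZipWriter;
use tokio::fs::File;
use tokio::io::AsyncReadExt;

let mut writer = AsyncStreamingZipWriter::new("bundle.zip").await?;
writer.start_entry("big.log").await?;

let mut src = File::open("big.log").await?;
let mut buf = vec![0u8; 64 * 1024]; // 64 KiB chunks
loop {
    let n = src.read(&mut buf).await?;
    if n == 0 { break; }
    writer.write_data(&buf[..n]).await?;
}
writer.finish().await?;
```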
### Async Reader

Read ZIP files asynchronously with minimal memory usage. Supports reading from local files, S3, HTTP, or any `AsyncRead + AsyncSeek` source.
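A sketch using the `open` and `read_entry_by_name` calls named in the migration guide; exact signatures assumed:

```rust
use s_zip::AsyncStreamingZipReader;

let mut reader = AsyncStreamingZipReader::open("archive.zip").await?;
let data = reader.read_entry_by_name("report.csv").await?;
println!("read {} bytes", data.len());
```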
### Reading from S3 (NEW in v0.6.0!)

Read ZIP files directly from S3 without downloading to disk:
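A sketch combining `S3ZipReader` and `GenericAsyncZipReader` (both named in this README) with the builder fields shown in the MinIO example below; the AWS client setup and builder field names are assumptions:

```rust
use s_zip::{GenericAsyncZipReader, S3ZipReader};
use aws_sdk_s3::Client;

// Standard AWS SDK configuration from the environment.
let config = aws_config::load_from_env().await;
let client = Client::new(&config);

let s3_reader = S3ZipReader::builder()
    .client(client)              // builder fields assumed
    .bucket("my-bucket")
    .key("archives/big.zip")
    .build()
    .await?;

// Byte-range GETs let the reader seek without downloading the whole ZIP.
let mut zip = GenericAsyncZipReader::new(s3_reader).await?;
let data = zip.read_entry_by_name("only-this-file.csv").await?;
```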
Key Benefits:
- ✅ No local disk - Reads directly from S3 using byte-range GET requests
- ✅ Constant memory - ~5-10MB regardless of ZIP size
- ✅ Random access - Jump to any file without downloading entire ZIP
- ✅ Generic API - Works with any `AsyncRead + AsyncSeek` source (HTTP, in-memory, custom)
Performance Note: For small files (<50MB), downloading the entire ZIP first is faster due to network latency. For large archives or when reading only a few files, streaming from S3 provides significant memory savings.
### Reading from HTTP/Custom Sources

The generic async reader works with any `AsyncRead + AsyncSeek` source:
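A sketch with an in-memory `Cursor`, the simplest stand-in for a custom source such as an HTTP range reader (Tokio implements `AsyncRead` and `AsyncSeek` for `std::io::Cursor`):

```rust
use s_zip::GenericAsyncZipReader;
use std::io::Cursor;

let bytes = std::fs::read("archive.zip")?; // pretend these arrived over HTTP
let mut reader = GenericAsyncZipReader::new(Cursor::new(bytes)).await?;
let data = reader.read_entry_by_name("notes.txt").await?;
```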
### Concurrent ZIP Creation

Create multiple ZIPs simultaneously (5x faster than sequential):
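A sketch using Tokio's `JoinSet` (the import hinted at in the original example) to build several archives concurrently; the writer's error type is assumed here:

```rust
use s_zip::AsyncStreamingZipWriter;
use tokio::task::JoinSet;

let mut tasks = JoinSet::new();
for i in 0..5 {
    tasks.spawn(async move {
        let mut w = AsyncStreamingZipWriter::new(format!("out-{i}.zip")).await?;
        w.start_entry("data.txt").await?;
        w.write_data(format!("payload {i}").as_bytes()).await?;
        w.finish().await?;
        Ok::<_, std::io::Error>(()) // error type assumed
    });
}
while let Some(result) = tasks.join_next().await {
    result??; // surface both panics and ZIP errors
}
```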
### Performance: Async vs Sync
| Scenario | Sync | Async | Advantage |
|---|---|---|---|
| Local disk (5MB) | 6.7ms | 7.1ms | ≈ Same (~6% overhead) |
| In-memory (100KB) | 146µs | 136µs | Async 7% faster |
| Network upload (5×50KB) | 1053ms | 211ms | Async 5x faster 🚀 |
| 10 concurrent operations | 70ms | 10-15ms | Async 4-7x faster 🚀 |
See PERFORMANCE.md for detailed benchmarks.
## Cloud Storage Streaming
Stream ZIP files directly to/from AWS S3 or Google Cloud Storage without writing to local disk. Perfect for serverless, containers, and cloud-native applications.
### AWS S3 Streaming (Write)
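A sketch wiring `S3ZipWriter` (builder fields from the Advanced S3 Configuration section below) into the async writer via `from_writer`; the client setup and exact signatures are assumptions:

```rust
use s_zip::{AsyncStreamingZipWriter, S3ZipWriter};
use aws_sdk_s3::Client;

let config = aws_config::load_from_env().await;
let client = Client::new(&config);

let s3_writer = S3ZipWriter::builder()
    .client(client)
    .bucket("my-bucket")
    .key("exports/report.zip")
    .build()
    .await?;

// Bytes stream straight into an S3 multipart upload; nothing touches local disk.
let mut writer = AsyncStreamingZipWriter::from_writer(s3_writer);
writer.start_entry("report.csv").await?;
writer.write_data(b"id,total\n1,9000\n").await?;
writer.finish().await?;
```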
Key Benefits:
- ✅ No local disk usage - Streams directly to S3
- ✅ Constant memory - ~5-10MB regardless of ZIP size
- ✅ S3 multipart upload - Handles files >5GB automatically
- ✅ Configurable part size - Default 5MB, customize up to 5GB
### AWS S3 Streaming (Read - NEW in v0.6.0!)

Read ZIP files directly from S3 without downloading. The reader setup is the same as the sketch shown under "Reading from S3 (NEW in v0.6.0!)" above.
Key Benefits:
- ✅ No local download - Uses S3 byte-range GET requests
- ✅ Constant memory - ~5-10MB for any ZIP size
- ✅ Random access - Read any file without downloading entire archive
- ✅ Cost effective - Only transfer bytes you need
### Google Cloud Storage Streaming
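This README does not show the GCS adapter's type name, so the sketch below uses a placeholder `GcsZipWriter`; treat every name in it as an assumption and check docs.rs/s-zip for the actual type:

```rust
use s_zip::AsyncStreamingZipWriter;

// Placeholder names throughout -- the real adapter/builder may differ.
let gcs_writer = GcsZipWriter::builder()
    .bucket("my-bucket")
    .object("exports/report.zip")
    .build()
    .await?;

let mut writer = AsyncStreamingZipWriter::from_writer(gcs_writer);
writer.start_entry("report.csv").await?;
writer.write_data(b"id,total\n1,9000\n").await?;
writer.finish().await?;
```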
Key Benefits:
- ✅ No local disk usage - Streams directly to GCS
- ✅ Constant memory - ~8-12MB regardless of ZIP size
- ✅ Resumable upload - 8MB chunks (256KB aligned)
- ✅ Configurable chunk size - Customize for performance
### Performance: Async Streaming vs Sync Upload
Real-world comparison on AWS S3 (20MB data):
| Method | Time | Memory | Description |
|---|---|---|---|
| Sync (in-memory + upload) | 368ms | ~20MB | Create ZIP in RAM, then upload |
| Async (direct streaming) | 340ms | ~10MB | Stream directly to S3 |
| Speedup | 1.08x faster | 50% less memory | ✅ Better for large files |
For 100MB+ files:
- 🚀 Async streaming: Constant 10MB memory
- ⚠️ Sync approach: 100MB+ memory (entire ZIP in RAM)
When to use cloud streaming:
- ✅ Serverless functions (Lambda, Cloud Functions)
- ✅ Containers with limited memory
- ✅ Large archives (>100MB)
- ✅ Cloud-native architectures
- ✅ ETL pipelines, data exports
### MinIO / S3-Compatible Services (NEW in v0.7.0!)

Stream ZIPs directly to MinIO, Cloudflare R2, DigitalOcean Spaces, Backblaze B2, and other S3-compatible services:
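A write-side sketch using the same builder fields as the read example below (`endpoint_url`, `bucket`, `key`); values are illustrative:

```rust
use s_zip::{AsyncStreamingZipWriter, S3ZipWriter};

let s3_writer = S3ZipWriter::builder()
    .endpoint_url("http://localhost:9000") // MinIO endpoint
    .bucket("backups")
    .key("nightly.zip")
    .build()
    .await?;

let mut writer = AsyncStreamingZipWriter::from_writer(s3_writer);
writer.start_entry("db.dump").await?;
writer.write_data(b"dump contents")?;
writer.finish().await?;
```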
Read from MinIO:

```rust
use s_zip::{GenericAsyncZipReader, S3ZipReader};

let reader = S3ZipReader::builder()
    .endpoint_url("http://localhost:9000") // values illustrative
    .bucket("backups")
    .key("nightly.zip")
    .build()
    .await?;

let mut zip = GenericAsyncZipReader::new(reader).await?;
let data = zip.read_entry_by_name("db.dump").await?;
```
Supported S3-Compatible Services:
| Service | Endpoint Example |
|---|---|
| MinIO | `http://localhost:9000` |
| Cloudflare R2 | `https://<account_id>.r2.cloudflarestorage.com` |
| DigitalOcean Spaces | `https://<region>.digitaloceanspaces.com` |
| Backblaze B2 | `https://s3.<region>.backblazeb2.com` |
| Linode Object Storage | `https://<region>.linodeobjects.com` |
### Advanced S3 Configuration
```rust
use s_zip::S3ZipWriter;

// Custom part size for large files (arguments illustrative)
let writer = S3ZipWriter::builder()
    .client(client)
    .bucket("my-bucket")
    .key("huge.zip")
    .part_size(100 * 1024 * 1024) // 100MB parts for huge files
    .build()
    .await?;

// Or with a custom endpoint for S3-compatible services
let writer = S3ZipWriter::builder()
    .endpoint_url("https://nyc3.digitaloceanspaces.com")
    .region("nyc3")
    .bucket("my-bucket")
    .key("huge.zip")
    .build()
    .await?;
```
See examples:
- `examples/cloud_s3.rs` - S3 streaming example
- `examples/async_vs_sync_s3.rs` - Performance comparison
## Using Arbitrary Writers (Advanced)

s-zip supports writing to any type that implements `Write + Seek`, not just files. This enables:
- In-memory ZIP creation (`Vec<u8>`, `Cursor`)
- Network streaming (TCP streams with buffering)
- Custom storage backends (S3, databases, etc.)
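A minimal in-memory sketch (method names as elsewhere in this README; `finish` returning the inner writer is an assumption):

```rust
use s_zip::StreamingZipWriter;
use std::io::Cursor;

let buffer = Cursor::new(Vec::new());
let mut writer = StreamingZipWriter::from_writer(buffer);
writer.start_entry("small.txt")?;
writer.write_data(b"fits comfortably in RAM")?;
let buffer = writer.finish()?;        // assumed to return the inner writer
let zip_bytes = buffer.into_inner();  // the complete archive
```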
⚠️ IMPORTANT - Memory Usage by Writer Type:

| Writer Type | Memory Usage | Best For |
|---|---|---|
| File (`StreamingZipWriter::new(path)`) | ✅ ~2-5 MB constant | Large files, production use |
| Network streams (TCP, pipes) | ✅ ~2-5 MB constant | Streaming over network |
| `Vec`/`Cursor` (`from_writer()`) | ⚠️ ENTIRE ZIP IN RAM | Small archives only (<100MB) |
⚠️ Critical Warning for Vec/Cursor:
When using `Vec<u8>` or `Cursor<Vec<u8>>` as the writer, the entire compressed ZIP file is stored in memory. While the compressor itself uses only ~2-5 MB for its internal buffer, the final output accumulates in the `Vec`. Only use this for small archives or when you have sufficient RAM.
Recommended approach for large files:
- Use `StreamingZipWriter::new(path)` to write to disk (constant ~2-5 MB memory)
- Use network streams for real-time transmission
- Reserve `Vec<u8>`/`Cursor` for small temporary ZIPs (<100MB)

The implementation uses a 1MB buffer threshold to periodically flush compressed data to the writer, keeping compression memory low (~2-5 MB) for all writer types. However, in-memory writers like `Vec<u8>` still accumulate the full output.
See examples/arbitrary_writer.rs for more examples.
## Supported Compression Methods
| Method | Description | Default | Feature Flag | Best For |
|---|---|---|---|---|
| DEFLATE (8) | Standard ZIP compression | ✓ | Always available | Text, source code, JSON, XML, CSV, XLSX |
| Stored (0) | No compression | - | Always available | Already compressed files (JPG, PNG, MP4, PDF) |
| Zstd (93) | Modern compression algorithm | - | `zstd-support` | All text/data files, logs, databases |
### Compression Method Selection Guide
Use DEFLATE (default) when:
- ✅ Maximum compatibility required (all ZIP tools support it)
- ✅ Working with: text files, source code, JSON, XML, CSV, HTML, XLSX
- ✅ Standard ZIP format compliance needed
Use Zstd when:
- ⚡ Best performance: 3.3x faster compression, 11-27x better compression ratio
- ✅ Working with: server logs, database dumps, repetitive data, large text files
- ✅ Backend/internal systems (don't need old tool compatibility)
- ✅ Processing large volumes of data
Use Stored (no compression) when:
- ✅ Files are already compressed: JPEG, PNG, GIF, MP4, MOV, PDF, ZIP, GZ
- ✅ Need fastest possible archive creation
- ✅ CPU resources are limited
## Performance Benchmarks
s-zip includes comprehensive benchmarks to compare compression methods:
```sh
# Run all benchmarks with Zstd support (command assumed; feature name per the table above)
cargo bench --features zstd-support
# Or run individual benchmark suites (bench target names are project-specific)
cargo bench --bench <suite_name>
```
Benchmarks measure:
- Compression speed: Write throughput for different compression methods and levels
- Decompression speed: Read throughput for various compressed formats
- Data patterns: Highly compressible text, random data, and mixed workloads
- File sizes: From 1KB to 10MB to test scaling characteristics
- Multiple entries: Performance with 100+ files in a single archive
Results are saved to target/criterion/ with HTML reports showing detailed statistics, comparisons, and performance graphs.
### Quick Comparison Results

#### File Size (1MB Compressible Data)
| Method | Compressed Size | Ratio | Speed |
|---|---|---|---|
| DEFLATE level 6 | 3.16 KB | 0.31% | ~610 MiB/s |
| DEFLATE level 9 | 3.16 KB | 0.31% | ~494 MiB/s |
| Zstd level 3 | 281 bytes | 0.03% | ~2.0 GiB/s ⚡ |
| Zstd level 10 | 358 bytes | 0.03% | ~370 MiB/s |
Key Insights:
- ✅ Zstd level 3 is 11x smaller and 3.3x faster than DEFLATE on repetitive data
- ✅ For 10MB data: Zstd = 1.12 KB vs DEFLATE = 29.97 KB (27x better!)
- ✅ Random data: All methods stay at ~100% of original size (incompressible data is handled efficiently across the board)
- ✅ Memory: ~2-5 MB constant regardless of file size
- ✅ CPU: Zstd level 3 uses less CPU than DEFLATE level 9
💡 Recommendation: Use Zstd level 3 for best performance and compression. Only use DEFLATE when compatibility with older tools is required.
📊 Full Analysis: See BENCHMARK_RESULTS.md for detailed performance data including:
- Complete speed benchmarks (1KB to 10MB)
- Memory profiling
- CPU usage analysis
- Multiple compression levels comparison
- Random vs compressible data patterns
## Migration Guide

### Upgrading from v0.6.x to v0.7.0
Zero Breaking Changes! The v0.7.0 release is fully backward compatible.
What's New:
- 🔐 AES-256 encryption support (opt-in via `encryption` feature)
- 🔑 Password-protect files with WinZip-compatible AE-2 format
- 🚀 Streaming encryption with constant memory usage (~2-5 MB)
- 📁 Per-file passwords in same archive
- ✅ All existing code works unchanged
Migration:
```toml
[dependencies]
# Just update the version - existing code works as-is!
s-zip = "0.7"

# Or add encryption support
s-zip = { version = "0.7", features = ["encryption"] }
```
New APIs (Optional):
```rust
// Enable encryption for files (string arguments illustrative)
let mut writer = StreamingZipWriter::new("secure.zip")?;
writer.set_password("strong-password-here");
writer.start_entry("secret.txt")?;
writer.write_data(b"classified")?;

// Mix encrypted and unencrypted files
writer.clear_password();
writer.start_entry("public.txt")?;
writer.write_data(b"open data")?;
writer.finish()?;
```
### Upgrading from v0.5.x to v0.6.0
Zero Breaking Changes! The v0.6.0 release is fully backward compatible.
What's New:
- ✅ Generic async ZIP reader (`GenericAsyncZipReader<R>`)
- ✅ Read ZIPs from any `AsyncRead + AsyncSeek` source (S3, HTTP, in-memory, files)
- ✅ `S3ZipReader` for direct S3 streaming reads
- ✅ Unified architecture - eliminated duplicate code
- ✅ All existing sync and async code works unchanged
Migration:
```toml
[dependencies]
# Just update the version - existing code works as-is!
s-zip = "0.7"

# Or with features
s-zip = { version = "0.7", features = ["async", "cloud-s3"] }
```
New APIs (Optional):
```rust
// v0.5.x - Still works!
let mut reader = AsyncStreamingZipReader::open("archive.zip").await?;

// v0.6.0+ - Read from S3 (constructor arguments assumed)
let s3_reader = S3ZipReader::new(client, "bucket", "archive.zip").await?;
let mut reader = GenericAsyncZipReader::new(s3_reader).await?;

// v0.6.0+ - Read from any AsyncRead + AsyncSeek source
let mut reader = GenericAsyncZipReader::new(source).await?;
```
### Upgrading from v0.4.x to v0.5.0
Zero Breaking Changes! The v0.5.0 release is fully backward compatible.
What's New:
- ✅ AWS S3 streaming support (opt-in via `cloud-s3` feature)
- ✅ Google Cloud Storage support (opt-in via `cloud-gcs` feature)
- ✅ Direct cloud upload without local disk usage
- ✅ Constant memory usage for cloud uploads (~5-10MB)
- ✅ All existing sync and async code works unchanged
Migration Options:
#### Option 1: Keep Using Existing Code (No Changes)

```toml
[dependencies]
s-zip = "0.5"  # Existing code works as-is
```
Your existing code continues to work exactly as before!
#### Option 2: Add Cloud Storage Support

```toml
[dependencies]
# AWS S3 only
s-zip = { version = "0.5", features = ["cloud-s3"] }

# Google Cloud Storage only
s-zip = { version = "0.5", features = ["cloud-gcs"] }

# Both S3 and GCS
s-zip = { version = "0.5", features = ["cloud-all"] }
```
API Comparison:
```rust
// Local file (v0.4.x and later)
let mut writer = AsyncStreamingZipWriter::new("output.zip").await?;
writer.start_entry("file.txt").await?;
writer.write_data(b"data").await?;
writer.finish().await?;

// AWS S3 (v0.5.0+) - constructor arguments assumed
let s3_writer = S3ZipWriter::new(client, "bucket", "output.zip").await?;
let mut writer = AsyncStreamingZipWriter::from_writer(s3_writer);
writer.start_entry("file.txt").await?;
writer.write_data(b"data").await?;
writer.finish().await?;
```
### Upgrading from v0.3.x to v0.4.0+
All v0.3.x code is compatible with v0.7.0. Just update the version number and optionally add new features.
## Examples
Check out the `examples/` directory for complete working examples:

Sync Examples:
- `basic.rs` - Simple ZIP creation
- `arbitrary_writer.rs` - In-memory ZIPs
- `zstd_compression.rs` - Zstd compression

Encryption Examples:
- `encryption_basic.rs` - Basic password protection (NEW!)
- `encryption_advanced.rs` - Multiple passwords per archive (NEW!)

Async Examples:
- `async_basic.rs` - Basic async usage
- `async_streaming.rs` - Stream files to ZIP
- `async_in_memory.rs` - Cloud upload simulation
- `async_reader_advanced.rs` - Advanced async reading (NEW!)
- `async_http_reader.rs` - Read from HTTP/in-memory (NEW!)
- `concurrent_demo.rs` - Concurrent creation
- `network_simulation.rs` - Network I/O demo

Cloud Storage Examples:
- `cloud_s3.rs` - AWS S3 streaming upload
- `async_vs_sync_s3.rs` - Performance comparison (upload + download)
- `verify_s3_upload.rs` - Verify S3 uploads
Run examples (commands assumed from the example and feature names above):

```sh
# Sync examples
cargo run --example basic
# Encryption examples
cargo run --example encryption_basic --features encryption
# Async examples
cargo run --example async_basic --features async
# Cloud storage examples (requires AWS credentials)
cargo run --example cloud_s3 --features cloud-s3
```
## Documentation
- API Documentation: https://docs.rs/s-zip
- Performance Benchmarks: PERFORMANCE.md
- Benchmark Results: BENCHMARK_RESULTS.md
## License
MIT License - see LICENSE file for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Author
Ton That Vu - @KSD-CO