docker-image-pusher 0.3.3

A memory-optimized Docker image transfer tool for handling large images efficiently
# Docker Image Pusher

A memory-optimized Docker image transfer tool designed to handle large Docker images without excessive memory usage. This tool addresses the common problem of memory exhaustion when pulling or pushing multi-gigabyte Docker images.

## 🎯 Problem Statement

Traditional Docker image tools often load entire layers into memory, which can cause:
- **Memory exhaustion** with large images (>1GB)
- **System instability** when processing multiple large layers
- **Failed transfers** due to insufficient RAM
- **Poor performance** on resource-constrained systems

## 🚀 Solution

This tool implements **streaming-based layer processing** using the OCI client library:

- **Streaming Downloads**: Layers are streamed directly to disk without loading into memory
-**Sequential Processing**: Processes one layer at a time to minimize memory footprint  
-**Chunked Uploads**: Large layers (>100MB) are read in 50MB chunks during upload
-**Local Caching**: Efficient caching system for faster subsequent operations
-**Progress Monitoring**: Real-time feedback on transfer progress and layer sizes

## 📋 Prerequisites

- **Rust**: Version 1.70 or later
- **Network Access**: To source and target registries
- **Disk Space**: Sufficient space for caching large images

## 🛠️ Installation

### From Source

```bash
git clone <repository-url>
cd docker-image-pusher
cargo build --release
```

The compiled binary will be available at `target/release/docker-image-pusher.exe`

## 📖 Usage

### Basic Commands

#### Pull and Cache an Image

```bash
docker-image-pusher pull <source-image>
```

**Examples:**
```bash
# Pull from Docker Hub
docker-image-pusher pull nginx:latest

# Pull from private registry  
docker-image-pusher pull registry.example.com/app:v1.0

# Pull large image (this is where memory optimization shines)
docker-image-pusher pull registry.cn-beijing.aliyuncs.com/yoce/vllm-openai:v0.9.0.1
```

#### Push Cached Image to Registry

```bash
docker-image-pusher push <source-image> <target-image> --username <user> --password <pass>
```

**Examples:**
```bash
# Push to Docker Hub
docker-image-pusher push nginx:latest myregistry/nginx:latest --username myuser --password mypass

# Push to private registry
docker-image-pusher push app:v1.0 registry.company.com/app:v1.0 --username deploy --password secret
```

### Advanced Usage

#### Environment Variables

You can also set credentials via environment variables:
```bash
export DOCKER_USERNAME=myuser
export DOCKER_PASSWORD=mypass
docker-image-pusher push nginx:latest myregistry/nginx:latest --username $DOCKER_USERNAME --password $DOCKER_PASSWORD
```

## 🏗️ Architecture

### Memory Optimization Strategy

```
Traditional Approach (High Memory):
[Registry] → [Full Image in Memory] → [Local Storage]
❌ Memory usage scales with image size
❌ Can exceed available RAM with large images

Optimized Approach (Low Memory):  
[Registry] → [Stream Layer by Layer] → [Local Storage]
✅ Constant memory usage regardless of image size
✅ Handles multi-GB images efficiently
```

### Cache Structure

Images are cached in `.cache/` directory with the following structure:

```
.cache/
└── {sanitized_image_name}/
    ├── index.json              # Metadata and layer list
    ├── manifest.json           # OCI image manifest
    ├── config_{digest}.json    # Image configuration
    ├── {layer_digest_1}        # Layer file 1
    ├── {layer_digest_2}        # Layer file 2
    └── ...                     # Additional layers
```

### Processing Flow

#### Pull Operation:
1. **Fetch Manifest** - Download image metadata (~1-5KB)
2. **Create Cache Structure** - Set up local directories
3. **Stream Layers** - Download each layer directly to disk
4. **Cache Metadata** - Store manifest and configuration
5. **Create Index** - Generate lookup metadata

#### Push Operation:
1. **Authenticate** - Connect to target registry
2. **Read Cache** - Load cached image metadata
3. **Upload Layers** - Transfer layers with size-based optimization
4. **Upload Config** - Transfer image configuration
5. **Push Manifest** - Complete the image transfer

### Layer Processing Strategies

| Layer Size | Strategy | Memory Usage | Description |
|------------|----------|--------------|-------------|
| < 100MB | Direct Read | ~Layer Size | Read entire layer into memory |
| > 100MB | Chunked Read | ~50MB | Read in 50MB chunks with delays |
| Any Size | Streaming | ~Buffer Size | Direct stream to/from disk |

## 🔧 Configuration

### Client Configuration

The tool uses these default settings:

```rust
// Platform resolver for multi-arch images
platform_resolver = linux_amd64_resolver

// Authentication methods
- Anonymous (for public registries)
- Basic Auth (username/password)

// Chunk size for large layers
chunk_size = 50MB

// Rate limiting delays
large_layer_delay = 200ms
chunk_delay = 10ms
```

### Customization

You can modify these settings in `src/main.rs`:

```rust
// Adjust chunk size for very large layers
let chunk_size = 100 * 1024 * 1024; // 100MB chunks

// Modify size threshold for chunked processing  
if layer_size_mb > 50.0 { // Lower threshold
    // Use chunked approach
}

// Adjust rate limiting
tokio::time::sleep(tokio::time::Duration::from_millis(500)).await; // Longer delay
```

## 📊 Performance Comparison

### Memory Usage (Processing 5GB Image)

| Method | Peak Memory | Notes |
|--------|-------------|-------|
| Traditional Docker | ~5.2GB | Loads layers into memory |
| **This Tool** | ~50MB | Streams with chunked processing |

### Transfer Speed

- **Network bound**: Performance limited by network speed
- **Consistent memory**: No memory-related slowdowns
- **Parallel-safe**: Can run multiple instances without memory conflicts

## 🐛 Troubleshooting

### Common Issues

#### "Authentication failed"
```bash
Error: Push error: Authentication failed: ...
```
**Solution**: Verify username/password and registry permissions

#### "Cache not found"  
```bash
Error: Cache not found
```
**Solution**: Run `pull` command first to cache the image

#### "Failed to create cache directory"
```bash
Error: Cache error: Failed to create cache directory: ...
```
**Solution**: Check disk space and write permissions

#### Memory Issues (Still occurring)
If you're still experiencing memory issues:

1. **Check chunk size**: Reduce chunk size in code
2. **Monitor disk space**: Ensure sufficient space for caching
3. **Close other applications**: Free up system memory
4. **Use sequential processing**: Avoid concurrent operations

### Debug Mode

Add debug logging by setting environment variable:
```bash
RUST_LOG=debug docker-image-pusher pull nginx:latest
```

## 🤝 Contributing

### Development Setup

```bash
git clone <repository-url>
cd docker-image-pusher
cargo build
cargo test
```

### Code Structure

- `main.rs` - Main application entry point and CLI handling
- `cache_image()` - Pull and caching logic with streaming
- `push_cached_image()` - Push logic with memory optimization
- `PusherError` - Custom error types for better error handling

### Adding Features

1. **New authentication methods**: Extend `RegistryAuth` usage
2. **Progress bars**: Add progress indication for long transfers
3. **Compression**: Add layer compression/decompression support
4. **Parallel processing**: Implement safe concurrent layer transfers

## 📄 License

[Add your license information here]

## 🔗 Dependencies

- **oci-client**: OCI registry client with streaming support
- **tokio**: Async runtime for concurrent operations
- **clap**: Command-line argument parsing
- **serde_json**: JSON serialization for metadata
- **thiserror**: Structured error handling

## 📈 Future Enhancements

- [ ] Progress bars for long transfers
- [ ] Resume interrupted transfers
- [ ] Compression optimization
- [ ] Multi-registry synchronization
- [ ] Garbage collection for cache
- [ ] Configuration file support
- [ ] Integration with CI/CD pipelines

---

**Happy Docker image transferring! 🐳**