# Shoebox
A local S3-compatible server for your files. Find duplicates, verify integrity, zero config.

## Install

```shell
# Docker (recommended) — image name below is assumed; check the project's releases
docker run -p 9000:9000 -v ~/Photos:/data ghcr.io/deepjoy/shoebox

# Or via Cargo (crate name assumed)
cargo install shoebox
```
## Quick Start

```shell
# Point Shoebox at a directory (binary name assumed)
shoebox ~/Photos

# Or with Docker (image name assumed)
docker run -p 9000:9000 -v ~/Photos:/data ghcr.io/deepjoy/shoebox

# Output:
# Serving 1 bucket on http://localhost:9000
#   photos → /home/user/Photos
```
Files already on disk appear in S3 immediately; no uploading is required. Credentials are generated on first run and printed in the output. To enable browser access (CORS), follow the on-screen instructions. To work with your files from the AWS CLI:
```shell
# Configure credentials (printed on first run)
aws configure set aws_access_key_id <printed-access-key>
aws configure set aws_secret_access_key <printed-secret-key>

# List objects
aws --endpoint-url http://localhost:9000 s3 ls s3://photos
```
## Features
- S3-compatible API — works with AWS CLI, rclone, and any S3 SDK out of the box
- Zero-config startup — just point at directories, no cloud account or configuration needed
- Duplicate detection — find and merge duplicate files and directories via content hashing
- Integrity verification — scheduled checks to detect bit rot and data corruption
- Filesystem sync — background scanning with move detection, real-time file watching
- Authentication — AWS Signature V4, per-bucket credentials, pre-signed URLs
- Multipart uploads — full support for large file uploads
- CORS — browser-based clients work out of the box
- Webhook notifications — get notified on object events (put, delete, copy)
- Single binary, ~18MB — no runtime dependencies
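As a concrete example of the S3 compatibility above, an rclone remote can point at Shoebox with a minimal config. The remote name is arbitrary, and the credential values are placeholders; use the keys Shoebox prints on first run:

```ini
[shoebox]
type = s3
provider = Other
endpoint = http://localhost:9000
access_key_id = <printed-access-key>
secret_access_key = <printed-secret-key>
```

After that, `rclone ls shoebox:photos` lists the bucket like any other remote.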
## Duplicate Detection

Shoebox hashes every file (SHA-256) in the background, so finding duplicates is a single query.
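The underlying idea can be sketched in a few lines of Python. This is illustrative only, not Shoebox's actual API; it just shows the content-hashing principle the feature builds on:

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Group files under `root` by the SHA-256 of their contents.

    Returns {hexdigest: [paths]} for groups containing more than one file.
    """
    groups = defaultdict(list)
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                # Hash in 1 MiB chunks so large files are never fully in memory
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            groups[digest.hexdigest()].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Once every file's hash is stored, this grouping becomes a plain index lookup rather than a rescan.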
## Webapp
A companion browser UI is available at https://deepjoy.github.io/shoebox-webapp/.
Browse buckets, view objects, and see duplicate groups visually — no CLI needed. The webapp talks directly to your local Shoebox server via the S3 API.
CORS setup is required for browser access. Shoebox prints the exact command on startup; just copy and run it.
## Who It's For
- Developers — test S3 integrations without cloud dependencies, work offline
- Home users — expose NAS storage to S3-compatible backup tools, find duplicates with a single query
- Archivists — verify file integrity with content hashes, detect bit rot
- Privacy-conscious users — keep files local, no account required, no telemetry
## Comparison
| Concern | Cloud S3 | MinIO | SeaweedFS | Garage | Shoebox |
|---|---|---|---|---|---|
| Primary strength | Scalability, AWS ecosystem | High performance, enterprise | Small files, high throughput | Simplicity, geo-replication | Existing files, zero config |
| Best for | Production workloads | AI/ML, large data (TB/PB) | Data lakes, file storage | Edge/distributed, low ops | Local dev, NAS, home lab |
| Architecture | Managed service | Specialized nodes | Master/volume servers | Homogeneous nodes | Single process |
| Setup | Account + IAM | Docker + config | Docker + config | Docker + config | Single command |
| Data location | Cloud | MinIO data dir | SeaweedFS volumes | Garage data dir | Your existing files |
| File visibility | S3 only | S3 only | S3, FUSE, WebDAV | S3 only | Filesystem + S3 |
| Offline use | No | Yes | Yes | Yes | Yes |
| Binary size | N/A | ~100MB | ~40MB | ~25MB | ~18MB |
| Duplicate detection | No | No | No | No | Built-in |
| Integrity checks | Yes (default checksums) | Yes (bitrot healing) | Limited (CRC) | Yes (scrub) | Built-in (scheduled) |
| Max recommended scale | Unlimited | Petabytes | Petabytes | Petabytes | ~10TB |
See docs/why-shoebox.md for the full story.
## When Not to Use Shoebox
See docs/when-not-to-use-shoebox.md for an honest assessment of limitations, including:
- Strong consistency requirements
- Distributed / multi-node storage
- >10TB of data
- Enterprise S3 features (object lock, lifecycle policies, versioning)
- High-throughput ingestion (thousands of files/second)
## Documentation
- Quickstart — Running in 5 minutes
- Installation — Docker, cargo install, from source
- User Guides — Configuration, credentials, S3 compatibility, and more
## Contributing
See CONTRIBUTING.md for development setup and guidelines.
## Security
See SECURITY.md for the security model and how to report vulnerabilities.
## License
MIT
## Background
I had 2TB of photos across 3 drives — backups of backups, originals I was afraid to delete. I set out to find duplicate photos and accidentally designed a local S3 server. If an object store knows the content hash of every file, duplicates are just a query. This is a personal project built in public — expect breaking changes before 1.0. If you have thoughts on the approach, open an issue or start a discussion.