
Shoebox

CI crates.io Docker MIT license

A local S3-compatible server for your files. Find duplicates, verify integrity, zero config.

Shoebox webapp — browsing a bucket

Install

# Docker (recommended)
docker pull ghcr.io/deepjoy/shoebox:latest

# Or via Cargo
cargo install shoebox

Quick Start

# Point Shoebox at a directory
shoebox ~/Photos

# Or with Docker
docker run -it --rm -p 9000:9000 -v ~/Photos:/photos ghcr.io/deepjoy/shoebox /photos

# Output:
# Serving 1 bucket on http://localhost:9000
#   photos → /home/user/Photos

Files already on disk appear in S3 immediately — no uploading required. Credentials are generated on first run and printed in the output. To enable browser access (CORS), follow the on-screen instructions — or use the AWS CLI:

# Configure credentials (printed on first run)
aws configure --profile shoebox

# List objects
aws --profile shoebox --endpoint-url http://localhost:9000 s3 ls s3://photos/


Features

  • S3-compatible API — works with AWS CLI, rclone, and any S3 SDK out of the box
  • Zero-config startup — just point at directories, no cloud account or configuration needed
  • Duplicate detection — find and merge duplicate files and directories via content hashing
  • Integrity verification — scheduled checks to detect bit rot and data corruption
  • Filesystem sync — background scanning with move detection, real-time file watching
  • Authentication — AWS Signature V4, per-bucket credentials, pre-signed URLs
  • Multipart uploads — full support for large file uploads
  • CORS — browser-based clients work out of the box
  • Webhook notifications — get notified on object events (put, delete, copy)
  • Single binary, ~18MB — no runtime dependencies
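Webhook notifications can be consumed by any HTTP endpoint. Here is a minimal receiver sketch in Python's standard library — note that the payload shape shown in the comment is an assumption for illustration, not Shoebox's documented schema:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    """Collects object-event notifications (put, delete, copy) sent as JSON POSTs."""

    events = []  # events received so far, kept for demonstration

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length))
        WebhookHandler.events.append(event)
        # Assumed payload shape: {"event": "put", "bucket": "photos", "key": "sunset.jpg"}
        print(f"{event.get('event')}: s3://{event.get('bucket')}/{event.get('key')}")
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence the default per-request access log

# To run standalone:
#   HTTPServer(("127.0.0.1", 8080), WebhookHandler).serve_forever()
```

Point Shoebox's webhook URL at the receiver's address and each object event arrives as one POST.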

Duplicate Detection

Shoebox hashes every file (SHA-256) in the background. Finding duplicates is a query:

$ shoebox duplicates ~/Photos --format table

Duplicate groups (2 groups, 5 files, 3 duplicates):

  Hash (SHA-256)       Size   Files
  ─────────────────────────────────────────────
  a13f…c8d1            32 B   3 copies
    originals/sunset.txt
    backup/sunset.txt        ← duplicate
    edited/sunset-copy.txt   ← duplicate

  7b2e…f104            26 B   2 copies
    originals/mountain.txt
    backup/mountain.txt      ← duplicate
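The idea is simple enough to sketch: group files by content hash, and any group with more than one member is a set of duplicates. A standalone Python sketch of that query (illustrative only — not Shoebox's actual implementation):

```python
import hashlib
import os
from collections import defaultdict

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 without loading it all into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    """Map content hash -> paths, keeping only hashes seen more than once."""
    by_hash = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[sha256_of(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Shoebox does the hashing incrementally in the background, so the query itself is cheap by the time you run it.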

Webapp

A companion browser UI is available at https://deepjoy.github.io/shoebox-webapp/.

Browse buckets, view objects, and see duplicate groups visually — no CLI needed. The webapp talks directly to your local Shoebox server via the S3 API.

CORS setup (required for browser access) — Shoebox prints this command on startup, just copy and run it:

export AWS_ACCESS_KEY_ID='<from startup output>'
export AWS_SECRET_ACCESS_KEY='<from startup output>'
export BUCKET='photos'

curl -X PUT "http://localhost:9000/${BUCKET}?cors" \
  --aws-sigv4 "aws:amz:us-east-1:s3" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '[{"allowed_origins":["*"],"allowed_methods":["GET","PUT","POST","DELETE","HEAD"],"allowed_headers":["*"],"expose_headers":["ETag","x-amz-request-id"],"max_age_seconds":3600}]'

Who It's For

  • Developers — test S3 integrations without cloud dependencies, work offline
  • Home users — expose NAS storage to S3-compatible backup tools, find duplicates with a single query
  • Archivists — verify file integrity with content hashes, detect bit rot
  • Privacy-conscious users — keep files local, no account required, no telemetry
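The integrity-checking pattern mentioned above — hash everything once, re-hash later, and compare — can be sketched with the standard library. This is an illustrative sketch of the technique, not Shoebox's internal mechanism:

```python
import hashlib
import os

def build_manifest(root):
    """Record the SHA-256 of every file under root, keyed by relative path."""
    manifest = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                manifest[os.path.relpath(path, root)] = hashlib.sha256(f.read()).hexdigest()
    return manifest

def verify(root, manifest):
    """Return relative paths whose current hash no longer matches the manifest."""
    current = build_manifest(root)
    return sorted(p for p, h in manifest.items() if current.get(p) != h)
```

Run `build_manifest` once, store the result, and any later `verify` call surfaces files whose bytes have silently changed.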

Comparison

| Concern | Cloud S3 | MinIO | SeaweedFS | Garage | Shoebox |
|---|---|---|---|---|---|
| Primary strength | Scalability, AWS ecosystem | High performance, enterprise | Small files, high throughput | Simplicity, geo-replication | Existing files, zero config |
| Best for | Production workloads | AI/ML, large data (TB/PB) | Data lakes, file storage | Edge/distributed, low ops | Local dev, NAS, home lab |
| Architecture | Managed service | Specialized nodes | Master/volume servers | Homogeneous nodes | Single process |
| Setup | Account + IAM | Docker + config | Docker + config | Docker + config | Single command |
| Data location | Cloud | MinIO data dir | SeaweedFS volumes | Garage data dir | Your existing files |
| File visibility | S3 only | S3 only | S3, FUSE, WebDAV | S3 only | Filesystem + S3 |
| Offline use | No | Yes | Yes | Yes | Yes |
| Binary size | N/A | ~100MB | ~40MB | ~25MB | ~18MB |
| Duplicate detection | No | No | No | No | Built-in |
| Integrity checks | Yes (default checksums) | Yes (bitrot healing) | Limited (CRC) | Yes (scrub) | Built-in (scheduled) |
| Max recommended scale | Unlimited | Petabytes | Petabytes | Petabytes | ~10TB |

See docs/why-shoebox.md for the full story.

When Not to Use Shoebox

See docs/when-not-to-use-shoebox.md for an honest assessment of limitations, including:

  • Strong consistency requirements
  • Distributed / multi-node storage
  • >10TB of data
  • Enterprise S3 features (object lock, lifecycle policies, versioning)
  • High-throughput ingestion (thousands of files/second)

Documentation

Guides live in the docs/ directory, including docs/why-shoebox.md and docs/when-not-to-use-shoebox.md.

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Security

See SECURITY.md for the security model and how to report vulnerabilities.

License

MIT

Background

I had 2TB of photos across 3 drives — backups of backups, originals I was afraid to delete. I set out to find duplicate photos and accidentally designed a local S3 server. If an object store knows the content hash of every file, duplicates are just a query. This is a personal project built in public — expect breaking changes before 1.0. If you have thoughts on the approach, open an issue or start a discussion.