s3find
A powerful command-line utility to walk an Amazon S3 hierarchy. Think of it as the `find` command, but designed specifically for Amazon S3.
Installation
Pre-built Binaries
The GitHub Releases page provides ready-to-use binaries for:
- Windows (x86_64)
- Linux (x86_64 and ARM)
- macOS (x86_64 and ARM)
Binaries for both architectures allow you to run s3find natively on Intel-based and ARM-based machines (like Apple M1/M2/M3/M4, AWS Graviton, and Raspberry Pi).
Build from Source
Requirements: Rust and Cargo
```sh
# Build
cargo build --release

# Install from local source
cargo install --path .

# Install latest from git
cargo install --git https://github.com/AnderEnder/s3find-rs

# Install from crates.io
cargo install s3find
```
Usage
Basic Syntax

The general form is `s3find <s3-path> [OPTIONS] [COMMAND]`, where:

- `<s3-path>` is formatted as `s3://bucket/path`
- `[OPTIONS]` are filters and controls
- `[COMMAND]` is the action to perform on matched objects
Authentication Methods
s3find supports multiple AWS authentication methods in the following priority:
1. Command-line credentials (`--aws-access-key` and `--aws-secret-key`)
2. Environment variables (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`)
3. AWS profile credentials file (configurable via `AWS_PROFILE` and `AWS_SHARED_CREDENTIALS_FILE`)
4. AWS instance IAM profile
5. AWS container IAM profile
Using with Non-AWS S3-Compatible Services
s3find supports self-hosted and third-party S3-compatible services such as MinIO, Ceph, and others. Use the following options to connect to these services:
- `--endpoint-url <URL>` - Custom S3 endpoint URL
- `--force-path-style` - Use path-style bucket addressing (required for most non-AWS S3 services)
MinIO Example

Connect to a local MinIO deployment (endpoint and bucket are placeholders): `s3find --endpoint-url http://localhost:9000 --force-path-style 's3://my-bucket/' --name '*' print`
Ceph Example

Connect to a Ceph RADOS Gateway (endpoint and bucket are placeholders): `s3find --endpoint-url http://ceph-gateway.example.com --force-path-style 's3://my-bucket/' --name '*' print`
Command Line Reference
Run `s3find --help` for the full list of options and commands.
Examples
Finding Files by Glob Pattern
Use the --name option with glob patterns to match objects:
```sh
# Find all objects in a path
s3find 's3://example-bucket/path' --name '*' print

# Find objects with specific extension
s3find 's3://example-bucket/path' --name '*.png' print
```
Output Formats
The print command supports different output formats:
```sh
# Default format
s3find 's3://example-bucket/path' --name '*' print

# Text format (the --format option name is an assumption; see `s3find print --help`)
s3find 's3://example-bucket/path' --name '*' print --format text

# JSON format
s3find 's3://example-bucket/path' --name '*' print --format json

# CSV format
s3find 's3://example-bucket/path' --name '*' print --format csv
```
Case Insensitive Search
Use `--iname` for case-insensitive glob pattern matching, e.g. `s3find 's3://example-bucket/path' --iname '*.png' print`.
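The difference between `--name` and `--iname` is plain shell-style glob matching versus the same match on case-folded input; a tiny illustrative sketch in shell (not s3find itself):

```shell
# Shell-style glob match, as used by --name (case-sensitive)
match() { case "$1" in $2) echo match;; *) echo no-match;; esac; }

match 'photo.PNG' '*.png'   # prints: no-match (case-sensitive, like --name)
# Case-folding both sides approximates --iname
match "$(printf '%s' 'photo.PNG' | tr 'A-Z' 'a-z')" '*.png'   # prints: match
```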
Regex Pattern Matching
Use --regex for regular expression pattern matching:
```sh
# Find objects ending with a number
s3find 's3://example-bucket/path' --regex '[0-9]+$' print
```
Find Path by Size
```sh
# Exact match - files exactly 0 bytes
s3find 's3://example-bucket/path' --bytes-size 0 print

# Larger than 10 megabytes
s3find 's3://example-bucket/path' --bytes-size +10M print

# Smaller than 10 kilobytes
s3find 's3://example-bucket/path' --bytes-size -10k print
```
Find Path by Time
```sh
# Files modified in the last 10 seconds (unit-suffix syntax assumed; see `s3find --help`)
s3find 's3://example-bucket/path' --mtime -10s print

# Files modified more than 10 minutes ago
s3find 's3://example-bucket/path' --mtime +10m print

# Files modified in the last 10 hours
s3find 's3://example-bucket/path' --mtime -10h print
```
Object Storage Class Filter
Filter objects by their storage class:
```sh
# Find objects in STANDARD storage class
s3find 's3://example-bucket/path' --storage-class STANDARD print

# Find objects in GLACIER storage class
s3find 's3://example-bucket/path' --storage-class GLACIER print
```
Filter by Tags
Filter objects based on their S3 tags using --tag for key-value matching or --tag-exists for key presence:
```sh
# Find objects with a specific tag key and value (key:value syntax assumed; see `s3find --help`)
s3find 's3://example-bucket/path' --tag 'environment:production' print

# Find objects with multiple tags (AND logic - all must match)
s3find 's3://example-bucket/path' --tag 'environment:production' --tag 'team:data' print

# Find objects that have a specific tag key (any value)
s3find 's3://example-bucket/path' --tag-exists 'environment' print

# Combine tag filters with other filters
s3find 's3://example-bucket/path' --name '*.log' --tag 'environment:production' print

# Control concurrency for tag fetching (default: 50)
s3find 's3://example-bucket/path' --tag 'environment:production' --tag-concurrency 100 print
```
Required IAM Permissions:
Tag filtering requires the s3:GetObjectTagging permission on the objects being searched.
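As an illustration, a minimal IAM policy statement granting that permission might look like the following (the bucket name and Sid are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3findTagFiltering",
      "Effect": "Allow",
      "Action": "s3:GetObjectTagging",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

Listing the objects in the first place additionally requires `s3:ListBucket` on the bucket itself.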
Multiple Filters
Combine filters to create more specific queries:
```sh
# Files between 10 and 20 bytes
s3find 's3://example-bucket/path' --bytes-size +10 --bytes-size -20 print

# Combine different filter types
s3find 's3://example-bucket/path' --name '*.log' --mtime -10h --bytes-size +1M print
```
Actions and Operations
Delete Objects

Use the `delete` command to remove matched objects (this is irreversible), e.g. `s3find 's3://example-bucket/path' --name '*.tmp' delete`.
List Objects and Tags
```sh
# List objects
s3find 's3://example-bucket/path' --name '*' ls

# List objects with their tags
s3find 's3://example-bucket/path' --name '*' lstags
```
Execute Commands on Objects

The `exec` command runs a local utility for each matched object, e.g. `s3find 's3://example-bucket/path' --name '*' exec 'echo {}'`, where `{}` is substituted with the object key.
Download Objects

The `download` command saves matched objects to the local filesystem, e.g. `s3find 's3://example-bucket/path' --name '*.png' download`.
Copy and Move Operations
```sh
# Copy files to another location (destination syntax assumed; see `s3find copy --help`)
s3find 's3://example-bucket/path' --name '*.png' copy 's3://example-bucket/backup/'

# Move files to another location
s3find 's3://example-bucket/path' --name '*.png' move 's3://example-bucket/archive/'
```
Tag Management

The `tags` command sets tags on matched objects, e.g. `s3find 's3://example-bucket/path' --name '*.log' tags 'environment:production'` (exact tag syntax may differ by version; see `s3find tags --help`).
Make Objects Public

The `public` command makes matched objects publicly readable, e.g. `s3find 's3://example-bucket/path' --name '*.html' public`.
Additional Control
Control the number of results and request behavior:
```sh
# Limit to first 10 matching objects
s3find 's3://example-bucket/path' --name '*' --limit 10 print

# Control page size for S3 API requests
s3find 's3://example-bucket/path' --name '*' --page-size 100 print
```
Depth Control
Limit how deep s3find descends into the object hierarchy:
```sh
# Only objects at the bucket root level (no subdirectories)
s3find 's3://example-bucket/' --maxdepth 0 print

# Objects up to one subdirectory level deep
s3find 's3://example-bucket/' --maxdepth 1 print

# Objects up to two levels deep
s3find 's3://example-bucket/' --maxdepth 2 print
```
The --maxdepth option uses S3's delimiter-based traversal for efficient server-side filtering, avoiding the need to fetch objects beyond the specified depth.
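Conceptually, an object's depth is the number of `/` separators in its key below the search prefix; a small shell sketch of that counting rule (illustrative only, not s3find code):

```shell
# Count directory levels below the prefix by counting '/' separators in the key
depth() { printf '%s' "$1" | awk -F/ '{print NF-1}'; }

depth 'file.txt'            # prints: 0 (a root-level object)
depth 'logs/2024/file.txt'  # prints: 2 (two directory levels deep)
```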
Object Versioning
List all versions of objects in versioned buckets:
```sh
# List all versions of all objects
s3find 's3://example-bucket/path' --all-versions --name '*' print

# List all versions matching a pattern
s3find 's3://example-bucket/path' --all-versions --name '*.log' print

# Print all versions in JSON format (the --format option name is an assumption)
s3find 's3://example-bucket/path' --all-versions --name '*' print --format json
```
When --all-versions is enabled:
- Uses the S3 ListObjectVersions API instead of ListObjectsV2
- Shows all versions of each object, not just the current version
- Includes delete markers (shown with size 0)
- Each entry displays the version ID alongside the key (e.g., `file.txt?versionId=abc123`); the underlying S3 key remains unchanged
- The latest version is marked with `(latest)`
- Delete markers are marked with `(delete marker)`
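Since `key?versionId=...` is only a display form, it can be split back into key and version ID with plain parameter expansion (illustrative sketch):

```shell
entry='file.txt?versionId=abc123'   # example display form
key="${entry%%\?versionId=*}"       # drop the version suffix
version="${entry##*\?versionId=}"   # keep only the version ID
echo "$key"       # prints: file.txt
echo "$version"   # prints: abc123
```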
Version-aware operations: When using --all-versions, operations work on specific versions:
- `delete` - Deletes specific object versions (not just the current version)
- `copy`/`move` - Copies or moves specific versions to the destination
- `download` - Downloads specific versions of objects
Note: Delete markers are automatically skipped for operations that don't support them (copy, move, download, tags, restore, change-storage, public) since they have no content.
Note: --all-versions is not compatible with --maxdepth. If both are specified, --all-versions takes precedence and --maxdepth is ignored.
Tag Filtering Performance
Tag filtering requires an individual GetObjectTagging API call for each object that passes the other filters. To optimize performance and cost:
- Apply other filters first: Use `--name`, `--mtime`, `--bytes-size`, or `--storage-class` filters to reduce the number of objects before tag filtering is applied.
- Use `--limit`: When testing or exploring, use `--limit` to cap the number of objects processed.
- Adjust concurrency: Use `--tag-concurrency` to tune parallel API calls (default: 50). Higher values increase throughput but may cause throttling.
- Cost awareness: `GetObjectTagging` requests incur S3 API charges that vary by region. For large buckets with millions of objects, apply filters to reduce the number of tag fetch operations. Refer to the AWS S3 pricing documentation for current rates.
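The effect of `--tag-concurrency` is bounded fan-out over per-object tag requests, loosely analogous to `xargs -P`; a rough illustrative sketch (this is not how s3find is invoked):

```shell
# Run at most 2 simulated "tag fetches" at a time over a list of keys
out=$(printf '%s\n' key1 key2 key3 | xargs -P 2 -I{} echo "GetObjectTagging for {}")
printf '%s\n' "$out"
```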
Example: Optimized tag filtering
```sh
# Bad: Fetches tags for ALL objects (expensive for large buckets)
s3find 's3://example-bucket/' --tag 'environment:production' print

# Good: Apply cheap filters first, then tag filter
s3find 's3://example-bucket/' --name '*.log' --mtime -24h --tag 'environment:production' print

# Good: Use limit when exploring
s3find 's3://example-bucket/' --tag 'environment:production' --limit 100 print
```
When --summarize is enabled, tag fetch statistics are displayed showing success/failure counts.
For more information, see the GitHub repository.