s3sync
Overview
s3sync is a reliable, very fast, and powerful synchronization tool for S3.
This document is a summary of s3sync. For more detailed information, please refer to the full README.
Who is this for?
s3sync is designed for users who need to synchronize data with S3 or S3-compatible object storage.
This tool is specifically tailored for those who require reliable synchronization and need evidence of data integrity.
As a Rust library
s3sync can be used as a Rust library.
The s3sync CLI is a very thin wrapper around the s3sync library, so all features of the CLI are available in the library.
See docs.rs for more information.
Feature highlights
Reliable: In-depth end-to-end object integrity check
s3sync calculates the ETag (MD5 or equivalent) for each object and compares it with the ETag in the target.
An object that exists on the local disk is read from the disk, and its checksum is compared with the checksum in the source or target.
Optionally, s3sync can also calculate and compare an additional checksum (SHA256/SHA1/CRC32/CRC32C/CRC64NVME) for each object.
s3sync always shows the integrity check result, so you can verify that the synchronization was successful.
For example, a summary such as "transferred 100 objects | 100 objects/sec, etag verified 100 objects, checksum verified 100 objects" means that all objects have been transferred and that the ETag (MD5 or equivalent) and the additional checksum (SHA256 in this case) have been verified successfully.
If you want detailed evidence of the integrity check, you can use the Sync statistics report feature (see below).
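The additional-checksum comparison described above can be illustrated with a minimal sketch. It uses CRC32, one of the listed algorithms, implemented bit by bit with the standard library only. The function names `crc32` and `verify_checksum` are hypothetical and chosen for illustration; this is not s3sync's implementation, which may compute checksums differently.

```rust
/// Bitwise CRC32 (reflected polynomial 0xEDB88320, as used by zlib and PNG).
/// Illustrative only; a table-driven or hardware version would be faster.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical helper: recompute the checksum of the bytes that were read
/// and compare it with the checksum recorded on the other side.
fn verify_checksum(data: &[u8], expected: u32) -> bool {
    crc32(data) == expected
}
```

Calling `verify_checksum(bytes, recorded)` returns `true` only when the recomputed value equals the recorded one, which is the kind of evidence the "checksum verified" count summarizes.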
Sync statistics report
s3sync can check and report the synchronization status at any time.
For example, if you want to know whether all the objects transferred by the AWS CLI (or, of course, by s3sync) have been transferred correctly (checksum based), one s3sync command can show the report; the report itself is the last two lines of that command's output.
You can also check the synchronization status of each object's metadata and tagging with the --report-metadata-sync-status and --report-tagging-sync-status options.
Easy to use
s3sync is designed to be easy to use.
s3sync has many options, but the default settings are reasonable for most cases of reliable synchronization.
For example, in an IAM role environment, one command is enough to synchronize a local directory with an S3 bucket.
If something goes wrong, s3sync will show a warning or error message, so you can understand what went wrong.
Very fast
s3sync is implemented in Rust and uses the AWS SDK for Rust, which provides multithreaded asynchronous I/O.
In my environment (c7a.large, with 256 workers), Local to S3 transfers reached about 3,900 objects/sec (small objects of 10 KiB).
Multiple ways
- Local to S3 (S3-compatible storage)
- S3 (S3-compatible storage) to Local
- S3 to S3 (cross-region, same-region, same-account, cross-account, from/to S3 or S3-compatible storage)
Multiple platforms support
Linux (x86_64, aarch64), macOS, and Windows (x86_64, aarch64) are fully tested and supported.
s3sync is a single binary with no dependencies, so it can easily be run on the above platforms.
Incremental transfer
There are many ways to transfer objects:
- Modified time based (default)
- ETag (MD5 or equivalent) based
- Additional checksum (SHA256/SHA1/CRC32/CRC32C/CRC64NVME) based
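The three comparison strategies listed above can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not s3sync's code: the `ObjectInfo` and `DiffMode` types and the `needs_transfer` function are hypothetical, and the exact rules (for example, "transfer when the source is newer" and "transfer when a checksum is missing") are assumptions made for the example.

```rust
use std::time::SystemTime;

/// Hypothetical summary of one object; not an s3sync type.
struct ObjectInfo {
    modified: SystemTime,
    etag: String,
    checksum: Option<String>,
}

/// The three comparison strategies from the list above.
enum DiffMode {
    ModifiedTime,
    ETag,
    AdditionalChecksum,
}

/// Decide whether the source object should be transferred.
/// `target` is `None` when the object does not exist in the target.
fn needs_transfer(mode: &DiffMode, source: &ObjectInfo, target: Option<&ObjectInfo>) -> bool {
    let target = match target {
        Some(t) => t,
        None => return true, // nothing in the target yet
    };
    match mode {
        // Assumption: transfer only when the source is strictly newer.
        DiffMode::ModifiedTime => source.modified > target.modified,
        DiffMode::ETag => source.etag != target.etag,
        // Assumption: a missing checksum on either side forces a transfer.
        DiffMode::AdditionalChecksum => match (&source.checksum, &target.checksum) {
            (Some(s), Some(t)) => s != t,
            _ => true,
        },
    }
}
```

Modified time is the cheapest comparison because it needs no object data, which is one plausible reason for it to be the default; the ETag and checksum modes trade extra work for stronger guarantees.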
Flexible filtering
- key, ContentType, user-defined metadata, tagging, by regular expression.
- size, modified time
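As a rough illustration of the size and modified-time filters, the sketch below combines optional bounds into one predicate. The `Filter` struct, its field names, and the `passes` function are hypothetical and are not s3sync options or types; regular-expression filters are left out because they would need a crate outside the standard library.

```rust
use std::time::SystemTime;

/// Hypothetical filter settings; each bound is optional.
struct Filter {
    min_size: Option<u64>,
    max_size: Option<u64>,
    modified_after: Option<SystemTime>,
}

/// An object passes only if it satisfies every bound that is set.
fn passes(filter: &Filter, size: u64, modified: SystemTime) -> bool {
    filter.min_size.map_or(true, |min| size >= min)
        && filter.max_size.map_or(true, |max| size <= max)
        && filter.modified_after.map_or(true, |t| modified > t)
}
```

An unset bound never rejects an object, so a `Filter` with all fields set to `None` lets everything through.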
Versioning support
All versions of an object can be synchronized (except intermediate delete markers).
Point-in-time snapshot
With a versioning-enabled S3 bucket, you can transfer objects as they existed at a specific point in time.
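The idea behind a point-in-time snapshot can be sketched as follows: for each key, pick the newest version that is not later than the requested time, and treat the object as absent if that version is a delete marker. The `Version` struct and the `version_at` function are hypothetical and only illustrate the selection rule; they are not s3sync's implementation.

```rust
/// Hypothetical record for one entry in an object's version history.
struct Version {
    version_id: &'static str,
    last_modified: u64, // seconds since the Unix epoch
    is_delete_marker: bool,
}

/// Return the version that was current at `point_in_time`, or `None` if the
/// object did not exist yet or had been deleted at that moment.
fn version_at(history: &[Version], point_in_time: u64) -> Option<&Version> {
    history
        .iter()
        .filter(|v| v.last_modified <= point_in_time)
        .max_by_key(|v| v.last_modified)
        // A delete marker as the current version means "no object".
        .filter(|v| !v.is_delete_marker)
}
```

Given a history of a regular version at time 100, a delete marker at 200, and a regular version at 300, a snapshot at 150 selects the first version, while a snapshot at 250 finds no object.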
Amazon S3 Express One Zone (Directory bucket) support
s3sync can be used with Amazon S3 Express One Zone.
More information
For more information, please refer to the full README.