gcs-rsync
A lightweight and efficient rsync for Google Cloud Storage, written in Rust.
gcs-rsync is faster than gsutil rsync according to the benchmarks below.
It has no hard limit of 32K objects and needs no specific configuration to compute state.
How to install as crate
Cargo.toml
How to install as cli tool
How to run with docker
Mirror local folder to gcs
Mirror gcs to folder
Include or Exclude files using glob pattern
CLI gcs-rsync
The -i (include glob pattern) and -x (exclude glob pattern) options can each be passed multiple times.
Example: include every json or toml file recursively, except any test.json or test.toml.
Library
The with_includes and with_excludes client builders set the include and exclude glob patterns.
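To make the include/exclude behaviour concrete, here is a minimal, std-only sketch of how such a filter could work. It is not the gcs-rsync implementation: `glob_match` and `is_selected` are hypothetical helpers, the matcher only supports `*`, `?` and `**`, and the rule "kept if it matches an include (or no include is given) and matches no exclude" is an assumption about the semantics.

```rust
/// Matches one path segment against one pattern segment.
/// `*` matches any run of bytes, `?` matches exactly one byte.
fn match_segment(pattern: &[u8], name: &[u8]) -> bool {
    match pattern.split_first() {
        None => name.is_empty(),
        // `*` may consume zero or more bytes of the segment.
        Some((&b'*', rest)) => (0..=name.len()).any(|skip| match_segment(rest, &name[skip..])),
        Some((&c, rest)) => match name.split_first() {
            Some((&n, tail)) => (c == b'?' || c == n) && match_segment(rest, tail),
            None => false,
        },
    }
}

/// Matches a list of path segments against a list of pattern segments.
/// A `**` pattern segment consumes zero or more whole path segments.
fn match_segments(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|skip| match_segments(rest, &path[skip..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((name, tail)) => {
                match_segment(seg.as_bytes(), name.as_bytes()) && match_segments(rest, tail)
            }
            None => false,
        },
    }
}

/// Hypothetical helper: matches a `/`-separated path against a glob pattern.
fn glob_match(pattern: &str, path: &str) -> bool {
    let pattern: Vec<&str> = pattern.split('/').collect();
    let path: Vec<&str> = path.split('/').collect();
    match_segments(&pattern, &path)
}

/// Assumed semantics: a path is kept when it matches at least one include
/// (or no include is given) and matches no exclude.
fn is_selected(path: &str, includes: &[&str], excludes: &[&str]) -> bool {
    let included = includes.is_empty() || includes.iter().any(|g| glob_match(g, path));
    let excluded = excludes.iter().any(|g| glob_match(g, path));
    included && !excluded
}
```

With the assumed patterns `**/*.json` and `**/*.toml` as includes and `**/test.json` and `**/test.toml` as excludes, `conf/app.json` is kept while `a/b/test.json` and `src/main.rs` are skipped.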
Benchmark
Important note about gsutil: by default, the gsutil ls command does not list every object but lists prefixes instead, and adding the -r flag slows gsutil down. The performance of the ls command is therefore very different from that of the rsync implementation.
new files only (first time sync)
- gcs-rsync: 2.2s/7MB
- gsutil: 9.93s/47MB
winner: gcs-rsync
gcs-rsync sync bench
real 2.20
user 0.13
sys 0.21
7606272 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
1915 page reclaims
0 page faults
0 swaps
0 block input operations
0 block output operations
394 messages sent
1255 messages received
0 signals received
54 voluntary context switches
5814 involuntary context switches
636241324 instructions retired
989595729 cycles elapsed
3895296 peak memory footprint
gsutil sync bench
Operation completed over 215 objects/50.3 KiB.
real 9.93
user 8.12
sys 2.35
47108096 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
196391 page reclaims
1 page faults
0 swaps
0 block input operations
0 block output operations
36089 messages sent
87309 messages received
5 signals received
38401 voluntary context switches
51924 involuntary context switches
12986389 instructions retired
12032672 cycles elapsed
593920 peak memory footprint
no change (second time sync)
- gcs-rsync: 0.78s/8MB
- gsutil: 2.18s/47MB
winner: gcs-rsync (it checks size and mtime before falling back to crc32c, as gsutil does)
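The "size and mtime before crc32c" idea can be sketched as follows. This is an illustration only, not the gcs-rsync code: `Meta` and `needs_copy` are hypothetical names, and the exact rule (copy when sizes differ, skip when size and mtime both match, otherwise compare crc32c) is an assumption. The checksum is passed as a closure so that it is only computed when the cheap checks are inconclusive.

```rust
/// Metadata available without reading object content (hypothetical type).
struct Meta {
    size: u64,
    /// Modification time in seconds since the Unix epoch, if recorded.
    mtime: Option<i64>,
}

/// Returns true when the destination should be overwritten.
/// `crc32c_pair` lazily yields (source crc32c, destination crc32c) and is
/// only called when size and mtime cannot settle the question.
fn needs_copy(src: &Meta, dst: &Meta, crc32c_pair: impl FnOnce() -> (u32, u32)) -> bool {
    // Different sizes: the content must differ, no checksum needed.
    if src.size != dst.size {
        return true;
    }
    // Same size and same known mtime: assume unchanged, no checksum needed.
    if let (Some(a), Some(b)) = (src.mtime, dst.mtime) {
        if a == b {
            return false;
        }
    }
    // Inconclusive (mtime missing or different): fall back to crc32c.
    let (src_crc, dst_crc) = crc32c_pair();
    src_crc != dst_crc
}
```

On a second sync where nothing changed, every entry returns early from the size and mtime checks, so no checksum has to be computed.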
gcs-rsync sync bench
real 1.79
user 0.13
sys 0.12
7864320 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
1980 page reclaims
0 page faults
0 swaps
0 block input operations
0 block output operations
397 messages sent
1247 messages received
0 signals received
42 voluntary context switches
4948 involuntary context switches
435013936 instructions retired
704782682 cycles elapsed
4141056 peak memory footprint
gsutil sync bench
real 2.18
user 1.37
sys 0.66
46899200 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
100108 page reclaims
1732 page faults
0 swaps
0 block input operations
0 block output operations
6311 messages sent
12752 messages received
4 signals received
6145 voluntary context switches
14219 involuntary context switches
13133297 instructions retired
13313536 cycles elapsed
602112 peak memory footprint
gsutil rsync config
About authentication
By default, all authentication functions read the GOOGLE_APPLICATION_CREDENTIALS environment variable, as the official Google libraries for other languages (Go, .NET) do.
The other functions (from and from_file) provide the custom integration mode.
For more information about OAuth2, see the related README in the oauth2 module.
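The default-versus-custom lookup described above can be sketched with the standard library alone. This is not the crate's API: `resolve_credentials` and `resolve_from_process_env` are hypothetical helpers, and the precedence (an explicit path wins over the environment variable) is an assumption. The environment lookup is injected as a closure so the logic can be exercised without touching the process environment.

```rust
use std::env;
use std::path::PathBuf;

/// Name of the environment variable read by default.
const CREDENTIALS_ENV: &str = "GOOGLE_APPLICATION_CREDENTIALS";

/// Hypothetical helper: picks the service account file to use.
/// An explicit path (custom integration) takes precedence; otherwise the
/// value of GOOGLE_APPLICATION_CREDENTIALS is looked up through `lookup`.
fn resolve_credentials(
    explicit: Option<&str>,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<PathBuf> {
    explicit
        .map(PathBuf::from)
        .or_else(|| lookup(CREDENTIALS_ENV).map(PathBuf::from))
}

/// Same resolution, reading the real process environment.
fn resolve_from_process_env(explicit: Option<&str>) -> Option<PathBuf> {
    resolve_credentials(explicit, |name| env::var(name).ok())
}
```

Calling `resolve_from_process_env(None)` returns the path held in GOOGLE_APPLICATION_CREDENTIALS when it is set, and `None` otherwise.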
How to run tests
Unit tests
Integration tests + Unit tests
TEST_SERVICE_ACCOUNT=<PathToAServiceAccount> TEST_BUCKET=<BUCKET> TEST_PREFIX=<PREFIX> cargo
Examples
Upload object
Download object
Delete object
List objects
List objects with default service account
GOOGLE_APPLICATION_CREDENTIALS=<PathToJson> cargo
List objects
List a bucket containing more than 60K objects
Profiling
Humans are terrible at guessing-about-performance
Native bin build (static shared lib)
LDFLAGS="-static -L/usr/local/musl/lib" LD_LIBRARY_PATH=/usr/local/musl/lib: CFLAGS="-I/usr/local/musl/include" PKG_CONFIG_PATH=/usr/local/musl/lib/pkgconfig