Module opendal::docs::rfcs::rfc_0423_command_line_interface
- Proposal Name: command_line_interface
- Start Date: 2022-07-08
- RFC PR: datafuselabs/opendal#423
- Tracking Issue: datafuselabs/opendal#422
Summary
Add command line interface for OpenDAL.
Motivation
Q: There are so many clis out there, why do we still need a cli for OpenDAL?
A: Because there are so many clis out there.
To manipulate our data stored in different cloud services, we need to install a different cli for each of them.
Those clis provide native and seamless experiences for their own products but also lock us and our data.
However, in 80% of cases, we just want to do simple jobs like cp, mv and rm. It’s boring to figure out how to do them with each cli:
aws --endpoint-url http://127.0.0.1:9900/ s3 cp data s3://testbucket/data --recursive
azcopy copy 'C:\myDirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive
gsutil cp data gs://testbucket/
Can we use them in the same way? Can we let the data flow freely?
Let’s look back at OpenDAL’s slogan:
Open Data Access Layer that connect the whole world together
This is a natural extension for OpenDAL: providing a command line interface!
Guide-level explanation
OpenDAL will provide a new cli called oli. It’s short for OpenDAL Command Line Interface.
Users can install this cli via:
cargo install oli
Or using their favourite package manager:
# Archlinux
pacman -S oli
# Debian / Ubuntu
apt install oli
# Rocky Linux / Fedora
dnf install oli
# macOS
brew install oli
With oli, users can:
- Upload files to s3:
oli cp books.csv s3://bucket/books.csv
- Download files from azblob:
oli cp azblob://bucket/books.csv /tmp/books.csv
- Move data between storage services:
oli mv s3://bucket/dir azblob://bucket/dir
- Delete all files:
oli rm -rf s3://bucket
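The commands above mix plain local paths with scheme://path locations. A minimal sketch of how such an argument could be told apart is shown here. The Location type and the parse_location function are hypothetical names introduced for illustration; the RFC does not define how oli parses its arguments, and the sketch deliberately does not decide how a bucket is separated from the rest of the path.

```rust
/// A parsed command-line location (hypothetical type, not part of the RFC).
#[derive(Debug, PartialEq)]
enum Location {
    /// A path on the local file system, e.g. `books.csv`.
    Local(String),
    /// Everything before `://` is the scheme, everything after it is the path.
    Remote { scheme: String, path: String },
}

/// Split `scheme://rest` into its two parts; an argument without `://`
/// (or with an empty scheme) is treated as a local path.
fn parse_location(arg: &str) -> Location {
    match arg.split_once("://") {
        Some((scheme, path)) if !scheme.is_empty() => Location::Remote {
            scheme: scheme.to_string(),
            path: path.to_string(),
        },
        _ => Location::Local(arg.to_string()),
    }
}

fn main() {
    for arg in ["books.csv", "s3://bucket/books.csv", "azblob://bucket/dir"] {
        println!("{} -> {:?}", arg, parse_location(arg));
    }
}
```

Keeping the scheme as an opaque string means the same parser would also accept a profile name in that position.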
oli also provides aliases to make manipulating cloud data feel even more natural:
- ocp for oli cp
- ols for oli ls
- omv for oli mv
- orm for oli rm
- ostat for oli stat
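The RFC does not say how these aliases are implemented. One common approach, assumed here purely for illustration, is a single binary that inspects the name it was invoked under (for example through symlinks) and rewrites its arguments. The function name subcommand_for_alias is hypothetical.

```rust
use std::env;
use std::path::Path;

/// Map the name a binary was invoked under to an `oli` subcommand.
/// Returns `None` for any name that is not one of the aliases listed above.
fn subcommand_for_alias(program: &str) -> Option<&'static str> {
    // Keep only the file stem so `/usr/bin/ocp`, `ocp` and `ocp.exe` behave the same.
    let name = Path::new(program).file_stem()?.to_str()?;
    match name {
        "ocp" => Some("cp"),
        "ols" => Some("ls"),
        "omv" => Some("mv"),
        "orm" => Some("rm"),
        "ostat" => Some("stat"),
        _ => None,
    }
}

fn main() {
    let mut args: Vec<String> = env::args().collect();
    let program = args.first().cloned().unwrap_or_default();
    if let Some(sub) = subcommand_for_alias(&program) {
        // Rewrite `ocp a b` into `oli cp a b` before normal argument parsing.
        args[0] = "oli".to_string();
        args.insert(1, sub.to_string());
    }
    println!("{:?}", args);
}
```

With this layout, adding an alias only needs a new match arm and a new link to the same executable.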
oli will provide profile management so users don’t need to provide credentials every time:
oli profile add my_s3 --bucket test --access-key-id=example --secret-access-key=example
ocp my_s3://dir /tmp/dir
Reference-level explanation
oli will be a separate crate apart from opendal so we will not pollute the dependencies of opendal. But oli will be released at the same time and with the same version as opendal. That means oli will always use the same (latest) version of opendal.
Most operations of oli should be trivial; we will propose new RFCs if big changes are required.
oli won’t keep configuration. All config will go through the environment, for example:
OLI_COLOR=always
OLI_CONCURRENCY=16
Besides, oli will read profiles from env like cargo:
OLI_PROFILE_TEST_TYPE=s3
OLI_PROFILE_TEST_ENDPOINT=http://127.0.0.1:1090
OLI_PROFILE_TEST_BUCKET=test_bucket
OLI_PROFILE_TEST_ACCESS_KEY_ID=access_key_id
OLI_PROFILE_TEST_SECRET_ACCESS_KEY=secret_access_key
With those environment variables set, we can:
ocp path/to/dir test://test/to/dir
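Grouping such variables into profiles could look like the sketch below. It is only an illustration under stated assumptions: the function name collect_profiles is hypothetical, the prefix is passed in as a parameter (the value used in main is an assumption), and the profile name is taken to be the text up to the first underscore after the prefix, so profile names that themselves contain an underscore are not handled.

```rust
use std::collections::BTreeMap;

/// Profile name -> (setting name -> value); both names are lower-cased.
type Profiles = BTreeMap<String, BTreeMap<String, String>>;

/// Collect `<prefix><NAME>_<SETTING>=value` pairs into per-profile maps.
/// Variables that do not start with `prefix` are ignored.
fn collect_profiles<I>(vars: I, prefix: &str) -> Profiles
where
    I: IntoIterator<Item = (String, String)>,
{
    let mut profiles = Profiles::new();
    for (key, value) in vars {
        // Assumption: NAME ends at the first underscore, so the remainder
        // (e.g. ACCESS_KEY_ID) may itself contain underscores.
        let parts = key.strip_prefix(prefix).and_then(|rest| rest.split_once('_'));
        if let Some((name, setting)) = parts {
            if !name.is_empty() && !setting.is_empty() {
                profiles
                    .entry(name.to_lowercase())
                    .or_default()
                    .insert(setting.to_lowercase(), value);
            }
        }
    }
    profiles
}

fn main() {
    // Assumed prefix for illustration; adjust it to whatever the tool reads.
    let prefix = "OLI_PROFILE_";
    // std::env::vars() panics on non-Unicode entries, acceptable for a sketch.
    let profiles = collect_profiles(std::env::vars(), prefix);
    for (name, settings) in &profiles {
        println!("{}: {:?}", name, settings);
    }
}
```

Taking the variables as an iterator rather than reading the environment inside the function keeps the grouping logic easy to test with fixed input.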
Drawbacks
None
Rationale and alternatives
s3cmd
s3cmd is a command line s3 client for Linux and Mac.
Usage: s3cmd [options] COMMAND [parameters]
S3cmd is a tool for managing objects in Amazon S3 storage. It allows for
making and removing "buckets" and uploading, downloading and removing
"objects" from these buckets.
Commands:
Make bucket
s3cmd mb s3://BUCKET
Remove bucket
s3cmd rb s3://BUCKET
List objects or buckets
s3cmd ls [s3://BUCKET[/PREFIX]]
List all object in all buckets
s3cmd la
Put file into bucket
s3cmd put FILE [FILE...] s3://BUCKET[/PREFIX]
Get file from bucket
s3cmd get s3://BUCKET/OBJECT LOCAL_FILE
Delete file from bucket
s3cmd del s3://BUCKET/OBJECT
Delete file from bucket (alias for del)
s3cmd rm s3://BUCKET/OBJECT
Restore file from Glacier storage
s3cmd restore s3://BUCKET/OBJECT
Synchronize a directory tree to S3 (checks files freshness using
size and md5 checksum, unless overridden by options, see below)
s3cmd sync LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR
Disk usage by buckets
s3cmd du [s3://BUCKET[/PREFIX]]
Get various information about Buckets or Files
s3cmd info s3://BUCKET[/OBJECT]
Copy object
s3cmd cp s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Modify object metadata
s3cmd modify s3://BUCKET1/OBJECT
Move object
s3cmd mv s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Modify Access control list for Bucket or Files
s3cmd setacl s3://BUCKET[/OBJECT]
Modify Bucket Policy
s3cmd setpolicy FILE s3://BUCKET
Delete Bucket Policy
s3cmd delpolicy s3://BUCKET
Modify Bucket CORS
s3cmd setcors FILE s3://BUCKET
Delete Bucket CORS
s3cmd delcors s3://BUCKET
Modify Bucket Requester Pays policy
s3cmd payer s3://BUCKET
Show multipart uploads
s3cmd multipart s3://BUCKET [Id]
Abort a multipart upload
s3cmd abortmp s3://BUCKET/OBJECT Id
List parts of a multipart upload
s3cmd listmp s3://BUCKET/OBJECT Id
Enable/disable bucket access logging
s3cmd accesslog s3://BUCKET
Sign arbitrary string using the secret key
s3cmd sign STRING-TO-SIGN
Sign an S3 URL to provide limited public access with expiry
s3cmd signurl s3://BUCKET/OBJECT <expiry_epoch|+expiry_offset>
Fix invalid file names in a bucket
s3cmd fixbucket s3://BUCKET[/PREFIX]
Create Website from bucket
s3cmd ws-create s3://BUCKET
Delete Website
s3cmd ws-delete s3://BUCKET
Info about Website
s3cmd ws-info s3://BUCKET
Set or delete expiration rule for the bucket
s3cmd expire s3://BUCKET
Upload a lifecycle policy for the bucket
s3cmd setlifecycle FILE s3://BUCKET
Get a lifecycle policy for the bucket
s3cmd getlifecycle s3://BUCKET
Remove a lifecycle policy for the bucket
s3cmd dellifecycle s3://BUCKET
List CloudFront distribution points
s3cmd cflist
Display CloudFront distribution point parameters
s3cmd cfinfo [cf://DIST_ID]
Create CloudFront distribution point
s3cmd cfcreate s3://BUCKET
Delete CloudFront distribution point
s3cmd cfdelete cf://DIST_ID
Change CloudFront distribution point parameters
s3cmd cfmodify cf://DIST_ID
Display CloudFront invalidation request(s) status
s3cmd cfinvalinfo cf://DIST_ID[/INVAL_ID]
aws-cli
aws-cli is the official cli provided by AWS.
$ aws s3 ls s3://mybucket
LastWriteTime Length Name
------------ ------ ----
PRE myfolder/
2013-09-03 10:00:00 1234 myfile.txt
$ aws s3 cp myfolder s3://mybucket/myfolder --recursive
upload: myfolder/file1.txt to s3://mybucket/myfolder/file1.txt
upload: myfolder/subfolder/file1.txt to s3://mybucket/myfolder/subfolder/file1.txt
$ aws s3 sync myfolder s3://mybucket/myfolder --exclude *.tmp
upload: myfolder/newfile.txt to s3://mybucket/myfolder/newfile.txt
azcopy
azcopy is the new Azure Storage data transfer utility.
azcopy copy 'C:\myDirectory\myTextFile.txt' 'https://mystorageaccount.blob.core.windows.net/mycontainer/myTextFile.txt'
azcopy copy 'https://mystorageaccount.blob.core.windows.net/mycontainer/myTextFile.txt' 'C:\myDirectory\myTextFile.txt'
azcopy sync 'C:\myDirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive
gsutil
gsutil is a Python application that lets you access Cloud Storage from the command line.
gsutil cp [OPTION]... src_url dst_url
gsutil cp [OPTION]... src_url... dst_url
gsutil cp [OPTION]... -I dst_url
gsutil mv [-p] src_url dst_url
gsutil mv [-p] src_url... dst_url
gsutil mv [-p] -I dst_url
gsutil rm [-f] [-r] url...
gsutil rm [-f] [-r] -I
Unresolved questions
None.
Future possibilities
None.