SDK for uploading files in bulk to Arweave.
CLI
See README.md for usage instructions.
The main CLI application is all in main and follows a pattern of specifying arguments, matching them, and then in turn passing them to commands, all of which are included in the commands module in order to facilitate their use as library functions and re-use in other command line applications.
Library
Overview
The library is focused on uploading files as efficiently as possible. Arweave has two different transaction formats and two different upload formats. Transactions can either be normal, single data item transactions (see transaction format for details), or bundle transactions (see bundle format for details). The bundle format, introduced mid-2021, bundles together individual data items into larger transactions, making uploading much more efficient and reducing network congestion. The library supports both formats, with the recommended approach being to use the bundle format.
There are also two upload formats: whole transactions, which can be uploaded to the tx/ endpoint if they are less than 12 MB, and chunked transactions, which get uploaded in 256 KB chunks to the chunk/ endpoint. Arloader includes functionality for both formats.
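The choice between the two upload formats can be sketched as a size check. This is a minimal illustration, not Arloader's implementation: the names UploadPlan and plan_upload are hypothetical, and KB and MB are taken as binary units (1024-based), which the text above does not specify.

```rust
/// Size of one upload chunk: 256 KB, taken here as 256 * 1024 bytes (assumption).
const CHUNK_SIZE: u64 = 256 * 1024;
/// Upper bound for whole uploads: 12 MB, taken here as 12 * 1024 * 1024 bytes (assumption).
const MAX_WHOLE_TX_SIZE: u64 = 12 * 1024 * 1024;

/// Hypothetical type describing how a payload would be uploaded.
#[derive(Debug, PartialEq)]
enum UploadPlan {
    /// Post the whole transaction to the `tx/` endpoint.
    Whole,
    /// Post the data to the `chunk/` endpoint in this many chunks.
    Chunked { chunks: u64 },
}

/// Picks an upload format from the payload size in bytes.
fn plan_upload(data_size: u64) -> UploadPlan {
    if data_size < MAX_WHOLE_TX_SIZE {
        UploadPlan::Whole
    } else {
        // Round up so that a trailing partial chunk is counted.
        let chunks = (data_size + CHUNK_SIZE - 1) / CHUNK_SIZE;
        UploadPlan::Chunked { chunks }
    }
}
```

With these assumptions a 1 KB file is planned as a whole upload, while a payload of exactly 12 MB is planned as 48 chunks.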
Transactions and DataItems
Both transaction formats start with chunking file data and creating merkle trees from the chunks. The merkle tree logic can be found in the merkle module. All of the hashing functions and other crypto operations are in the crypto module. Once the data is chunked, hashed, and a merkle root calculated for it, it gets incorporated into either a Transaction, which can be found in the transaction module, or, if it is going to be included in a bundle format transaction, a DataItem, which can be found in the bundle module.
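The chunk-then-reduce shape of this step can be sketched as follows. It is a simplified illustration, not the logic in the merkle module: chunk_data and merkle_root are hypothetical names, the combining function is left to the caller (a real tree would hash the two child nodes), chunk size is taken as 256 * 1024 bytes, and promoting an unpaired node unchanged is an assumption about how odd layers are handled.

```rust
/// Chunk size in bytes: 256 KB, taken here as 256 * 1024 (assumption).
const CHUNK_SIZE: usize = 256 * 1024;

/// Splits `data` into consecutive chunks of at most `CHUNK_SIZE` bytes.
fn chunk_data(data: &[u8]) -> Vec<&[u8]> {
    data.chunks(CHUNK_SIZE).collect()
}

/// Reduces leaf nodes to a single root by combining neighbours pairwise,
/// layer by layer. A node left without a partner is promoted unchanged
/// (assumption). Returns `None` for an empty input.
fn merkle_root<T, F>(leaves: Vec<T>, combine: F) -> Option<T>
where
    F: Fn(&T, &T) -> T,
{
    let mut layer = leaves;
    while layer.len() > 1 {
        let mut next = Vec::with_capacity((layer.len() + 1) / 2);
        let mut nodes = layer.into_iter();
        while let Some(left) = nodes.next() {
            match nodes.next() {
                Some(right) => next.push(combine(&left, &right)),
                None => next.push(left),
            }
        }
        layer = next;
    }
    layer.pop()
}
```

Passing a string-concatenating closure as combine makes the tree shape visible: leaves a, b, c reduce to ((ab)c).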
Tags
Tags are structs with name and value properties that can be included with either Transactions or DataItems. One subtlety is that for Transactions, Arweave expects the content at each key to be a base64 url encoded string, whereas for DataItems, Arweave expects utf8-encoded strings. Tags have been implemented for both types as Tag<Base64> and Tag<String>. Another subtlety is that Tags for DataItems are serialized and deserialized using avro_rs, the schema of which is implemented in bundle::get_tags_schema.
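The split between the two tag types can be sketched with a generic struct. The definitions below are hypothetical stand-ins, not the library's own: Base64 here is a bare newtype over bytes, and from_utf8_strs is an illustrative constructor name.

```rust
/// Hypothetical stand-in for the library's Base64 wrapper: raw bytes that
/// would be written out as base64 url text when serialized.
#[derive(Debug, Clone, PartialEq)]
struct Base64(Vec<u8>);

/// Generic tag whose name and value are either `Base64` or `String`.
#[derive(Debug, Clone, PartialEq)]
struct Tag<T> {
    name: T,
    value: T,
}

impl Tag<Base64> {
    /// Builds a tag for a Transaction: both fields hold the utf8 bytes,
    /// to be base64 url encoded on serialization.
    fn from_utf8_strs(name: &str, value: &str) -> Self {
        Tag {
            name: Base64(name.as_bytes().to_vec()),
            value: Base64(value.as_bytes().to_vec()),
        }
    }
}

impl Tag<String> {
    /// Builds a tag for a DataItem: both fields stay plain utf8 strings.
    fn from_utf8_strs(name: &str, value: &str) -> Self {
        Tag {
            name: name.to_string(),
            value: value.to_string(),
        }
    }
}
```

Because the two impl blocks target different concrete types, the caller picks the encoding by naming the type, for example Tag::<Base64>::from_utf8_strs("Content-Type", "text/plain").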
A Tag with a name property of Content-Type is used by the Arweave gateways to communicate the mime type of the related content to browsers. Arloader creates a content type tag based on file extension if one is provided or from the bytes of the data using magic numbers if not.
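The extension-first, magic-number-second lookup could look roughly like the sketch below. The function names, the small tables of extensions and signatures, and the application/octet-stream fallback are all assumptions made for illustration; they are not taken from Arloader. Unlike the description above, this sketch also falls through to the magic numbers when an extension is present but not in its table.

```rust
use std::path::Path;

/// Looks up a mime type from the file extension, if the path has one
/// that appears in this (deliberately tiny) table.
fn content_type_from_extension(path: &Path) -> Option<&'static str> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "gif" => Some("image/gif"),
        "json" => Some("application/json"),
        "txt" => Some("text/plain"),
        _ => None,
    }
}

/// Looks up a mime type from well-known leading bytes (magic numbers).
fn content_type_from_magic(data: &[u8]) -> Option<&'static str> {
    if data.starts_with(&[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]) {
        Some("image/png")
    } else if data.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("image/jpeg")
    } else if data.starts_with(b"GIF87a") || data.starts_with(b"GIF89a") {
        Some("image/gif")
    } else {
        None
    }
}

/// Prefers the extension, then the magic numbers, then a generic fallback
/// (the fallback value is an assumption).
fn content_type(path: &Path, data: &[u8]) -> &'static str {
    content_type_from_extension(path)
        .or_else(|| content_type_from_magic(data))
        .unwrap_or("application/octet-stream")
}
```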
Bytes and Base64Url Data
The library stores all data, signatures and addresses as a Base64 struct with methods implemented for serializing and deserializing the underlying bytes to and from the base64 url format required for uploading to Arweave.
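A byte wrapper of this kind can be sketched with the standard Display and FromStr traits. This is an illustration under assumptions, not the library's Base64 type: it uses the URL-safe alphabet without padding, and the decoder is lenient in that it does not reject leftover bits or impossible input lengths.

```rust
use std::fmt;
use std::str::FromStr;

/// URL-safe base64 alphabet ('-' and '_' in place of '+' and '/').
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Hypothetical wrapper around raw bytes.
#[derive(Debug, Clone, PartialEq)]
struct Base64(Vec<u8>);

impl fmt::Display for Base64 {
    /// Writes the bytes as unpadded base64 url text.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut out = String::with_capacity((self.0.len() * 4 + 2) / 3);
        for group in self.0.chunks(3) {
            // Pack up to three bytes into the top 24 bits of a word.
            let mut word = 0u32;
            for (i, byte) in group.iter().enumerate() {
                word |= (*byte as u32) << (16 - 8 * i);
            }
            // n input bytes produce n + 1 output characters.
            for i in 0..=group.len() {
                let index = (word >> (18 - 6 * i)) & 0x3F;
                out.push(ALPHABET[index as usize] as char);
            }
        }
        f.write_str(&out)
    }
}

impl FromStr for Base64 {
    type Err = String;

    /// Parses unpadded base64 url text back into bytes.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut bytes = Vec::with_capacity(s.len() * 3 / 4);
        let mut buffer = 0u32;
        let mut bits = 0u32;
        for c in s.bytes() {
            let value = ALPHABET
                .iter()
                .position(|&a| a == c)
                .ok_or_else(|| format!("invalid base64 url character: {:?}", c as char))?;
            // Append six bits and emit a byte whenever eight are available.
            buffer = (buffer << 6) | value as u32;
            bits += 6;
            if bits >= 8 {
                bits -= 8;
                bytes.push((buffer >> bits) as u8);
            }
        }
        Ok(Base64(bytes))
    }
}
```

With this sketch, Base64(b"hello".to_vec()).to_string() yields aGVsbG8, and parsing that string returns the original bytes.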
Signing
A key part of constructing transactions is signing them. Arweave has a specific algorithm, called deepHash, for generating the digest that gets signed; the resulting signature is then hashed to serve as the transaction id. deepHash takes various Transaction or DataItem elements, including nested arrays of Tags, and successively hashes and concatenates them together. Arloader assembles the required elements via the ToItems trait, which is implemented separately as Transaction::to_deep_hash_item and DataItem::to_deep_hash_item for each transaction format. crypto::Provider::deep_hash is Arloader’s implementation of the deep hash algorithm.
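The recursive structure of the deep hash can be sketched as below. This is an illustration of the algorithm's shape rather than Arloader's code: the DeepHashItem enum and function names are hypothetical, and the hash function is passed in by the caller. Arweave uses SHA-384 for this step; the tracing_hash stand-in here is not a hash at all and only wraps its input as h(...) so that the nesting is visible.

```rust
/// Hypothetical item type: either raw bytes or a nested list of items.
enum DeepHashItem {
    Blob(Vec<u8>),
    List(Vec<DeepHashItem>),
}

/// Recursively hashes an item with the caller-supplied `hash` function.
fn deep_hash<H>(item: &DeepHashItem, hash: &H) -> Vec<u8>
where
    H: Fn(&[u8]) -> Vec<u8>,
{
    match item {
        DeepHashItem::Blob(data) => {
            // hash( hash("blob" + length) ++ hash(data) )
            let tag = format!("blob{}", data.len());
            let mut tagged = hash(tag.as_bytes());
            tagged.extend(hash(data.as_slice()));
            hash(tagged.as_slice())
        }
        DeepHashItem::List(children) => {
            // Start from hash("list" + count), then fold each child in:
            // acc = hash( acc ++ deep_hash(child) )
            let tag = format!("list{}", children.len());
            let mut acc = hash(tag.as_bytes());
            for child in children {
                acc.extend(deep_hash(child, hash));
                acc = hash(acc.as_slice());
            }
            acc
        }
    }
}

/// Stand-in for a real hash (NOT cryptographic): wraps the input as
/// `h(...)` so the order of hashing and concatenation can be read off.
fn tracing_hash(input: &[u8]) -> Vec<u8> {
    let mut out = b"h(".to_vec();
    out.extend_from_slice(input);
    out.push(b')');
    out
}
```

Running deep_hash over a two-byte blob "ab" with tracing_hash produces h(h(blob2)h(ab)), which shows the length tag and the data being hashed separately before the combined hash.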
Higher Level Functions
The functions for creating Transactions and bundles of DataItems are all consolidated on the Arweave struct.
In general, there are lower level functions for creating single items from data that are then composed in successively
higher level functions to allow multiple items to be created from collections of file paths and ultimately upload streams
of transactions to Arweave.
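One intermediate step in such a composition is grouping file paths into bundle-sized batches. The sketch below shows one way that grouping could work; group_paths is a hypothetical name, the PathsChunk definition only mirrors the tuple struct described under Structs, and file sizes are passed in directly rather than read from disk.

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the tuple struct described under Structs:
/// a group of paths and the aggregate data size of those paths.
#[derive(Debug, PartialEq)]
struct PathsChunk(Vec<PathBuf>, u64);

/// Groups `(path, size)` pairs, in order, so that each group's aggregate
/// size stays within `max_size`. A single file larger than `max_size`
/// ends up in a group of its own.
fn group_paths(files: Vec<(PathBuf, u64)>, max_size: u64) -> Vec<PathsChunk> {
    let mut groups = Vec::new();
    let mut paths = Vec::new();
    let mut total = 0u64;
    for (path, size) in files {
        // Close the current group if adding this file would exceed the limit.
        if !paths.is_empty() && total + size > max_size {
            groups.push(PathsChunk(std::mem::take(&mut paths), total));
            total = 0;
        }
        paths.push(path);
        total += size;
    }
    if !paths.is_empty() {
        groups.push(PathsChunk(paths, total));
    }
    groups
}
```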
Status Tracking
The library includes additional functionality to track and report on transaction statuses. There are two status structs, Status and BundleStatus, used for these purposes. They are essentially the same format, except that BundleStatus is modified to include references to all of the included DataItems instead of just the single Transaction referenced by Status.
Solana
The functions for allowing payment to be made in SOL can be found in the solana module.
Modules
Functions for CLI commands composed of library functions.
Functionality for creating and verifying signatures and hashing.
Errors propagated by library functions.
Functionality for chunking file data and calculating and verifying root ids.
Functionality for funding transactions in SOL.
Data structures for reporting transaction statuses.
Data structures for serializing and deserializing Transactions and Tags.
Structs
Struct with methods for interacting with the Arweave network.
Tuple struct that includes two elements: a chunk of paths and the aggregate data size of those paths.
Constants
Block size used for pricing calculations = 256 KB
Multiplier applied to the buffer argument from the CLI to determine the maximum number of simultaneous requests to the chunk/ endpoint.
Number of times to retry posting chunks if not successful.
Number of seconds to wait between retrying to post a failed chunk.
Maximum data size to send to tx/ endpoint. Sent to chunk/ endpoint above this.
Winstons are a sub unit of the native Arweave network token, AR. There are 10^12 Winstons per AR.
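Since one AR is 10^12 Winstons, amounts can be converted for display with integer arithmetic alone, which avoids floating point rounding. The constant and function names below are hypothetical and chosen only for this sketch.

```rust
/// Number of Winstons in one AR (10^12).
const WINSTONS_PER_AR: u64 = 1_000_000_000_000;

/// Formats a Winston amount as a decimal AR string with twelve
/// fractional digits, using only integer division and remainder.
fn winstons_to_ar_string(winstons: u64) -> String {
    format!(
        "{}.{:012}",
        winstons / WINSTONS_PER_AR,
        winstons % WINSTONS_PER_AR
    )
}
```

For example, 1_500_000_000_000 Winstons formats as 1.500000000000 and a single Winston as 0.000000000001.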
Functions
Used in updating BundleStatuses to determine whether a file stem includes a valid transaction id.
Queries network and updates locally stored BundleStatus structs.
Queries network and updates locally stored Status structs.
Uploads a stream of bundles from Vec<PathsChunk>s.
Uploads a stream of bundles from Vec<PathsChunk>s, paying with SOL.
Uploads files matching glob pattern, returning a stream of Status structs.
Uploads files matching glob pattern, returning a stream of Status structs, paying with SOL.
Uploads a stream of chunks from Vec<Chunk>s.