Crate arloader

SDK for uploading files in bulk to Arweave.

§CLI

See README.md for usage instructions.

The main CLI application lives entirely in main. It follows a pattern of specifying arguments, matching them, and then passing them to commands. All of the commands are included in the commands module so that they can be used as library functions and reused in other command line applications.
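The specify, match, dispatch pattern described above can be sketched with the standard library alone. Everything in this sketch is hypothetical: `SketchCommand`, `match_args`, `dispatch`, and the two stand-in command functions are illustrative names, not the crate's actual argument parser or its commands module.

```rust
/// Hypothetical set of parsed commands (illustrative, not the crate's own).
#[derive(Debug, PartialEq)]
enum SketchCommand {
    Estimate { glob: String },
    Upload { glob: String },
    Help,
}

/// Step 1 and 2: match the raw arguments against the commands we know.
fn match_args(args: &[&str]) -> SketchCommand {
    match args {
        ["estimate", glob] => SketchCommand::Estimate { glob: glob.to_string() },
        ["upload", glob] => SketchCommand::Upload { glob: glob.to_string() },
        _ => SketchCommand::Help,
    }
}

/// Stand-ins for functions that would live in a commands module.
/// They only return a message so the sketch stays self-contained.
fn command_estimate(glob: &str) -> String {
    format!("estimating cost for {}", glob)
}

fn command_upload(glob: &str) -> String {
    format!("uploading {}", glob)
}

/// Step 3: pass the matched arguments to the corresponding command function.
fn dispatch(command: SketchCommand) -> String {
    match command {
        SketchCommand::Estimate { glob } => command_estimate(&glob),
        SketchCommand::Upload { glob } => command_upload(&glob),
        SketchCommand::Help => "usage: <estimate|upload> <glob>".to_string(),
    }
}
```

Because the command functions are plain functions that take already-parsed values, another binary could call them directly without going through `match_args`, which is the reuse the paragraph above describes.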

§Library

§Overview

The library is focused on uploading files as efficiently as possible. Arweave has two different transaction formats and two different upload formats. Transactions can either be normal, single data item transactions (see transaction format for details), or bundle transactions (see bundle format for details). The bundle format, introduced mid-2021, bundles together individual data items into larger transactions, making uploading much more efficient and reducing network congestion. The library supports both formats, with the recommended approach being to use the bundle format.

There are also two upload formats: whole transactions, which can be uploaded to the tx/ endpoint if they are less than 12 MB, and chunked transactions, which are uploaded in 256 KB chunks to the chunk/ endpoint. Arloader includes functionality for both formats.
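As a rough illustration of the choice between the two upload formats, the sketch below picks an endpoint from the data size and counts the chunks a chunked upload would need. The names `UploadFormat`, `choose_upload_format`, and `ASSUMED_MAX_TX_DATA` are hypothetical, and the byte values are assumptions derived from the "12 MB" and "256 KB" figures in the text; the crate's own MAX_TX_DATA constant may differ.

```rust
/// Chunk size from the text (256 KB), assumed here to mean 256 * 1024 bytes.
const CHUNK_SIZE: u64 = 256 * 1024;

/// Assumed stand-in for the crate's `MAX_TX_DATA`; only the "12 MB" figure
/// comes from the documentation, the exact byte count is a guess.
const ASSUMED_MAX_TX_DATA: u64 = 12 * 1024 * 1024;

/// Hypothetical enum describing where the data would be posted.
#[derive(Debug, PartialEq)]
enum UploadFormat {
    /// Post the whole transaction to the `tx/` endpoint.
    Whole,
    /// Post the data in pieces to the `chunk/` endpoint.
    Chunked { chunks: u64 },
}

fn choose_upload_format(data_len: u64) -> UploadFormat {
    if data_len < ASSUMED_MAX_TX_DATA {
        UploadFormat::Whole
    } else {
        // Ceiling division: a trailing partial chunk still needs its own request.
        let chunks = (data_len + CHUNK_SIZE - 1) / CHUNK_SIZE;
        UploadFormat::Chunked { chunks }
    }
}
```

Under these assumptions, a file of exactly 12 * 1024 * 1024 bytes would go to chunk/ in 48 requests, and one extra byte would add a 49th.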

§Transactions and DataItems

Both transaction formats start with chunking file data and creating merkle trees from the chunks. The merkle tree logic can be found in the merkle module. All of the hashing functions and other crypto operations are in the crypto module. Once the data is chunked, hashed, and a merkle root calculated for it, it gets incorporated into either a Transaction, which can be found in the transaction module, or if it is going to be included in a bundle format transaction, a DataItem, which can be found in the bundle module.

§Tags

Tags are structs with name and value properties that can be included with either Transactions or DataItems. One subtlety is that for Transactions, Arweave expects both the name and the value to be base64 url encoded strings, whereas for DataItems, Arweave expects utf8-encoded strings. Tags have been implemented for both types as Tag<Base64> and Tag<String>. Another subtlety is that Tags for DataItems are serialized and deserialized using avro_rs, the schema for which is implemented in bundle::get_tags_schema.
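The difference between the two tag encodings can be illustrated with a small standard-library sketch. `SketchTag`, `B64Url`, `transaction_tag`, and `data_item_tag` are hypothetical names and do not reproduce the crate's Tag<Base64> and Tag<String> types. The hand-written encoder assumes the unpadded base64 url alphabet, and the Avro serialization of DataItem tags is left out.

```rust
/// URL-safe base64 alphabet (RFC 4648, section 5).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode bytes as base64 url without padding (assumed to be what Arweave expects).
fn base64_url_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() * 4 + 2) / 3);
    for group in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit value, zero-filling missing bytes.
        let b0 = group[0] as u32;
        let b1 = *group.get(1).unwrap_or(&0) as u32;
        let b2 = *group.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Only emit the characters that carry real input bits.
        if group.len() > 1 {
            out.push(ALPHABET[(n >> 6) as usize & 63] as char);
        }
        if group.len() > 2 {
            out.push(ALPHABET[n as usize & 63] as char);
        }
    }
    out
}

/// Hypothetical wrapper marking a string as base64 url encoded.
#[derive(Debug, PartialEq)]
struct B64Url(String);

/// Hypothetical tag type, generic over how name and value are represented.
#[derive(Debug, PartialEq)]
struct SketchTag<T> {
    name: T,
    value: T,
}

/// Transaction tags: name and value are base64 url encoded.
fn transaction_tag(name: &str, value: &str) -> SketchTag<B64Url> {
    SketchTag {
        name: B64Url(base64_url_encode(name.as_bytes())),
        value: B64Url(base64_url_encode(value.as_bytes())),
    }
}

/// DataItem tags: name and value stay as plain UTF-8 strings.
fn data_item_tag(name: &str, value: &str) -> SketchTag<String> {
    SketchTag { name: name.to_string(), value: value.to_string() }
}
```

Making the representation a type parameter means a transaction tag cannot be passed where a data item tag is expected, which is presumably the motivation for the crate's two Tag instantiations.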

A Tag with a name property of Content-Type is used by the Arweave gateways to communicate the mime type of the related content to browsers. Arloader creates a content type tag based on file extension if one is provided or from the bytes of the data using magic numbers if not.
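A minimal sketch of that lookup order, extension first and magic numbers second, might look like the following. The function names, the small lookup tables, and the `application/octet-stream` fallback are assumptions for illustration; the crate's actual detection logic and supported types are not shown here.

```rust
/// Assumed fallback when neither the extension nor the bytes are recognized.
const FALLBACK: &str = "application/octet-stream";

/// Tiny illustrative extension table (case-insensitive).
fn from_extension(ext: &str) -> Option<&'static str> {
    match ext.to_ascii_lowercase().as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "gif" => Some("image/gif"),
        "json" => Some("application/json"),
        "html" => Some("text/html"),
        _ => None,
    }
}

/// Tiny illustrative magic-number table based on well-known file signatures.
fn from_magic_numbers(data: &[u8]) -> Option<&'static str> {
    if data.starts_with(b"\x89PNG\r\n\x1a\n") {
        Some("image/png")
    } else if data.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("image/jpeg")
    } else if data.starts_with(b"GIF87a") || data.starts_with(b"GIF89a") {
        Some("image/gif")
    } else if data.starts_with(b"%PDF-") {
        Some("application/pdf")
    } else {
        None
    }
}

/// Prefer the extension; fall back to sniffing the leading bytes.
/// Note: this sketch also sniffs when an extension is present but unknown.
fn content_type(extension: Option<&str>, data: &[u8]) -> &'static str {
    extension
        .and_then(from_extension)
        .or_else(|| from_magic_numbers(data))
        .unwrap_or(FALLBACK)
}
```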

§Bytes and Base64Url Data

The library stores all data, signatures and addresses as a Base64 struct with methods implemented for serializing and deserializing the underlying bytes to and from the base64 url format required for uploading to Arweave.

§Signing

A key part of constructing transactions is signing them. Arweave has a specific algorithm for generating the digest that gets signed and then hashed to serve as a transaction id, called deepHash. It takes various Transaction or DataItem elements, including nested arrays of Tags, and successively hashes and concatenates them together. Arloader assembles the required elements via the ToItems trait, which is implemented separately as Transaction::to_deep_hash_item and DataItem::to_deep_hash_item for each transaction format. crypto::Provider::deep_hash is Arloader’s implementation of the deep hash algorithm.
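The recursive shape of a deep hash can be sketched as below. This is an illustration under stated assumptions, not the crate's crypto::Provider::deep_hash: `SketchItem` and this `deep_hash` function are hypothetical, the `blob`/`list` length-tagging scheme follows my understanding of the Arweave reference implementation, and the hash function is passed in as a parameter because the standard library has no SHA-384 (the hash I understand Arweave to use).

```rust
/// Hypothetical input tree: either raw bytes or a nested list of items.
enum SketchItem {
    Blob(Vec<u8>),
    List(Vec<SketchItem>),
}

/// Recursively hash an item tree with a caller-supplied hash function.
fn deep_hash<H>(item: &SketchItem, hash: &H) -> Vec<u8>
where
    H: Fn(&[u8]) -> Vec<u8>,
{
    match item {
        SketchItem::Blob(data) => {
            // Hash a length-tagged label and the data separately,
            // then hash the concatenation of the two digests.
            let tag = format!("blob{}", data.len());
            let mut joined = hash(tag.as_bytes());
            joined.extend(hash(data));
            hash(&joined)
        }
        SketchItem::List(children) => {
            // Start from the hash of a length-tagged label, then fold in
            // each child's deep hash one at a time.
            let tag = format!("list{}", children.len());
            let mut acc = hash(tag.as_bytes());
            for child in children {
                let mut joined = acc;
                joined.extend(deep_hash(child, hash));
                acc = hash(&joined);
            }
            acc
        }
    }
}
```

Passing an identity function as `hash` is a convenient way to inspect the structure: a single-byte blob `x` then yields the bytes `blob1x`, and a list containing only that blob yields `list1blob1x`.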

§Higher Level Functions

The functions for creating Transactions and bundles of DataItems are all consolidated on the Arweave struct. In general, there are lower level functions for creating single items from data that are then composed in successively higher level functions to allow multiple items to be created from collections of file paths and ultimately upload streams of transactions to Arweave.

§Status Tracking

The library includes additional functionality to track and report on transaction statuses. There are two status structs, Status and BundleStatus, used for this purpose. They share essentially the same format, except that BundleStatus includes references to all of the DataItems in the bundle, whereas Status refers to a single Transaction.

§Solana

The functions for allowing payment to be made in SOL can be found in the solana module.

Modules§

bundle
Data structure and functionality to create, serialize and deserialize DataItems.
commands
Functions for CLI commands, composed of library functions.
crypto
Functionality for creating and verifying signatures and hashing.
error
Errors propagated by library functions.
merkle
Functionality for chunking file data and calculating and verifying root ids.
solana
Functionality for funding transactions in SOL.
status
Data structures for reporting transaction statuses.
transaction
Data structures for serializing and deserializing Transactions and Tags.
utils
Async TempDir for testing.

Structs§

Arweave
Struct with methods for interacting with the Arweave network.
OraclePrice
OraclePricePair
PathsChunk
Tuple struct with two elements: a chunk of paths and the aggregate data size of those paths.

Constants§

BLOCK_SIZE
Block size used for pricing calculations = 256 KB
CHUNKS_BUFFER_FACTOR
Multiplier applied to the buffer argument from the CLI to determine the maximum number of simultaneous requests to the chunk/ endpoint.
CHUNKS_RETRIES
Number of times to retry posting chunks if not successful.
CHUNKS_RETRY_SLEEP
Number of seconds to wait between retries when posting a failed chunk.
MAX_TX_DATA
Maximum data size to send to the tx/ endpoint. Data above this size is sent to the chunk/ endpoint.
WINSTONS_PER_AR
Winstons are a subunit of the native Arweave network token, AR. There are 10^12 Winstons per AR.
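
As a worked example of the Winston to AR relationship, the sketch below formats a Winston amount as a decimal AR string using integer arithmetic only. The constant is redefined locally with the value stated above, and `format_winstons_as_ar` is a hypothetical helper, not a function exported by the crate.

```rust
/// 10^12 Winstons per AR, as stated in the constant's description.
const WINSTONS_PER_AR: u64 = 1_000_000_000_000;

/// Format a Winston amount as AR with all twelve decimal places,
/// avoiding floating point so that no precision is lost.
fn format_winstons_as_ar(winstons: u64) -> String {
    format!(
        "{}.{:012}",
        winstons / WINSTONS_PER_AR,
        winstons % WINSTONS_PER_AR
    )
}
```

For instance, 1_500_000_000_000 Winstons formats as `1.500000000000` and a single Winston as `0.000000000001`.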

Functions§

file_stem_is_valid_txid
Used when updating BundleStatus structs to determine whether a file stem includes a valid transaction id.
update_bundle_statuses_stream
Queries network and updates locally stored BundleStatus structs.
update_statuses_stream
Queries network and updates locally stored Status structs.
upload_bundles_stream
Uploads a stream of bundles from Vec<PathsChunk>s.
upload_bundles_stream_with_sol
Uploads a stream of bundles from Vec<PathsChunk>s, paying with SOL.
upload_files_stream
Uploads files matching glob pattern, returning a stream of Status structs.
upload_files_with_sol_stream
Uploads files matching glob pattern, returning a stream of Status structs, paying with SOL.
upload_transaction_chunks_stream
Uploads a stream of chunks from Vec<Chunk>s.