pkgar 0.2.0

Redox Package Archive

pkgar - Package Archive

pkgar refers to three related items - the file format, the library, and the command line executable.

The pkgar format is not designed to be the best format for all archive uses, only the best default format for packages on Redox OS. It is reproducible, meaning archiving a directory will produce the same results every time. It provides cryptographic signatures and integrity checking for package files. It also allows this functionality to be used without storing the entire package archive, by only storing the package header. It is not optimized for large files, compression, encryption, or random access. Little endian is currently assumed, as well as Unix mode flags.

This specification is currently a work in progress.

File Format - .pkgar

pkgar is a format for packages that may be delivered in a single file (.pkgar), or as a header file (.pkgar_head) with an associated data file (.pkgar_data). The purpose of the split is to allow downloading only the header and verifying local files before downloading any file data. Concatenating the header and data files creates a valid single file: cat example.pkgar_head example.pkgar_data > example.pkgar

All data specified below is little-endian.

Header Portion

The header portion is designed to contain the data required to verify files already installed on disk. It is signed using NaCl (or a compatible implementation such as libsodium), and contains the blake3, offset, size, mode, and name of each file. The user and group IDs are left out intentionally, to support the installation of a package either as root or as a user, for example, in the user's home directory.

Header Struct

The size of the header struct is 136 bytes. All fields are packed.

  • signature - 512-bit (64 byte) NaCl signature of header data
  • public_key - 256-bit (32 byte) NaCl public key used to generate signature
  • blake3 - 256-bit (32 byte) blake3 sum of the entry data
  • count - 32-bit count of entry structs, which start immediately after the header struct
  • flags - 32-bit bitflags describing what data the package contains (see Data Flags)
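
As an illustration of the layout above, the following Rust sketch decodes the five header fields from a byte slice. It assumes the fields are stored in the order listed, with no padding and little-endian integers, as the text states. The names Header, HEADER_SIZE, and parse_header are hypothetical, chosen for this example, and are not the pkgar library's API. No signature verification is performed here.

```rust
/// Size of the header struct in bytes: 64 + 32 + 32 + 4 + 4.
pub const HEADER_SIZE: usize = 136;

/// Decoded header fields (hypothetical type, not the pkgar crate's own).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Header {
    pub signature: [u8; 64],
    pub public_key: [u8; 32],
    pub blake3: [u8; 32],
    pub count: u32,
    pub flags: u32,
}

/// Decode the first 136 bytes of a .pkgar or .pkgar_head file.
/// Assumes the fields appear in the documented order, packed, little-endian.
/// Returns None if the input is shorter than the header struct.
pub fn parse_header(bytes: &[u8]) -> Option<Header> {
    let b = bytes.get(..HEADER_SIZE)?;

    let mut signature = [0u8; 64];
    signature.copy_from_slice(&b[0..64]);
    let mut public_key = [0u8; 32];
    public_key.copy_from_slice(&b[64..96]);
    let mut blake3 = [0u8; 32];
    blake3.copy_from_slice(&b[96..128]);

    let count = u32::from_le_bytes([b[128], b[129], b[130], b[131]]);
    let flags = u32::from_le_bytes([b[132], b[133], b[134], b[135]]);

    Some(Header { signature, public_key, blake3, count, flags })
}
```

Copying the fields out one at a time avoids taking references into a packed struct, at the cost of a few explicit offsets.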

Data Flags

The data flags describe what data the package contains, stored as a 32-bit field.

  • bits 0-7, enumeration from 0-255 representing the data and entry struct version:
    • 0: initial version
    • others: reserved
  • bits 8-15, enumeration from 0-255 representing the binary architecture it contains:
    • 0: architecture-independent
    • 1: x86_64, base arch (x86_64-v1)
    • 2: 32 bit x86, base arch (i586)
    • 3: aarch64, base arch (Armv8-A)
    • 4: riscv64, base arch (extension GC)
    • others: reserved
  • bits 16-23, enumeration from 0-255 representing how the data file is packaged:
    • 0: not compressed
    • 1: LZMA2, per-entry data file compression
    • others: reserved
  • bits 24-31, reserved
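
A minimal Rust sketch of unpacking these fields follows. It assumes each 0-255 enumeration occupies exactly one byte of the 32-bit value, with the version in the lowest byte, then the architecture, then the packaging. That reading is an assumption of this example rather than something confirmed against the implementation. DataFlags, decode_flags, and architecture_name are hypothetical names.

```rust
/// Fields unpacked from the 32-bit data flags (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DataFlags {
    pub version: u8,
    pub architecture: u8,
    pub packaging: u8,
}

/// Split the flags into their enumerations.
/// Assumption: one byte per enumeration, lowest byte first; top byte reserved.
pub fn decode_flags(flags: u32) -> DataFlags {
    DataFlags {
        version: (flags & 0xFF) as u8,
        architecture: ((flags >> 8) & 0xFF) as u8,
        packaging: ((flags >> 16) & 0xFF) as u8,
    }
}

/// Map an architecture code to a short name; reserved codes yield None.
pub fn architecture_name(code: u8) -> Option<&'static str> {
    match code {
        0 => Some("architecture-independent"),
        1 => Some("x86_64"),
        2 => Some("i586"),
        3 => Some("aarch64"),
        4 => Some("riscv64"),
        _ => None,
    }
}
```

Under this assumption, a flags value of 0x00010300 would decode to version 0, architecture 3 (aarch64), and packaging 1 (LZMA2).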

Entry Struct

The size of the entry struct is 308 bytes. All fields are packed.

  • blake3 - 256-bit (32 byte) blake3 sum of the file data
  • offset - 64-bit offset of file data in the data portion
  • size - 64-bit size in bytes of the file data in the data portion
  • mode - 32-bit Unix permissions (user, group, other with read, write, execute)
  • path - 256 byte NUL-terminated relative path from extract directory
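
The entry layout can be decoded in the same way as the header. The Rust sketch below assumes the fields are stored in the order listed, packed and little-endian, and it treats a path field with no NUL byte as malformed, which is an assumption of this example. It also computes where the data portion begins in a single .pkgar file, which follows from the header struct, the entry count, and the entry size. Entry, parse_entry, and data_start are hypothetical names.

```rust
pub const HEADER_SIZE: usize = 136;
/// Size of one entry struct in bytes: 32 + 8 + 8 + 4 + 256.
pub const ENTRY_SIZE: usize = 308;

/// Decoded entry fields (hypothetical type, not the pkgar crate's own).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub blake3: [u8; 32],
    pub offset: u64,
    pub size: u64,
    pub mode: u32,
    /// Path bytes up to, but not including, the first NUL.
    pub path: Vec<u8>,
}

/// Read a little-endian u64 from exactly 8 bytes.
fn le_u64(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(b);
    u64::from_le_bytes(a)
}

/// Decode one 308-byte entry struct.
/// Returns None if the input is too short or the path has no NUL terminator.
pub fn parse_entry(bytes: &[u8]) -> Option<Entry> {
    let b = bytes.get(..ENTRY_SIZE)?;
    let mut blake3 = [0u8; 32];
    blake3.copy_from_slice(&b[0..32]);
    let offset = le_u64(&b[32..40]);
    let size = le_u64(&b[40..48]);
    let mode = u32::from_le_bytes([b[48], b[49], b[50], b[51]]);
    let raw_path = &b[52..308];
    let nul = raw_path.iter().position(|&c| c == 0)?;
    Some(Entry { blake3, offset, size, mode, path: raw_path[..nul].to_vec() })
}

/// Byte position of the data portion in a single .pkgar file:
/// the header struct followed by `count` entry structs.
pub fn data_start(count: u32) -> u64 {
    HEADER_SIZE as u64 + u64::from(count) * ENTRY_SIZE as u64
}
```

For a package with two entries, data_start(2) gives 136 + 2 * 308 = 752, so an entry's file data would be found at 752 plus its offset field in the single-file form.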

Data Portion

The data portion is used to look up file data only. It could be compressed to produce a .pkgar_data.gz file, for example. It can be removed after the install is completed. It may contain holes, invalid data, or unreferenced data, so long as the blake3 of each file identified in the header is still valid. Any such holes, invalid data, or unreferenced data should be removed when an archive is rebuilt.

The data format depends on the packaging value in the data flags:

  • 0: Raw data.
  • 1: 64-bit uncompressed size, followed by LZMA2 compressed data.
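
The two layouts can be told apart with a small helper. The Rust sketch below assumes that, for packaging 1, the 64-bit size prefix is little-endian and is stored at the start of each entry's bytes, since the compression is described as per-entry. Both points are assumptions of this example. Decompression itself is left out because the Rust standard library has no LZMA2 decoder. Stored and split_stored are hypothetical names.

```rust
/// One entry's bytes as stored in the data portion (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Stored<'a> {
    /// Packaging 0: the bytes are the file data itself.
    Raw(&'a [u8]),
    /// Packaging 1: a size prefix followed by the LZMA2 stream.
    Lzma2 { uncompressed_size: u64, compressed: &'a [u8] },
}

/// Interpret an entry's stored bytes according to the packaging value.
/// Returns None for reserved packaging values or a truncated size prefix.
pub fn split_stored(packaging: u8, stored: &[u8]) -> Option<Stored<'_>> {
    match packaging {
        0 => Some(Stored::Raw(stored)),
        1 => {
            if stored.len() < 8 {
                return None;
            }
            // Assumption: the prefix is little-endian, like the other fields.
            let (prefix, compressed) = stored.split_at(8);
            let mut a = [0u8; 8];
            a.copy_from_slice(prefix);
            Some(Stored::Lzma2 {
                uncompressed_size: u64::from_le_bytes(a),
                compressed,
            })
        }
        _ => None,
    }
}
```

A caller could use uncompressed_size to size an output buffer before handing the compressed bytes to an LZMA2 decoder of its choice.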

Operation

A reader should first verify that the header portion's signature is valid and was made by a trusted package source. Then, it should locate the entry for the file of interest. If desired, it can check whether a locally cached file matches the referenced blake3. If the cached file does not match, the reader may access the data portion and verify that the data at the offset and size given in the header entry matches the blake3. If it does, the data may be retrieved.
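
The last two steps can be sketched in Rust as follows. This is a minimal sketch under stated assumptions: the header signature is taken to have been verified already, raw (uncompressed) packaging is assumed, and the hash function is passed in as a parameter because the standard library has no blake3. A real reader would supply an actual blake3 implementation there. EntryRef, FetchError, cached_copy_matches, and fetch_verified are hypothetical names.

```rust
use std::convert::TryFrom;

/// The entry fields needed to fetch file data (hypothetical type).
pub struct EntryRef {
    pub blake3: [u8; 32],
    pub offset: u64,
    pub size: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FetchError {
    /// offset + size does not fit inside the data portion.
    OutOfBounds,
    /// The bytes at offset..offset+size do not hash to the entry's blake3.
    HashMismatch,
}

/// Check whether a locally cached file already matches the entry.
/// `hash` stands in for a real blake3 function (an assumption of this sketch).
pub fn cached_copy_matches<H>(entry: &EntryRef, cached: &[u8], hash: H) -> bool
where
    H: Fn(&[u8]) -> [u8; 32],
{
    cached.len() as u64 == entry.size && hash(cached) == entry.blake3
}

/// Return the entry's bytes from the data portion only if they hash correctly.
pub fn fetch_verified<'a, H>(
    entry: &EntryRef,
    data_portion: &'a [u8],
    hash: H,
) -> Result<&'a [u8], FetchError>
where
    H: Fn(&[u8]) -> [u8; 32],
{
    let start = usize::try_from(entry.offset).map_err(|_| FetchError::OutOfBounds)?;
    let len = usize::try_from(entry.size).map_err(|_| FetchError::OutOfBounds)?;
    let end = start.checked_add(len).ok_or(FetchError::OutOfBounds)?;
    let bytes = data_portion.get(start..end).ok_or(FetchError::OutOfBounds)?;
    if hash(bytes) != entry.blake3 {
        return Err(FetchError::HashMismatch);
    }
    Ok(bytes)
}
```

Checking the bounds before hashing means a corrupted or hostile offset results in an error rather than a panic, and the data is only handed back once its hash has been confirmed.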

Development

To run the integration tests, you'll need to have pkgar-keys in your $PATH (or the $PATH of the test script). Clone the repo from https://gitlab.redox-os.org/MggMuggins/pkgar-keys and run cargo install --path . from inside the clone. Then use test.sh to run the integration tests.