Crate cpt

§Compact Pro

Rust library to read compressed files created by Compact Pro

§Usage

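This crate offers the high-level functions probe and verify to check whether a reader holds a (valid) CPT archive. The Archive type opens an archive, iterates over its entries, and reads the data of individual forks: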

use cpt::Fork;
use std::io::Read;

let mut archive = cpt::Archive::open("sample-files/FRED.CPT")
                               .unwrap();

assert_eq!(archive.iter().unwrap().count(), 7);

for entry in archive.iter().unwrap() {
  // Skip entries that failed to parse and entries that are not files
  let Ok(entry) = entry else { continue };
  let Some(file) = entry.as_file() else { continue };

  if file.name == "Freddie Manual" {
    let mut reader = archive.open_entry(&entry, Fork::Data).unwrap();
    let mut buffer = vec![0u8; file.data_uncompressed_size as usize];
    reader.read_exact(&mut buffer).unwrap();

    // … do something with extracted data …
  }
}

§Limitations

The crate has not been tested with multi-volume archives or encrypted entries.

§File Format

Note: This section was copied from https://code.google.com/archive/p/theunarchiver/wikis/CompactProSpecs.wiki for proper table rendering

§Archive header

The only magic number in a Compact Pro file is its first byte, which is always 0x01. All values larger than one byte are stored in big-endian order, as expected from a Mac format.

Offset  Size  Meaning
0       1     File identifier, 0x01
1       1     Volume number (meaning not entirely clear, is 0x01 for single-volume archives)
2       2     Cross-volume magic number (meaning not entirely clear)
4       4     Offset to file headers from beginning of file
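
As an illustration of the table above, the first eight bytes could be decoded as in the following sketch. The struct and function names are hypothetical and are not part of this crate's API; the code only assumes the layout described in the table.

```rust
/// First part of the archive header.
/// Hypothetical type for illustration, not the crate's own structure.
#[derive(Debug, PartialEq)]
struct ArchiveHeaderStart {
    volume: u8,
    cross_volume_magic: u16,
    headers_offset: u32,
}

/// Decodes the first 8 bytes of an archive.
/// Returns None if the buffer is too short or the identifier byte is not 0x01.
fn parse_header_start(bytes: &[u8]) -> Option<ArchiveHeaderStart> {
    if bytes.len() < 8 || bytes[0] != 0x01 {
        return None;
    }
    Some(ArchiveHeaderStart {
        volume: bytes[1],
        // Multi-byte fields are big-endian
        cross_volume_magic: u16::from_be_bytes([bytes[2], bytes[3]]),
        headers_offset: u32::from_be_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]),
    })
}
```

Because a single byte is a very weak signature, a check like this can only rule files out; it cannot confirm that a file really is a Compact Pro archive.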

The offset points to a second part of the archive header, followed by file and directory headers. The second part is:

Offset  Size  Meaning
0       4     CRC-32 of the header
4       2     Total number of files and directories
6       1     Comment length
7       N     Comment with the length indicated above

Next come as many file or directory headers as indicated by the header.
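
A minimal sketch of reading this second part, assuming the whole archive is available as a byte slice. The names are hypothetical, the CRC is only extracted and not verified, and the comment is kept as raw bytes because its text encoding is not stated here.

```rust
/// Second part of the archive header.
/// Hypothetical type for illustration, not the crate's own structure.
#[derive(Debug, PartialEq)]
struct ArchiveHeaderEnd {
    header_crc: u32,
    entry_count: u16,
    comment: Vec<u8>,
    /// Position in the input where the first file or directory header starts
    entries_start: usize,
}

/// Decodes the second header part located at `offset` within `bytes`.
/// Returns None if the input is too short for the fixed fields or the comment.
fn parse_header_end(bytes: &[u8], offset: usize) -> Option<ArchiveHeaderEnd> {
    let fixed = bytes.get(offset..offset.checked_add(7)?)?;
    let header_crc = u32::from_be_bytes([fixed[0], fixed[1], fixed[2], fixed[3]]);
    let entry_count = u16::from_be_bytes([fixed[4], fixed[5]]);
    let comment_len = usize::from(fixed[6]);

    // The variable-length comment directly follows the 7 fixed bytes
    let comment_start = offset + 7;
    let entries_start = comment_start + comment_len;
    let comment = bytes.get(comment_start..entries_start)?.to_vec();

    Some(ArchiveHeaderEnd { header_crc, entry_count, comment, entries_start })
}
```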

§File header

Offset  Size  Meaning
0       1     Name length and type flag. The highest bit of this field is zero for a file entry.
1       N     File name with the length indicated above
N+1     1     Volume number (meaning not entirely clear)
N+2     4     Offset to file data from beginning of file
N+6     4     Mac OS file type
N+10    4     Mac OS file creator
N+14    4     Creation date in classic Mac OS format (seconds since 1904)
N+18    4     Modification date in classic Mac OS format (seconds since 1904)
N+22    2     Mac OS Finder flags
N+24    4     Uncompressed file data CRC. This is calculated for the concatenation of the resource and data forks.
N+28    2     File flags (see below)
N+30    4     Resource fork uncompressed length
N+34    4     Data fork uncompressed length
N+38    4     Resource fork compressed length
N+42    4     Data fork compressed length
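
The layout above adds up to N+46 bytes per file header. The sketch below decodes one header from a byte slice. It is an illustration with hypothetical names rather than the crate's implementation, and it assumes that the name length occupies the lower seven bits of the first byte.

```rust
/// One decoded file header.
/// Hypothetical type for illustration, not the crate's own structure.
#[derive(Debug, PartialEq)]
struct FileHeader {
    name: Vec<u8>,
    volume: u8,
    data_offset: u32,
    file_type: [u8; 4],
    creator: [u8; 4],
    created: u32,
    modified: u32,
    finder_flags: u16,
    crc: u32,
    flags: u16,
    resource_uncompressed: u32,
    data_uncompressed: u32,
    resource_compressed: u32,
    data_compressed: u32,
}

fn be16(b: &[u8], at: usize) -> u16 {
    u16::from_be_bytes([b[at], b[at + 1]])
}

fn be32(b: &[u8], at: usize) -> u32 {
    u32::from_be_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]])
}

/// Decodes a file header at the start of `bytes`.
/// Returns the header and its total length (N + 46), or None for a
/// directory entry or truncated input.
fn parse_file_header(bytes: &[u8]) -> Option<(FileHeader, usize)> {
    let first = *bytes.first()?;
    if first & 0x80 != 0 {
        return None; // highest bit set: this is a directory entry
    }
    // Assumption: the remaining seven bits hold the name length
    let n = usize::from(first & 0x7F);
    let total = 1 + n + 45;
    let name = bytes.get(1..1 + n)?.to_vec();
    // `f` starts at offset N+1, so every index below is (table offset - N - 1)
    let f = bytes.get(1 + n..total)?;
    let header = FileHeader {
        name,
        volume: f[0],
        data_offset: be32(f, 1),
        file_type: [f[5], f[6], f[7], f[8]],
        creator: [f[9], f[10], f[11], f[12]],
        created: be32(f, 13),
        modified: be32(f, 17),
        finder_flags: be16(f, 21),
        crc: be32(f, 23),
        flags: be16(f, 27),
        resource_uncompressed: be32(f, 29),
        data_uncompressed: be32(f, 33),
        resource_compressed: be32(f, 37),
        data_compressed: be32(f, 41),
    };
    Some((header, total))
}
```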

The flags field contains at least the following bits:

Bit  Meaning
0    File is encrypted (the algorithm for this is unknown)
1    Resource fork uses LZH compression
2    Data fork uses LZH compression
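
The flag bits could be interpreted as in the following sketch. FileFlags and Compression are hypothetical names, not the crate's Flags type, and the code assumes that bit 0 is the least significant bit. A fork whose LZH bit is clear is treated as RLE-only, following the Algorithms section.

```rust
/// How one fork of a file is compressed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compression {
    RleOnly,
    RleThenLzh,
}

/// Hypothetical wrapper around the 16-bit file flags field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileFlags(u16);

impl FileFlags {
    // Assumption: bit 0 is the least significant bit
    const ENCRYPTED: u16 = 1 << 0;
    const RESOURCE_LZH: u16 = 1 << 1;
    const DATA_LZH: u16 = 1 << 2;

    fn is_encrypted(self) -> bool {
        (self.0 & Self::ENCRYPTED) != 0
    }

    /// Maps one LZH bit to the compression scheme of the matching fork.
    fn compression_for(self, mask: u16) -> Compression {
        if (self.0 & mask) != 0 {
            Compression::RleThenLzh
        } else {
            Compression::RleOnly
        }
    }

    fn resource_fork(self) -> Compression {
        self.compression_for(Self::RESOURCE_LZH)
    }

    fn data_fork(self) -> Compression {
        self.compression_for(Self::DATA_LZH)
    }
}
```
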
§Directory header

Offset  Size  Meaning
0       1     Name length and type flag. The highest bit of this field is one for a directory entry.
1       N     Directory name with the length indicated above
N+1     2     Total number of files and directories in this directory, counting files in sub-directories

The directory structure can be entirely inferred from the entry count field.
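
To show how the tree can be rebuilt from the counts alone, here is a sketch that turns a flat list of entries into a nested structure. The types are hypothetical, and the code assumes that a directory's count also includes the entries for nested directories themselves, not only the files inside them.

```rust
/// An entry as it appears in the flat header list (hypothetical type).
#[derive(Debug, PartialEq)]
enum FlatEntry {
    File(String),
    /// Directory name and the number of entries that belong to it
    Dir(String, u16),
}

/// An entry in the reconstructed tree (hypothetical type).
#[derive(Debug, PartialEq)]
enum Node {
    File(String),
    Dir(String, Vec<Node>),
}

/// Rebuilds the tree; returns None if a count runs past its parent.
fn build_tree(entries: &[FlatEntry]) -> Option<Vec<Node>> {
    let mut pos = 0;
    take(entries, &mut pos, entries.len())
}

/// Consumes exactly `count` flat entries starting at `*pos`.
fn take(entries: &[FlatEntry], pos: &mut usize, count: usize) -> Option<Vec<Node>> {
    let end = pos.checked_add(count)?;
    if end > entries.len() {
        return None;
    }
    let mut nodes = Vec::new();
    while *pos < end {
        match &entries[*pos] {
            FlatEntry::File(name) => {
                *pos += 1;
                nodes.push(Node::File(name.clone()));
            }
            FlatEntry::Dir(name, n) => {
                *pos += 1;
                let n = usize::from(*n);
                // A directory's contents must fit inside its parent
                if *pos + n > end {
                    return None;
                }
                let children = take(entries, pos, n)?;
                nodes.push(Node::Dir(name.clone(), children));
            }
        }
    }
    Some(nodes)
}
```

The recursion depth equals the directory nesting depth, so code handling untrusted archives may want to cap it.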

§Algorithms

Files are compressed using either just an RLE algorithm, or RLE followed by an LZSS and Huffman based algorithm, depending on bits 1 and 2 of the flags as described above. The resource and data forks are compressed separately even though the CRC is calculated for their combination. The resource fork appears first in the file, followed by the data fork. Thus the offset to the resource fork is the same as the “Offset to file data” field, while the offset to the data fork is “Offset to file data” plus “Resource fork compressed length”.
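
The offset arithmetic from the paragraph above, written out as a small sketch. The function name and parameters are hypothetical; the inputs correspond to the “Offset to file data”, “Resource fork compressed length” and “Data fork compressed length” fields of the file header.

```rust
use std::ops::Range;

/// Returns the byte ranges (resource fork, data fork) of the compressed
/// data within the archive. The sums are computed in u64 so that they
/// cannot overflow the 32-bit header fields.
fn fork_ranges(
    file_data_offset: u32,
    resource_compressed_len: u32,
    data_compressed_len: u32,
) -> (Range<u64>, Range<u64>) {
    // The resource fork comes first, directly at the file data offset
    let resource_start = u64::from(file_data_offset);
    // The data fork follows immediately after the compressed resource fork
    let data_start = resource_start + u64::from(resource_compressed_len);
    let data_end = data_start + u64::from(data_compressed_len);
    (resource_start..data_start, data_start..data_end)
}
```

For example, a file data offset of 100 with compressed lengths of 30 (resource) and 50 (data) yields the ranges 100..130 and 130..180.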

The algorithms are described on these pages:

  • Rle8182Algorithm - An RLE encoding
  • CompactProLzhAlgorithm - An LZSS+Huffman encoding

§Testing

The test “suite” uses sample files from https://sembiance.com/fileFormatSamples/archive/compactPro/, which can be downloaded using wget:

wget --directory-prefix=sample-files --no-clobber --recursive --no-parent --no-host-directories --cut-dirs 3 https://sembiance.com/fileFormatSamples/archive/compactPro/

After the files have been downloaded, run cargo test as usual:

cargo test

Re-exports§

pub use error::Error;
pub use structs::Entry;
pub use structs::Flags;
pub use macintosh_utils::chrono;
pub use macintosh_utils::fourcc;

Modules§

error
structs
On-disk structures of CPT files

Macros§

fourcc

Structs§

Archive
A structure representing a CPT archive
FourCC
A four-character code

Enums§

Fork
Identifies either of the two forks that can be used to store data on classic Macintosh file systems

Functions§

probe
Detect if the given reader is a CPT archive
verify
Verify the structure and checksums of the given reader