A Rust library for handling Nintendo’s N64-era vpk0 data compression
vpk0 is a data compression scheme built on LZSS with Huffman coding for the
dictionary offsets and match lengths.
It was used at HAL Laboratory for three of their N64 games and later
for encoding the data on GBA e-Reader cards.
This crate provides decoding and encoding for vpk0 files.
As it is an old scheme, the decoding and encoding are designed to match what
Nintendo did in the 1990s; the crate does not focus on compression ratio or speed.
§Usage
The decode() and encode() functions provide quick ways to deal with
vpk0 data.
use vpk0::{encode, decode};
use std::io::Cursor;
let data: &[u8] = b"ababacdcdeaba"; // sample input; any byte slice works
let raw = Cursor::new(data);
let compressed = encode(raw).unwrap();
let decompressed = decode(Cursor::new(&compressed)).unwrap();
assert_eq!(&data, &decompressed);
For more control, you can use Decoder or Encoder:
use vpk0::Encoder;
Encoder::for_bytes(b"ababacdcdeaba")
.two_sample()
.encode_to_writer(std::io::stdout())
.unwrap();
§vpk0 Background
The vpk0 format, named for its magic bytes, is thought to have been developed
at HAL Laboratory in the late 1990s. Three of their N64 games use the compression scheme:
- Super Smash Bros.
- Pokémon Snap
- Shigesato Itoi’s No. 1 Bass Fishing: Definitive Edition
The format next appeared in the early 2000s as the compression used in the Nintendo e-Reader
for the GBA. This is where the format first received attention from the internet at large.
Tim Schuerewegen’s nevpk and Caitsith’s NVPK Tool and NEDEC Make were
open source implementations of vpk0 that came from reverse engineering the e-Reader.
This crate builds on their work to provide matching compression for HAL’s N64 titles.
§Format Overview
vpk0 is based on two fundamental encoding algorithms: LZSS and Huffman coding.
The two techniques had been circulating on Japanese BBSes since the late 1980s,
and together they form the backbone of many modern encoding schemes, such as Deflate.
The vpk0 format is comparatively simple: it is a variable-length LZSS.
The input data is compressed by a standard LZSS implementation.
But instead of having fixed bit sizes for the dictionary offset and length, the sizes are variable.
The variable bit sizes are then encoded as a Huffman code, with the necessary
Huffman tree prepended to the encoded data.
For more info, see the documentation for the format module.
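As a rough illustration of the idea above, the sketch below reads a value whose bit width is itself selected by a prefix (Huffman-style) code. It is not the actual vpk0 bitstream layout: the `BitReader` and `WidthTree` types, the `read_value` function, and the MSB-first bit order are all assumptions made for this example. See the format module for the real details.

```rust
/// Reads bits most-significant-bit first from a byte slice.
/// (The bit order is an assumption made for this sketch.)
struct BitReader<'a> {
    data: &'a [u8],
    pos: usize, // current position, in bits
}

impl<'a> BitReader<'a> {
    fn new(data: &'a [u8]) -> Self {
        BitReader { data, pos: 0 }
    }

    /// Read `n` bits (n <= 32) as an unsigned value; `None` if the input runs out.
    fn read(&mut self, n: u32) -> Option<u32> {
        let mut value = 0u32;
        for _ in 0..n {
            let byte = *self.data.get(self.pos / 8)?;
            let bit = (byte >> (7 - self.pos % 8)) & 1;
            value = (value << 1) | u32::from(bit);
            self.pos += 1;
        }
        Some(value)
    }
}

/// Hypothetical prefix-code tree: each leaf stores a bit width.
enum WidthTree {
    Leaf(u32),
    Node(Box<WidthTree>, Box<WidthTree>),
}

/// Walk the tree one bit at a time to find a width,
/// then read that many bits as the offset or length value.
fn read_value(tree: &WidthTree, bits: &mut BitReader<'_>) -> Option<u32> {
    let mut node = tree;
    let width = loop {
        match node {
            WidthTree::Leaf(width) => break *width,
            WidthTree::Node(zero, one) => {
                node = if bits.read(1)? == 0 { &**zero } else { &**one };
            }
        }
    };
    bits.read(width)
}
```

With a two-leaf tree holding widths 2 and 5, a short code bit followed by a 2-bit value costs 3 bits, while a longer value costs 6. A fixed-width LZSS would spend the larger size on every offset or length.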
§Implementation Details
This implementation is designed to be a byte-perfect match of the encoder used for Super Smash Bros. As of March 2021, the LZSS encoder is byte-matching, but the Huffman compression is not.
The matching LZSS encoding scheme is: after a found match, look ahead at the next byte to see if there is a longer match. Continue checking the next byte until a smaller or no match is found.
The encoder in this crate checks at most the next ten bytes,
as that was the maximum number necessary to match all 500 vpk0 encoded files in SSB64.
In the future, this parameter may become another option for Encoder.
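The lookahead rule described above can be sketched as follows. This is a simplified illustration, not the crate's encoder: the brute-force `longest_match` search, the `Token` type, the `MIN_MATCH` value of 2, and the `expand` helper are assumptions made for the example. Only the ten-byte lookahead limit comes from the text.

```rust
const MIN_MATCH: usize = 2; // assumed minimum match length for this sketch
const MAX_LOOKAHEAD: usize = 10; // lookahead limit described in the text

#[derive(Debug, PartialEq)]
enum Token {
    Literal(u8),
    Match { offset: usize, length: usize },
}

/// Brute-force search for the longest earlier match of `input[pos..]`.
/// Returns `(offset, length)`; overlapping matches are allowed.
fn longest_match(input: &[u8], pos: usize) -> Option<(usize, usize)> {
    let mut best: Option<(usize, usize)> = None;
    for start in 0..pos {
        let len = input[start..]
            .iter()
            .zip(&input[pos..])
            .take_while(|(a, b)| a == b)
            .count();
        if len >= MIN_MATCH && best.map_or(true, |(_, l)| len > l) {
            best = Some((pos - start, len));
        }
    }
    best
}

/// After finding a match, check whether starting one byte later gives a
/// longer one; if so, emit a literal and keep looking, up to MAX_LOOKAHEAD times.
fn lazy_encode(input: &[u8]) -> Vec<Token> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let mut best = match longest_match(input, pos) {
            Some(m) => m,
            None => {
                out.push(Token::Literal(input[pos]));
                pos += 1;
                continue;
            }
        };
        let mut steps = 0;
        while steps < MAX_LOOKAHEAD {
            match longest_match(input, pos + 1) {
                Some(next) if next.1 > best.1 => {
                    out.push(Token::Literal(input[pos]));
                    pos += 1;
                    best = next;
                    steps += 1;
                }
                _ => break, // smaller, equal, or no match: stop looking
            }
        }
        out.push(Token::Match { offset: best.0, length: best.1 });
        pos += best.1;
    }
    out
}

/// Expand tokens back into bytes, to check the round trip.
fn expand(tokens: &[Token]) -> Vec<u8> {
    let mut out = Vec::new();
    for token in tokens {
        match *token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { offset, length } => {
                for _ in 0..length {
                    let byte = out[out.len() - offset];
                    out.push(byte);
                }
            }
        }
    }
    out
}
```

For the input `b"abbcdeabcde"`, a greedy encoder would take the two-byte match "ab" at the second "a". The lookahead instead emits "a" as a literal and takes the four-byte match "bcde" that starts one byte later.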
§Advanced Usages
§Getting info from a vpk0 file
use vpk0::vpk_info;
let (header, trees) = vpk_info(vpkfile).unwrap();
println!("Original size: {} bytes", header.size);
println!("VPK encoded with method {}", header.method);
println!("Offsets: {} || Lengths: {}", trees.offsets, trees.lengths);
§Encode like a standard LZSS
use vpk0::{Encoder, LzssSettings};
// use fixed length compression by setting the offset to 10 and the length to 6.
let compressed = Encoder::for_bytes(b"I am Sam. Sam I am.")
.one_sample()
.with_lzss_settings(LzssSettings::new(10, 6, 2))
.with_offsets("10")
.with_lengths("6")
.encode_to_vec();
Modules§
Structs§
- Decoder
- Specify the decoding settings, such as logging, input, and output.
- Encoder
- Specify the encoding settings, such as window size, logging, input, and output
- LzssSettings
- Configure the LZSS encoding that underlies vpk0 compression
Enums§
- LzssBackend
- The algorithm used to find matches when encoding a vpk0 file
Functions§
- decode
- Decompress a Reader of vpk0 data into a Vec<u8>
- decode_bytes
- Decompress a byte slice of vpk0 data into a Vec<u8>
- encode
- Compress a Reader into a vpk0 Vec<u8>
- encode_bytes
- Compress a &[u8] into a vpk0 Vec<u8>
- vpk_info
- Extract the VpkHeader and TreeInfo from vpk0 data