paks 0.1.2

A light-weight encrypted archive inspired by the Quake PAK format.
/*!
PAKS file
=========

The PAKS file format is a light-weight encrypted archive inspired by the Quake PAK format.

Getting started
---------------

PAKS files can be read and edited on disk through standard file IO with [`FileReader`] and [`FileEditor`], or in memory with [`MemoryReader`] and [`MemoryEditor`].

### Creating PAKS files

Using a [`MemoryEditor`] instance:

```
// Create a new memory editor and choose your encryption keys
let ref key = paks::Key::default();
let mut editor = paks::MemoryEditor::new();

// Add content to the PAKS file
let content = include_bytes!("../tests/data/example.txt");
editor.create_file(b"foo/example", content, key);

// Finish the PAKS file and write to disk
# /* Don't actually write the file while running tests...
editor.write("myfile.paks", key).unwrap();
# */
# let (blocks, _) = editor.finish(key);
```

Using a [`FileEditor`] instance:

```no_run
# // Don't actually write files while running tests...
// Create a new file editor and choose your encryption keys
let ref key = paks::Key::default();
let mut editor = paks::FileEditor::create_new("myfile.paks", key).unwrap();

// Add content to the PAKS file
let content = include_bytes!("../tests/data/example.txt");
editor.create_file(b"foo/example", content, key);

// Finish writing the PAKS file
editor.finish(key).unwrap();

// If the editor is dropped without calling finish
// any changes since creating the editor are lost
```

Consider using the `pakscmd` command-line application for bundling your assets separately.

### Reading PAKS files

Using a [`FileReader`] instance:

```
# #[cfg(not(miri))] {
// Construct the key and simply open the file.
let ref key = paks::Key::default();
let reader = paks::FileReader::open("tests/data/example.paks", key).unwrap();

// Lookup the file descriptor and read its data.
let data = reader.read(b"foo/example", key).unwrap();

// If the PAKS file was tampered with without knowing the key,
// reading the file will fail with an error.
# }
```

Using a bundled archive with [`BundleReader`] and [`static_bundle!`]:

```
paks::static_bundle!(EXAMPLE_PAKS = "../tests/data/example.paks");
let key = paks::Key::default();
let reader = paks::BundleReader::open(&EXAMPLE_PAKS, key).unwrap();

// The bundled reader keeps the directory encrypted and only decrypts
// descriptors lazily while traversing paths.
let data = reader.read(b"foo/example", reader.key()).unwrap();
```

File Format
-----------

The file begins with [`Header`], which contains the cryptographic nonce and
MAC required to decrypt the embedded [`InfoHeader`].
The smallest addressable unit in the format is a [`Block`]. The entire file can
be interpreted as a contiguous array of these blocks.
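
As a rough illustration of this block view, the sketch below pads a byte buffer with zeros and reinterprets it as 16-byte blocks. It is a standalone example, not the crate's own code: `BLOCK_SIZE`, `blocks_needed` and `bytes_to_blocks` are names made up here, and the use of native endianness is an assumption.

```rust
// Standalone sketch: mirrors the `[u64; 2]` block layout described above.
type Block = [u64; 2];
const BLOCK_SIZE: usize = std::mem::size_of::<Block>();

// Number of blocks needed to hold `byte_size` bytes, rounding up.
fn blocks_needed(byte_size: usize) -> usize {
	(byte_size + BLOCK_SIZE - 1) / BLOCK_SIZE
}

// Copies bytes into zero-padded blocks (native endianness is assumed here).
fn bytes_to_blocks(bytes: &[u8]) -> Vec<Block> {
	bytes
		.chunks(BLOCK_SIZE)
		.map(|chunk| {
			// Pad the final, possibly short, chunk with zeros.
			let mut buf = [0u8; BLOCK_SIZE];
			buf[..chunk.len()].copy_from_slice(chunk);
			let mut lo = [0u8; 8];
			let mut hi = [0u8; 8];
			lo.copy_from_slice(&buf[..8]);
			hi.copy_from_slice(&buf[8..]);
			[u64::from_ne_bytes(lo), u64::from_ne_bytes(hi)]
		})
		.collect()
}
```

With these helpers a 20-byte payload occupies two blocks, the second of which is mostly padding.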

[`Section`] objects use 32-bit address and length fields that refer to blocks
rather than byte offsets. This design imposes a maximum file size of 64 GiB,
while individual files are limited to 4 GiB each.
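
The two limits follow directly from the field widths. The constants below are a worked version of that arithmetic; their names are illustrative and do not exist in the crate.

```rust
// Standalone sketch of the size limits implied by 32-bit fields.
const GIB: u64 = 1 << 30;

// Size of one `[u64; 2]` block in bytes.
const BLOCK_SIZE: u64 = 16;

// Largest archive: 2^32 addressable blocks of 16 bytes each.
const MAX_ARCHIVE_BYTES: u64 = (1u64 << 32) * BLOCK_SIZE;

// Largest single file: its byte size must fit in a 32-bit field.
const MAX_FILE_BYTES: u64 = u32::MAX as u64;
```

`MAX_ARCHIVE_BYTES` works out to exactly 64 GiB, and `MAX_FILE_BYTES` is one byte short of 4 GiB.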

#### Metadata and Directory

The [`InfoHeader`] includes a [`Section`] that points to the [`Directory`].

The directory represents the file hierarchy using a lightweight
[TLV (Type-Length-Value) structure](https://en.wikipedia.org/wiki/Type-length-value).
It is expected to be located at the end of the PAKS file.
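
To give a feel for how a flat TLV list can encode a hierarchy, here is a simplified lookup. It is only a sketch under stated assumptions: `Entry` is a hypothetical stand-in for the crate's descriptor, entries are assumed to be stored in pre-order, and a directory's `len` is assumed to count every entry nested below it. The real directory may differ on all three points.

```rust
// Hypothetical, simplified entry; not the crate's `Descriptor`.
struct Entry {
	name: &'static str,
	is_dir: bool,
	// For directories: number of entries nested below this one
	// (assumed in this sketch to include all descendants).
	len: usize,
}

// Looks up a `/`-separated path in a flat, pre-order list of entries.
fn find<'a>(entries: &'a [Entry], path: &str) -> Option<&'a Entry> {
	let mut scope = entries;
	let mut parts = path.split('/').peekable();
	while let Some(part) = parts.next() {
		// Scan the siblings in the current scope, skipping over subtrees.
		let mut i = 0;
		let mut found = None;
		while i < scope.len() {
			let entry = &scope[i];
			let skip = if entry.is_dir { entry.len } else { 0 };
			if entry.name == part {
				found = Some((i, skip));
				break;
			}
			i += 1 + skip;
		}
		let (i, skip) = found?;
		if parts.peek().is_none() {
			return Some(&scope[i]);
		}
		if !scope[i].is_dir {
			return None;
		}
		// Descend: the children are the `skip` entries that follow.
		scope = scope.get(i + 1..i + 1 + skip)?;
	}
	None
}
```

Because each directory records how many entries it spans, a lookup can jump over whole subtrees instead of visiting every entry.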

#### Data Layout

File data is stored between the header and the directory, with no enforced
ordering.

When files are removed, their data is not immediately reclaimed, leaving gaps
in the file. These gaps can be eliminated through an explicit garbage
collection process, which rewrites the PAKS file to compact unused space.
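
The compaction idea can be sketched as sliding every live span down over the gaps. This is an illustration only: `Span` and `compact` are hypothetical names, spans are assumed not to overlap, and the sketch ignores encryption and the directory update that a real rewrite would also need.

```rust
// Hypothetical stand-in for the offset/size pair of a section, in blocks.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
	offset: u32,
	size: u32,
}

// Moves every live span down so the data starts at `first_block` with no gaps.
// Returns the index one past the last live block.
fn compact(blocks: &mut Vec<[u64; 2]>, live: &mut [Span], first_block: u32) -> u32 {
	// Process spans in file order so a move never overwrites unread data.
	live.sort_by_key(|span| span.offset);
	let mut next = first_block;
	for span in live.iter_mut() {
		let src = span.offset as usize..(span.offset + span.size) as usize;
		blocks.copy_within(src, next as usize);
		span.offset = next;
		next += span.size;
	}
	// Everything past the last live block is reclaimed.
	blocks.truncate(next as usize);
	next
}
```

After the call, the spans in `live` hold their new offsets, which a caller would then write back into the directory.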

#### Encryption and Authentication

Encryption (Speck128/128) and authentication (CBC-MAC) are mandatory and not
configurable.

These operations are applied on a per-file basis, meaning the entire PAKS file
does not need to be processed or verified upfront.
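
For readers unfamiliar with the primitives, the sketch below shows textbook Speck128/128 on a single block and a plain CBC-MAC built on top of it. It is not the crate's implementation: how PAKS combines the key, the nonce and the section data is not shown here, and all function names are made up for this example.

```rust
type Block = [u64; 2];

// One Speck round on the word pair (x, y) with round key k.
fn round(x: &mut u64, y: &mut u64, k: u64) {
	*x = x.rotate_right(8).wrapping_add(*y) ^ k;
	*y = y.rotate_left(3) ^ *x;
}

// Inverse of `round`.
fn unround(x: &mut u64, y: &mut u64, k: u64) {
	*y = (*y ^ *x).rotate_right(3);
	*x = (*x ^ k).wrapping_sub(*y).rotate_left(8);
}

// Expands a 128-bit key into the 32 round keys of Speck128/128.
fn round_keys(key: Block) -> [u64; 32] {
	let (mut b, mut a) = (key[0], key[1]);
	let mut keys = [0u64; 32];
	for (i, slot) in keys.iter_mut().enumerate() {
		*slot = b;
		round(&mut a, &mut b, i as u64);
	}
	keys
}

// Encrypts a single block.
fn encrypt(block: Block, keys: &[u64; 32]) -> Block {
	let (mut y, mut x) = (block[0], block[1]);
	for &k in keys.iter() {
		round(&mut x, &mut y, k);
	}
	[y, x]
}

// Decrypts a single block by undoing the rounds in reverse order.
fn decrypt(block: Block, keys: &[u64; 32]) -> Block {
	let (mut y, mut x) = (block[0], block[1]);
	for &k in keys.iter().rev() {
		unround(&mut x, &mut y, k);
	}
	[y, x]
}

// Plain CBC-MAC: chain each block into the next encryption, keep the last output.
fn cbc_mac(blocks: &[Block], keys: &[u64; 32]) -> Block {
	let mut mac = [0u64; 2];
	for block in blocks {
		mac = encrypt([mac[0] ^ block[0], mac[1] ^ block[1]], keys);
	}
	mac
}
```

Because the MAC depends on every block of the data it covers, flipping a single bit in one file's section changes that file's MAC while leaving the other files verifiable on their own.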
*/

use std::{cmp, fmt, mem, num, ops, slice, str};
use std::io::ErrorKind;

use dataview::Pod;

// Must be a macro, inline function does not work
// #[cfg(debug_assertions)]
// macro_rules! unsafe_assume {
// 	($cond:expr) => {
// 		assert!($cond);
// 	};
// }
// #[cfg(not(debug_assertions))]
// macro_rules! unsafe_assume {
// 	($cond:expr) => {
// 		if !$cond {
// 			unsafe { std::hint::unreachable_unchecked() }
// 		}
// 	};
// }

mod cipher;
mod crypt;

mod dir;
pub use self::dir::TreeArt;

mod directory;
pub use self::directory::*;

mod file_io;
pub use self::file_io::*;

mod memory;
pub use self::memory::*;

/// Block primitive.
///
/// A block is the smallest addressable unit of which the PAKS file is made.
/// It defines the size and alignment of the underlying storage.
pub type Block = [u64; 2];

/// Key type.
///
/// All PAKS files are encrypted with the Speck128/128 cipher.
pub type Key = [u64; 2];

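/// Parses a hexadecimal string into a [`Key`].
///
/// The string is read as a single 128-bit number: the low 64 bits become
/// the first element of the key and the high 64 bits the second.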
pub fn parse_key(s: &str) -> Result<Key, num::ParseIntError> {
	u128::from_str_radix(s, 16).map(|val| [(val & 0xffffffffffffffff) as u64, (val >> 64) as u64])
}

const BLOCK_SIZE: usize = mem::size_of::<Block>();
// const KEY_SIZE: usize = mem::size_of::<Key>();

/// Section object.
///
/// A section object defines a location in the PAKS file and its cryptographic nonce and MAC.
#[derive(Copy, Clone, Default, Eq, PartialEq, Hash)]
#[repr(C)]
pub struct Section {
	/// Offset in blocks to the start of the section.
	pub offset: u32,
	/// Length in blocks of the section.
	pub size: u32,
	/// Cryptographic nonce used for this section.
	pub nonce: Block,
	/// Cryptographic MAC used to authenticate this section.
	pub mac: Block,
}

impl Section {
	#[inline]
	fn range_usize(&self) -> ops::Range<usize> {
		self.offset as usize..(self.offset.wrapping_add(self.size)) as usize
	}
}

impl fmt::Debug for Section {
	fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
		f.debug_struct("Section")
			.field("offset", &self.offset)
			.field("size", &self.size)
			.field("nonce", &format_args!("[{:#x}, {:#x}]", self.nonce[0], self.nonce[1]))
			.field("mac", &format_args!("[{:#x}, {:#x}]", self.mac[0], self.mac[1]))
			.finish()
	}
}

unsafe impl Pod for Section {}

/// Converts a size in bytes to the number of blocks needed to hold it, rounding up.
fn bytes2blocks(byte_size: u32) -> u32 {
	if byte_size == 0 { 0 } else { (byte_size - 1) / BLOCK_SIZE as u32 + 1 }
}

//----------------------------------------------------------------

/// The info header.
#[derive(Copy, Clone, Default, Eq, PartialEq, Hash)]
#[repr(C)]
pub struct InfoHeader {
	/// Version info value, should be equal to [`VERSION`](Self::VERSION).
	pub version: u32,
	pub _unused: u32,
	/// The section object describing the location of the directory.
	///
	/// Special note: the section size specifies the number of [`Descriptor`]s, not the number of blocks.
	pub directory: Section,
}

impl InfoHeader {
	/// File format version number.
	///
	/// This library is endian-sensitive; reading a PAKS file on a machine
	/// with the wrong endianness will cause the version check to fail.
	pub const VERSION: u32 = u32::from_ne_bytes(*b"PAK1");
}

impl fmt::Debug for InfoHeader {
	fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
		f.debug_struct("InfoHeader")
			.field("version", &self.version)
			.field("directory", &self.directory)
			.finish()
	}
}

/// The file header.
#[derive(Copy, Clone, Default, Debug, Eq, PartialEq, Hash)]
#[repr(C)]
pub struct Header {
	/// Cryptographic nonce used for the info header.
	pub nonce: Block,
	/// Cryptographic MAC used to authenticate the info header.
	pub mac: Block,
	/// Version information and directory section.
	///
	/// Note that this information is encrypted by design and must be decrypted before use.
	pub info: InfoHeader,
}

impl Header {
	const SECTION: Section = Section {
		offset: Header::BLOCKS_LEN as u32 - InfoHeader::BLOCKS_LEN as u32,
		size: InfoHeader::BLOCKS_LEN as u32,
		nonce: [0, 0],
		mac: [0, 0],
	};
}

//----------------------------------------------------------------

/// The file or directory descriptor.
#[derive(Copy, Clone, Default, Eq, PartialEq, Hash)]
#[repr(C)]
pub struct Descriptor {
	/// The content type of the descriptor.
	///
	/// If the content type is zero this is a directory descriptor, otherwise it is a file descriptor.
	/// The interpretation of a non-zero content type is left to the user of the API.
	pub content_type: u32,
	/// The content size of the descriptor.
	///
	/// Directory descriptors define it as the number of children contained in the directory.
	/// File descriptors define it as the size of the file in bytes.
	pub content_size: u32,
	/// The section object.
	///
	/// File descriptors use it to find and decrypt its contents.
	/// It is unused for directory descriptors.
	pub section: Section,
	/// The name of the descriptor, see [`name`](Self::name).
	pub name: Name,
	/// Extra meta section object, unused for now.
	pub meta: Section,
}

impl Descriptor {
	/// Creates a new empty descriptor with the given name, content type and size.
	///
	/// The descriptor is a directory descriptor if its `content_type` is zero.
	/// Its `content_size` specifies the number of children contained in the directory.
	///
	/// The descriptor is a file descriptor if its `content_type` is non-zero.
	/// The interpretation of this non-zero type is left to the user of the API.
	/// Its `content_size` specifies the size of the file in bytes.
	#[inline]
	pub fn new(name: &[u8], content_type: u32, content_size: u32) -> Descriptor {
		Descriptor {
			content_type,
			content_size,
			name: Name::from(name),
			..Descriptor::default()
		}
	}

	/// Creates an empty file descriptor.
	#[inline]
	pub fn file(name: &[u8]) -> Descriptor {
		Descriptor::new(name, 1, 0)
	}

	/// Creates a directory descriptor with the given number of children.
	#[inline]
	pub fn dir(name: &[u8], len: u32) -> Descriptor {
		Descriptor::new(name, 0, len)
	}

	/// Gets the descriptor's file name.
	#[inline]
	pub fn name(&self) -> &[u8] {
		self.name.get()
	}

	/// Is this a directory descriptor?
	#[inline]
	pub fn is_dir(&self) -> bool {
		self.content_type == 0
	}

	/// Is this a file descriptor?
	#[inline]
	pub fn is_file(&self) -> bool {
		self.content_type != 0
	}
}

impl fmt::Debug for Descriptor {
	fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
		f.debug_struct("Descriptor")
			.field("name", &self.name)
			.field("content_type", &self.content_type)
			.field("content_size", &self.content_size)
			.field("section", &self.section)
			.finish()
	}
}

//----------------------------------------------------------------

const NAME_BUF_LEN: usize = 40;

/// The descriptor name buffer.
///
/// The length of the name is stored in the last byte of the buffer.
#[derive(Copy, Clone, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Name {
	buffer: [u8; NAME_BUF_LEN],
}

impl Default for Name {
	#[inline]
	fn default() -> Name {
		Name {
			buffer: [0u8; NAME_BUF_LEN]
		}
	}
}

impl Name {
	/// Gets the file name.
	#[inline]
	pub fn get(&self) -> &[u8] {
		let len = usize::min(self.buffer[NAME_BUF_LEN - 1] as usize, NAME_BUF_LEN - 1);
		&self.buffer[..len]
	}

	/// Sets the file name.
	///
	/// File names longer than `NAME_BUF_LEN - 1` (39) bytes are truncated,
	/// because the last byte of the buffer stores the length.
	#[inline]
	pub fn set(&mut self, name: &[u8]) {
		self.buffer = [0u8; NAME_BUF_LEN];
		let len = usize::min(name.len(), NAME_BUF_LEN - 1);
		self.buffer[NAME_BUF_LEN - 1] = len as u8;
		self.buffer[..len].copy_from_slice(&name[..len]);
	}
}

impl<'a> From<&'a [u8]> for Name {
	#[inline]
	fn from(name: &'a [u8]) -> Name {
		let mut x = Name::default();
		x.set(name);
		x
	}
}

impl fmt::Debug for Name {
	fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
		str::from_utf8(self.get()).unwrap_or("ERR").fmt(f)
	}
}

//----------------------------------------------------------------

macro_rules! impl_blocks {
	($ty:ty) => {
		// Error if sizeof $ty is not a multiple of BLOCK_SIZE
		const _: [(); 0] = [(); mem::size_of::<$ty>() % BLOCK_SIZE];

		unsafe impl Pod for $ty {}

		impl $ty {
			const BLOCKS_LEN: usize = mem::size_of::<$ty>() / BLOCK_SIZE;
		}

		impl AsRef<[Block; Self::BLOCKS_LEN]> for $ty {
			fn as_ref(&self) -> &[Block; Self::BLOCKS_LEN] {
				unsafe { &*(self as *const _ as *const _) }
			}
		}
		impl AsRef<$ty> for [Block; <$ty>::BLOCKS_LEN] {
			fn as_ref(&self) -> &$ty {
				unsafe { &*(self as *const _ as *const _) }
			}
		}
		impl AsMut<[Block; Self::BLOCKS_LEN]> for $ty {
			fn as_mut(&mut self) -> &mut [Block; Self::BLOCKS_LEN] {
				unsafe { &mut *(self as *mut _ as *mut _) }
			}
		}
		impl AsMut<$ty> for [Block; <$ty>::BLOCKS_LEN] {
			fn as_mut(&mut self) -> &mut $ty {
				unsafe { &mut *(self as *mut _ as *mut _) }
			}
		}
		impl From<[Block; Self::BLOCKS_LEN]> for $ty {
			fn from(blocks: [Block; Self::BLOCKS_LEN]) -> $ty {
				unsafe { mem::transmute(blocks) }
			}
		}
		impl From<$ty> for [Block; <$ty>::BLOCKS_LEN] {
			fn from(header: $ty) -> [Block; <$ty>::BLOCKS_LEN] {
				unsafe { mem::transmute(header) }
			}
		}
	};
}

impl_blocks!(Header);
impl_blocks!(InfoHeader);
impl_blocks!(Descriptor);

#[test]
fn test_parse_key_examples() {
	assert_eq!(parse_key("0").unwrap(), [0, 0]);
	assert_eq!(parse_key("2a").unwrap(), [42, 0]);
	assert_eq!(parse_key("112233445566778899aabbccddeeff00").unwrap(), [0x99aabbccddeeff00, 0x1122334455667788]);
	assert!(parse_key("not-hex").is_err());
}

#[test]
fn test_print_sizes() {
	fn print_size<T>(name: &str) {
		println!("sizeof={:#x} (struct {})", std::mem::size_of::<T>(), name);
	}
	print_size::<Header>("Header");
	print_size::<InfoHeader>("InfoHeader");
	print_size::<Descriptor>("Descriptor");
	print_size::<Section>("Section");
	print_size::<Name>("Name");
}