openpack 0.2.2

Safe archive-reader for ZIP-derived formats (ZIP, CRX, JAR, APK, IPA) with BOM-safe checks.
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
# OpenPack Security Audit Report

**Crate:** `openpack` v0.1.1  
**Scope:** Archive reading without extraction  
**Audit Date:** 2026-03-26  
**Classification:** Security-Critical

---

## Executive Summary

OpenPack is a safe archive reader for ZIP-derived formats (ZIP, JAR, APK, IPA, CRX) with mandatory guardrails against Zip Slip, zip bombs, and resource exhaustion. This audit examines 5 security-critical areas with findings ranging from **LOW** to **MEDIUM** severity.

| Check | Status | Severity |
|-------|--------|----------|
| Zip Slip (percent-encoded) | ⚠️ PARTIAL | MEDIUM |
| Bomb Detection | ⚠️ PARTIAL | MEDIUM |
| CRX Integer Overflow | ✅ PASS | - |
| APK/JAR/IPA Support | ✅ REAL (ZIP-based) | - |
| Bypass Attack Surface | ⚠️ MULTIPLE VECTORS | MEDIUM |

---

## 1. Zip Slip Prevention: Percent-Encoded Traversal

### Current Implementation (`src/security.rs`)

```rust
fn fully_percent_decode(value: &str) -> String {
    let mut current = value.to_string();
    for _ in 0..4 {                          // Iterative decoding, max 4 rounds
        let decoded = percent_decode_str(&current).decode_utf8_lossy().into_owned();
        if decoded == current { break; }
        current = decoded;
    }
    current
}

fn contains_parent_traversal(value: &str) -> bool {
    value.contains("../") || value.ends_with("/..") || value == ".."
}
```

### Tested Bypass Vectors

| Vector | Detection | Notes |
|--------|-----------|-------|
| `../etc/passwd` | ✅ BLOCKED | Classic traversal |
| `..%2Fetc%2Fpasswd` | ✅ BLOCKED | Single percent-encoded |
| `%2e%2e%2fetc%2fpasswd` | ✅ BLOCKED | Full hex encoding of `../` |
| `..%5Cwindows%5Csystem.ini` | ✅ BLOCKED | Backslash variant |
| `%252e%252e%252f` (double-encoded) | ✅ BLOCKED | 4-round decoding catches this |
| `%25252e...` (5+ layers) |**PASSES** | 5+ encoding layers bypass |
| `..%c0%af..%c0%af` (UTF-8 overlong) | ⚠️ DEPENDS | `decode_utf8_lossy` behavior |
| `..%2f..%2f` (lowercase) | ✅ BLOCKED | Case-insensitive matching |

### Finding: Limited Iterative Depth (MEDIUM)

**Issue:** The 4-round decoding limit creates a bypass window.

```rust
// Payload that bypasses after 5+ encodings:
// %2535253525352535253525352535253... (recursive encoding)
// Each round strips one layer of %25 -> %
```

**Attack Scenario:**
```
Layer 0: %2525252e%2525252f%2525252e%2525252fetc/passwd
Layer 1: %25252e%25252f%25252e%25252fetc/passwd
Layer 2: %252e%252f%252e%252fetc/passwd
Layer 3: %2e%2f%2e%2fetc/passwd
Layer 4: ../../etc/passwd  ← Detected here, but too late if app uses layer 0-3
```

**Recommendation:** Either:
1. Increase iterative rounds to 8+ (diminishing returns)
2. Reject any entry name containing `%` after 4 rounds (defense in depth)
3. Use canonical path normalization on the FINAL decoded path

### Finding: Character Normalization Gap (LOW)

Unicode normalization attacks are NOT addressed:

```rust
// These may bypass path validation depending on OS:
"..\u{2215}..\u{2215}etc/passwd"  // ∕ (division slash)
"..\u{FF0F}..\u{FF0F}etc/passwd"  // / (fullwidth solidus)
```

The `decode_utf8_lossy` handles invalid UTF-8 but does NOT perform Unicode normalization.

---

## 2. Bomb Detection: Nested Zips & Compression Ratio

### Current Implementation (`src/security.rs`)

```rust
pub(crate) fn check_entry_limits(
    limits: &Limits,
    entry: &ArchiveEntry,      // ← Uses METADATA from central directory
    total_uncompressed: &mut u64,
) -> Result<(), OpenPackError> {
    // ...
    let ratio = if entry.compressed_size == 0 {
        if entry.uncompressed_size == 0 { 0.0 } else { f64::INFINITY }
    } else {
        entry.uncompressed_size as f64 / entry.compressed_size as f64
    };
    
    if ratio > limits.max_compression_ratio {
        return Err(OpenPackError::LimitExceeded(...));
    }
    // ...
}
```

### Finding: Metadata-Only Validation (MEDIUM)

**Critical Issue:** Bomb detection relies on **declared sizes** from ZIP central directory, NOT actual decompressed bytes.

```rust
// Attacker can lie in central directory:
// Local header says: uncompressed_size = 100 bytes
// Central directory says: uncompressed_size = 100 bytes, compressed_size = 50 bytes (ratio 2:1)
// ACTUAL compressed data decompresses to 10GB
```

**ZIP Bomb Types and Detection Status:**

| Bomb Type | Detection | Notes |
|-----------|-----------|-------|
| High compression ratio (deflate) | ⚠️ Metadata only | Actual decompression may exceed |
| Nested ZIP (zip within zip) | ❌ NONE | No recursive analysis |
| Quine ZIP (extracts to itself) | ❌ NONE | Infinite recursion vector |
| Overlapping files | ❌ NONE | Local/CD mismatch not checked |
| ZIP64 false size | ⚠️ Partial | 64-bit sizes accepted |
| EOF bomb (trailing garbage) | ❌ NONE | Not validated |

### Finding: No Nested Archive Analysis (MEDIUM)

**Attack Vector:**
```
outer.zip (100KB, stored)
└── nested.zip (100KB compressed → 10GB uncompressed)
    └── payload.bin (10GB of zeros)
```

The outer archive passes all checks (ratio 1:1). When application extracts and opens `nested.zip`, the bomb detonates.

**Recommendation:**
1. Add optional recursive archive scanning
2. Validate actual decompressed bytes against declared size
3. Consider hash-based quine detection

### Finding: Decompression Size Validation Gap (MEDIUM)

In `read_entry()`:

```rust
pub fn read_entry(&self, name: &str) -> Result<Vec<u8>, OpenPackError> {
    // ... validation uses declared uncompressed_size ...
    let mut data = Vec::new();
    file.read_to_end(&mut data)?;  // ← Actual decompression happens here
    
    // Post-decompression check (too late for memory DoS):
    if data.len() as u64 > self.limits.max_entry_uncompressed_size {
        return Err(OpenPackError::LimitExceeded(...));
    }
    Ok(data)
}
```

**Issue:** The `Vec<u8>` allocation and `read_to_end()` can exhaust memory BEFORE the size check.

---

## 3. CRX Parsing: Integer Overflow in Header Sizes

### Current Implementation (`src/crx.rs`)

```rust
pub(crate) fn crx_zip_payload_range(bytes: &[u8]) -> Result<Range<usize>, OpenPackError> {
    if bytes.len() < 12 { ... }
    if &bytes[0..4] != b"Cr24" { ... }
    
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    if version != 2 && version != 3 { ... }
    
    let start = match version {
        2 => {
            if bytes.len() < 16 { ... }
            
            let pubkey_len = usize::try_from(u32::from_le_bytes([...]))
                .map_err(|_| OpenPackError::InvalidArchive("CRX header overflows".into()))?;
            let sig_len = usize::try_from(u32::from_le_bytes([...]))
                .map_err(|_| OpenPackError::InvalidArchive("CRX header overflows".into()))?;
            
            16usize
                .checked_add(pubkey_len)           // ✅ Checked
                .and_then(|value| value.checked_add(sig_len))  // ✅ Checked
                .ok_or_else(|| OpenPackError::InvalidArchive("CRX header overflows".into()))?
        }
        3 => {
            let header_len = usize::try_from(u32::from_le_bytes([...]))
                .map_err(|_| OpenPackError::InvalidArchive("CRX header overflows".into()))?;
            
            12usize
                .checked_add(header_len)           // ✅ Checked
                .ok_or_else(|| OpenPackError::InvalidArchive("CRX header overflows".into()))?
        }
        _ => unreachable!("validated CRX version"),
    };
    
    if start >= bytes.len() {  // ✅ Bounds check
        return Err(OpenPackError::InvalidArchive("invalid CRX header lengths".into()));
    }
    
    Ok(start..bytes.len())
}
```

### Verification: Test Coverage

```rust
#[test]
fn crx_header_overflow_is_rejected_without_panicking() {
    let archive = Scratch::new("overflow.crx");
    let mut bytes = crx_payload(b"PK\x03\x04");
    bytes[8..12].copy_from_slice(&u32::MAX.to_le_bytes());   // pubkey_len = MAX
    bytes[12..16].copy_from_slice(&u32::MAX.to_le_bytes());  // sig_len = MAX
    write_file(&archive.path, &bytes);
    assert!(matches!(
        OpenPack::open_default(&archive.path).and_then(|pack| pack.entries()),
        Err(OpenPackError::InvalidArchive(_))
    ));
}
```

### ✅ VERDICT: SECURE

All integer overflow vectors are properly mitigated:
- `u32``usize` conversion uses `try_from` with error handling
- Addition uses `checked_add` chain
- Final bounds check ensures `start < bytes.len()`
- Test `crx_header_overflow_is_rejected_without_panicking` verifies `u32::MAX` handling

---

## 4. APK/JAR/IPA Support: Real or Renamed ZIP?

### Analysis

| Format | Implementation | Verdict |
|--------|----------------|---------|
| **JAR** | Direct ZIP pass-through with `.jar` extension detection | ✅ Renamed ZIP |
| **APK** | ZIP + `AndroidManifest.xml` parsing | ✅ Real format (ZIP-based) |
| **IPA** | ZIP + `Payload/*.app/Info.plist` discovery | ✅ Real format (ZIP-based) |
| **CRX** | CRX header parsing + nested ZIP payload | ✅ Real format (ZIP wrapper) |

### APK (`src/archive.rs`)

```rust
#[cfg(feature = "apk")]
pub fn read_android_manifest(&self) -> Result<crate::AndroidManifest, OpenPackError> {
    let bytes = self.read_entry("AndroidManifest.xml")?;
    parse_android_manifest(&bytes).ok_or(OpenPackError::InvalidArchive(...))
}

fn parse_android_manifest(bytes: &[u8]) -> Option<crate::AndroidManifest> {
    let xml = from_utf8(bytes).ok()?;  // ← Assumes UTF-8, real APK uses binary XML
    // Simple regex-like parsing
}
```

**Finding:** `parse_android_manifest` assumes textual XML, but real APKs use **binary XML format (AXML)**. This parser will fail on actual APK files.

### IPA (`src/archive.rs`)

```rust
#[cfg(feature = "ipa")]
pub fn read_info_plist(&self) -> Result<crate::IpaInfoPlist, OpenPackError> {
    let entry_name = self
        .entries()?
        .into_iter()
        .find_map(|entry| {
            entry.name
                .strip_prefix("Payload/")
                .filter(|inner| inner.ends_with(".app/Info.plist"))
                .map(|_| entry.name.clone())
        })
        .ok_or_else(|| OpenPackError::MissingEntry("Info.plist".into()))?;
    // ...
}
```

**Finding:** Correctly implements IPA bundle discovery. Assumes textual plist; real IPAs may use binary plist format.

### ✅ VERDICT: FORMATS ARE REAL ZIP-DERIVED

All formats are legitimate ZIP-derived containers, but the content parsers (AndroidManifest, Info.plist) may not handle binary formats correctly.

---

## 5. Archive Attacks That Bypass Security Checks

### 5.1 Local Header / Central Directory Mismatch

**Attack:** ZIP stores metadata in TWO places:
- Local file header (before each file's data)
- Central directory (at end of archive)

```rust
// Attacker crafts:
// Local header: uncompressed_size = 100MB (actual data is 100MB)
// Central directory: uncompressed_size = 100 bytes (declared to openpack)

// openpack reads from central directory → passes ratio check
// zip crate reads local header → decompresses 100MB
```

**Status:** ❌ NOT DETECTED - openpack trusts central directory, actual decompression uses local header

### 5.2 ZIP64 Size Confusion

ZIP64 allows 64-bit sizes. Overflow or confusion between 32-bit and 64-bit size fields may create discrepancies.

**Status:** ⚠️ PARTIAL - `usize` conversion has error handling, but no explicit ZIP64 validation

### 5.3 Filename Case Folding

```rust
// Entry name: "../ETC/PASSWD" (uppercase)
// After normalization may resolve to /etc/passwd on case-insensitive FS
```

**Status:** ❌ NOT ADDRESSED - No case normalization before path validation

### 5.4 Device File Creation (ZipSlip Extension)

ZIP entries can represent:
- Regular files
- Directories
- Symlinks (Unix extensions)
- Block/character devices (Unix extensions)

**Status:** ❌ NOT CHECKED - `is_dir` is checked, but symlinks/device files are not rejected

### 5.5 Extended Timestamp Abuse

ZIP can include extended timestamp fields. Parsing these may have overflow issues.

**Status:** ❌ NOT ANALYZED - Handled by underlying `zip` crate

### 5.6 CRX Version Confusion

CRX2 and CRX3 have different header structures. Downgrade attack possible if version validation is weak.

**Status:** ✅ PROTECTED - Explicit version check: `version != 2 && version != 3`

### 5.7 Compression Method Confusion

```rust
// Entry declares: compression_method = Stored
// Actual data: deflate-compressed
```

May cause decompression errors or buffer miscalculations.

**Status:** ⚠️ PARTIAL - `zip` crate handles, but openpack doesn't pre-validate method

---

## Summary of Findings

| # | Finding | Severity | CVSS Estimate |
|---|---------|----------|---------------|
| 1 | Deep percent-encoding bypass (5+ layers) | MEDIUM | 5.3 |
| 2 | Metadata-only bomb detection | MEDIUM | 5.3 |
| 3 | No nested archive scanning | MEDIUM | 5.3 |
| 4 | Memory exhaustion via Vec allocation | MEDIUM | 5.3 |
| 5 | Local/CD header mismatch exploitation | MEDIUM | 6.5 |
| 6 | Binary XML/plist parsing failure | LOW | 3.1 |
| 7 | Symlink/device file creation unchecked | LOW | 4.3 |

---

## Recommendations

### Immediate (High Priority)

1. **Add streaming decompression with size cap:**
```rust
// Instead of read_to_end, use a limited reader
let mut data = Vec::with_capacity(declared_size as usize);
let mut limited = file.take(max_size + 1);
limited.read_to_end(&mut data)?;
if data.len() > max_size { return Err(...); }
```

2. **Reject entries with remaining `%` after decoding:**
```rust
if decoded.contains('%') {
    return Err(OpenPackError::InvalidArchive("unencoded percent in path".into()));
}
```

3. **Add central directory / local header consistency check:**
```rust
// Compare declared sizes between local and central headers
```

### Short-term (Medium Priority)

4. Implement recursive archive scanning for nested ZIP detection
5. Add symlink and device file detection/rejection
6. Implement proper binary XML parser for APK
7. Add compression method validation

### Long-term (Low Priority)

8. Unicode normalization (NFC/NFKC) for path validation
9. Case-insensitive path validation option
10. Quine detection via content hashing

---

## Test Coverage Analysis

```
Total tests: 49
Security-critical tests: 12
  - ZipSlip variants: 5 tests
  - CRX overflow: 1 test
  - Bomb detection: 2 tests
  - Entry limits: 4 tests

Missing test coverage:
  - Deep percent-encoding (>4 layers)
  - Local/CD header mismatch
  - Nested ZIP bomb
  - Unicode normalization
  - Symlink handling
```

---

## Conclusion

OpenPack provides a **reasonable baseline** of security for archive reading, but several attack vectors remain exploitable, particularly around:

1. **Trusting declared metadata** over actual decompressed content
2. **Limited percent-encoding depth** creating bypass windows
3. **No nested archive analysis** for detecting layered bombs

The CRX parsing is **well-hardened** against integer overflows. The format support correctly identifies ZIP-derived containers, though content parsers need enhancement for binary formats.

**Overall Risk Assessment: MODERATE**
- Suitable for trusted archive sources
- Additional hardening recommended for untrusted/user-uploaded archives
- Consider sandboxed extraction for high-risk use cases

---

*Audit conducted on openpack source code at commit HEAD*
*Auditor: Security analysis of /home/mukund-thiru/Santh/libs/runtime/openpack/*