Compact binary wire format with schema evolution and zero-copy deserialization for Rust. The serialization substrate under network-protocol and Hive DB.
# pack-io v0.3.0 — Wire-format freeze + collections + streaming

**The hard part of the roadmap, intentionally not deferred.** v0.3.0 ships the normative wire-format specification, freezes the byte shape for the entire `1.x` line, brings the standard library's collection surface under the codec (`Vec`, `HashMap`, `BTreeMap`, `HashSet`, `BTreeSet`), and adds the streaming `IoEncoder<W>` / `IoDecoder<R>` pair so the same `Serialize` / `Deserialize` impls work through `std::io::Write` / `Read` end-to-end. 177 tests pass on stable and MSRV 1.85 across Linux / macOS / Windows.

## What's new in 0.3.0

### Wire-format freeze ([`docs/WIRE_FORMAT.md`](https://github.com/jamesgober/pack-io/blob/main/docs/WIRE_FORMAT.md))

The byte-level specification is normative as of this release. It is written so a reader who has never seen the source code can implement a compatible encoder or decoder. Every primitive encoding, every compound type, the canonical map / set ordering rule, the full error taxonomy, and the allocation-cap defence are spelled out with MUST / MUST NOT semantics.

From this release onward, any `1.x` decoder reads any `1.x`-or-earlier encoding. Changes that would break the format require a `2.x` major version bump and are not accepted inside the `1.x` line. The wire format is now a contract, not a moving target.

### Collections

`Vec<T>`, `BTreeMap<K, V>`, `BTreeSet<T>`, `HashMap<K, V, S>`, and `HashSet<T, S>` all implement `Serialize` and `Deserialize`. The hash-based collections (`HashMap`, `HashSet`) are gated on the default `std` feature.

The wire shape is `varint(count) ++ entries`. A `Vec` keeps its element order; map and set entries are sorted **lexicographically by their encoded key bytes** (for sets, the encoded element). This canonical ordering is the load-bearing property: a `HashMap` and a `BTreeMap` holding the same logical data encode to **identical bytes**, regardless of insertion order or per-process hash randomisation:

```rust
use std::collections::{BTreeMap, HashMap};

let mut h: HashMap<&str, u32> = HashMap::new();
h.insert("zeta", 26); h.insert("alpha", 1); h.insert("mu", 13);

let mut b: BTreeMap<&str, u32> = BTreeMap::new();
b.insert("alpha", 1); b.insert("mu", 13); b.insert("zeta", 26);

assert_eq!(pack_io::encode(&h).unwrap(), pack_io::encode(&b).unwrap());
```

Without this property, hashing, signing, or content-addressing a `HashMap` payload would fall over the first time a producer changed insertion order. With it, the encoded bytes are a function of the logical data alone.
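The sorting rule can be illustrated without the crate. The following is a minimal std-only sketch, not pack-io's implementation: `encode_entry` and `encode_canonical` are hypothetical helpers, and the one-byte length prefix, one-byte count, and fixed-width little-endian value are assumptions made to keep the example short. The actual byte layout is whatever `docs/WIRE_FORMAT.md` specifies.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical entry encoding for this sketch only: the key is a one-byte
/// length followed by its UTF-8 bytes, the value is 4 little-endian bytes.
fn encode_entry(key: &str, value: u32) -> (Vec<u8>, Vec<u8>) {
    let mut k = Vec::with_capacity(1 + key.len());
    k.push(key.len() as u8); // sketch assumes keys shorter than 256 bytes
    k.extend_from_slice(key.as_bytes());
    (k, value.to_le_bytes().to_vec())
}

/// Encode entries as `count ++ entries`, sorting by the encoded key bytes so
/// that the source collection's iteration order cannot leak into the output.
fn encode_canonical<'a, I>(entries: I) -> Vec<u8>
where
    I: IntoIterator<Item = (&'a str, u32)>,
{
    let mut encoded: Vec<(Vec<u8>, Vec<u8>)> = entries
        .into_iter()
        .map(|(k, v)| encode_entry(k, v))
        .collect();
    // `Vec<u8>` compares lexicographically, byte by byte.
    encoded.sort_by(|a, b| a.0.cmp(&b.0));

    let mut out = vec![encoded.len() as u8]; // single-byte count: sketch only
    for (k, v) in encoded {
        out.extend_from_slice(&k);
        out.extend_from_slice(&v);
    }
    out
}

fn main() {
    let mut h: HashMap<&str, u32> = HashMap::new();
    h.insert("zeta", 26);
    h.insert("alpha", 1);
    h.insert("mu", 13);

    let mut b: BTreeMap<&str, u32> = BTreeMap::new();
    b.insert("alpha", 1);
    b.insert("mu", 13);
    b.insert("zeta", 26);

    let from_hash = encode_canonical(h.iter().map(|(k, v)| (*k, *v)));
    let from_btree = encode_canonical(b.iter().map(|(k, v)| (*k, *v)));
    assert_eq!(from_hash, from_btree);
}
```

In this sketch the length prefix takes part in the comparison, so the entries come out as `mu`, `zeta`, `alpha` rather than in alphabetical order. That is the practical consequence of sorting on encoded bytes instead of on the logical key.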

### Streaming codec — `IoEncoder<W>` / `IoDecoder<R>`

The Tier-2 codec now comes in two flavours:

- **In-memory** ([`Encoder`](https://github.com/jamesgober/pack-io/blob/main/src/codec.rs), [`Decoder`](https://github.com/jamesgober/pack-io/blob/main/src/codec.rs)) — unchanged from v0.2.
- **Streaming** ([`IoEncoder<W>`](https://github.com/jamesgober/pack-io/blob/main/src/io.rs), [`IoDecoder<R>`](https://github.com/jamesgober/pack-io/blob/main/src/io.rs)) — wraps any `std::io::Write` / `Read`. Plus single-shot helpers [`encode_into`](https://github.com/jamesgober/pack-io/blob/main/src/io.rs) and [`decode_from`](https://github.com/jamesgober/pack-io/blob/main/src/io.rs) for the common case of "write one message into this writer" / "read one message from this reader".

```rust
use pack_io::{IoEncoder, IoDecoder};
use std::fs::File;
use std::io::{BufReader, BufWriter};

// Write directly into a file, no intermediate Vec<u8>.
let file = File::create("data.pack")?;
let mut enc = IoEncoder::new(BufWriter::new(file));
enc.write(&("hello", 42_u64))?;

// Read directly from the file.
let file = File::open("data.pack")?;
let mut dec = IoDecoder::new(BufReader::new(file));
let (s, n): (String, u64) = dec.read()?;
assert_eq!((s.as_str(), n), ("hello", 42));
# Ok::<(), Box<dyn std::error::Error>>(())
```

The new `SerialError::Io { kind, message }` variant captures `std::io::ErrorKind` and a stringified message, preserving `Clone + Eq` on `SerialError` while still letting callers branch on the kind.
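The reason for storing a kind and a message is that `std::io::Error` implements neither `Clone` nor `Eq`, while `std::io::ErrorKind` and `String` implement both. A minimal sketch of the pattern follows. `SketchError` and `is_retryable` are hypothetical names for illustration; only the `Io { kind, message }` shape is taken from the notes above, and the real `SerialError` may differ in its other variants and conversions.

```rust
use std::io;

/// Hypothetical error type for this sketch. Deriving `Clone` and `Eq` works
/// because neither field holds an `io::Error`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SketchError {
    UnexpectedEof,
    Io { kind: io::ErrorKind, message: String },
}

impl From<io::Error> for SketchError {
    fn from(err: io::Error) -> Self {
        // Keep the machine-readable kind and a human-readable message.
        SketchError::Io {
            kind: err.kind(),
            message: err.to_string(),
        }
    }
}

/// Callers can still branch on the original I/O error kind.
fn is_retryable(err: &SketchError) -> bool {
    matches!(
        err,
        SketchError::Io {
            kind: io::ErrorKind::Interrupted | io::ErrorKind::WouldBlock,
            ..
        }
    )
}

fn main() {
    let raw = io::Error::new(io::ErrorKind::WouldBlock, "socket not ready");
    let err: SketchError = raw.into();

    let copy = err.clone();
    assert_eq!(err, copy);
    assert!(is_retryable(&err));
    assert!(!is_retryable(&SketchError::UnexpectedEof));
}
```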

### Trait refactor — `Encode` and `Decode` (breaking change)

`Serialize::serialize` and `Deserialize::deserialize` are now generic over two new behaviour traits, `Encode` and `Decode`. One hand-rolled impl works through every encoder flavour the crate ships, in-memory **and** streaming, without duplication:

```rust
use pack_io::{Decode, Encode, Result, Serialize, Deserialize};

struct Point { x: i32, y: i32 }

impl Serialize for Point {
    fn serialize<E: Encode + ?Sized>(&self, enc: &mut E) -> Result<()> {
        self.x.serialize(enc)?;
        self.y.serialize(enc)
    }
}

impl Deserialize for Point {
    fn deserialize<D: Decode + ?Sized>(dec: &mut D) -> Result<Self> {
        Ok(Point {
            x: i32::deserialize(dec)?,
            y: i32::deserialize(dec)?,
        })
    }
}
```

The breaking change is contained: only callers who wrote hand-rolled `Serialize` / `Deserialize` impls in v0.2 are affected. Code that calls `encode()` / `decode()` / `Encoder::write()` / `Decoder::read()` is unaffected.
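The reason a single impl can serve both flavours is ordinary trait generics, which the std-only sketch below illustrates. `ByteSink`, `MemSink`, and `IoSink` are hypothetical stand-ins, not pack-io types, and the fixed-width little-endian integers are an assumption of the sketch rather than the crate's integer encoding.

```rust
use std::io::{self, Write};

/// Stand-in for the byte-level encoding trait (hypothetical, sketch only).
trait ByteSink {
    fn write_bytes(&mut self, bytes: &[u8]) -> io::Result<()>;
}

/// In-memory flavour: appends to a `Vec<u8>`.
struct MemSink {
    buf: Vec<u8>,
}

impl ByteSink for MemSink {
    fn write_bytes(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.buf.extend_from_slice(bytes);
        Ok(())
    }
}

/// Streaming flavour: forwards to any `std::io::Write`.
struct IoSink<W: Write> {
    inner: W,
}

impl<W: Write> ByteSink for IoSink<W> {
    fn write_bytes(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.inner.write_all(bytes)
    }
}

struct Point {
    x: i32,
    y: i32,
}

impl Point {
    /// Written once, generic over the sink. The `?Sized` bound also admits
    /// trait objects such as `dyn ByteSink`.
    fn serialize<S: ByteSink + ?Sized>(&self, sink: &mut S) -> io::Result<()> {
        sink.write_bytes(&self.x.to_le_bytes())?;
        sink.write_bytes(&self.y.to_le_bytes())
    }
}

fn main() -> io::Result<()> {
    let p = Point { x: 7, y: -1 };

    let mut mem = MemSink { buf: Vec::new() };
    p.serialize(&mut mem)?;

    // `Vec<u8>` implements `Write`, so it can stand in for a file or socket.
    let mut stream = IoSink { inner: Vec::new() };
    p.serialize(&mut stream)?;

    // The same method through a trait object.
    let mut third = MemSink { buf: Vec::new() };
    let dyn_sink: &mut dyn ByteSink = &mut third;
    p.serialize(dyn_sink)?;

    assert_eq!(mem.buf, stream.inner);
    assert_eq!(mem.buf, third.buf);
    Ok(())
}
```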

### `Encode` / `Decode` defaults — write the minimum, get the rest

Implementors of the new behaviour traits only have to provide the byte-level primitives:

- `Encode`: `write_byte`, `write_bytes`. Defaults handle `reserve`, `write_varint_u64`, `write_varint_u128`.
- `Decode`: `read_byte`, `read_into`, `max_alloc`. Defaults handle `read_varint_u64`, `read_varint_u128`, `read_length_prefixed`.

Concrete implementations can override a default when they have a faster path. [`Decoder`](https://github.com/jamesgober/pack-io/blob/main/src/codec.rs) overrides `read_length_prefixed`: the in-memory decoder already knows the remaining buffer length, so it can skip the extra validation the default performs.
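The required-versus-provided split on the decode side can be sketched as a trait with default methods. Everything below is illustrative: `DecodeSketch`, `SketchError`, and `SliceDecoder` are hypothetical, the method signatures are guesses built around the names listed above, and the LEB128-style varint is an assumption (the normative encoding is in `docs/WIRE_FORMAT.md`). The point of the sketch is the order of operations in `read_length_prefixed`: the declared length is checked against `max_alloc` before any buffer is allocated.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    UnexpectedEof,
    VarintOverflow,
    AllocLimit { requested: u64, max: usize },
}

trait DecodeSketch {
    // Required: the byte-level primitives.
    fn read_byte(&mut self) -> Result<u8, SketchError>;
    fn read_into(&mut self, out: &mut [u8]) -> Result<(), SketchError>;
    fn max_alloc(&self) -> usize;

    // Provided: a LEB128-style varint, 7 payload bits per byte (assumption).
    fn read_varint_u64(&mut self) -> Result<u64, SketchError> {
        let mut value = 0u64;
        for shift in (0u32..64).step_by(7) {
            let byte = self.read_byte()?;
            let chunk = u64::from(byte & 0x7f);
            // The tenth byte can only contribute the top bit of a u64.
            if shift == 63 && chunk > 1 {
                return Err(SketchError::VarintOverflow);
            }
            value |= chunk << shift;
            if byte & 0x80 == 0 {
                return Ok(value);
            }
        }
        Err(SketchError::VarintOverflow)
    }

    // Provided: reject oversized length prefixes before allocating.
    fn read_length_prefixed(&mut self) -> Result<Vec<u8>, SketchError> {
        let len = self.read_varint_u64()?;
        let max = self.max_alloc();
        if len > max as u64 {
            return Err(SketchError::AllocLimit { requested: len, max });
        }
        let mut buf = vec![0u8; len as usize];
        self.read_into(&mut buf)?;
        Ok(buf)
    }
}

/// Minimal implementor: a cursor over a byte slice.
struct SliceDecoder<'a> {
    input: &'a [u8],
    max_alloc: usize,
}

impl DecodeSketch for SliceDecoder<'_> {
    fn read_byte(&mut self) -> Result<u8, SketchError> {
        let (&first, rest) = self.input.split_first().ok_or(SketchError::UnexpectedEof)?;
        self.input = rest;
        Ok(first)
    }

    fn read_into(&mut self, out: &mut [u8]) -> Result<(), SketchError> {
        if self.input.len() < out.len() {
            return Err(SketchError::UnexpectedEof);
        }
        let (head, rest) = self.input.split_at(out.len());
        out.copy_from_slice(head);
        self.input = rest;
        Ok(())
    }

    fn max_alloc(&self) -> usize {
        self.max_alloc
    }
}

fn main() {
    // Length prefix 3, then three payload bytes.
    let mut ok = SliceDecoder { input: &[3, b'a', b'b', b'c'], max_alloc: 16 };
    assert_eq!(ok.read_length_prefixed(), Ok(b"abc".to_vec()));

    // Hostile prefix claiming u32::MAX bytes with no payload behind it.
    let mut hostile = SliceDecoder { input: &[0xff, 0xff, 0xff, 0xff, 0x0f], max_alloc: 16 };
    assert_eq!(
        hostile.read_length_prefixed(),
        Err(SketchError::AllocLimit { requested: u64::from(u32::MAX), max: 16 })
    );
}
```

An implementor that already holds the whole input, like `SliceDecoder` here, could override `read_length_prefixed` to compare the length against the remaining slice directly, which is the kind of override described above for the in-memory `Decoder`.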

### Test suite — invariants enforced at every level

177 tests, all green on stable and MSRV 1.85:

| Suite | Count | Covers |
|---|---|---|
| Unit tests (in-source) | 62 | every primitive impl, varint corners, codec mechanics |
| [tests/roundtrip.rs](https://github.com/jamesgober/pack-io/blob/main/tests/roundtrip.rs) | 25 | `decode(encode(v)) == v` for every primitive |
| [tests/determinism.rs](https://github.com/jamesgober/pack-io/blob/main/tests/determinism.rs) | 21 | same value → same bytes, per primitive |
| [tests/adversarial.rs](https://github.com/jamesgober/pack-io/blob/main/tests/adversarial.rs) | 20 | random bytes / truncations / hostile lengths |
| [tests/collections.rs](https://github.com/jamesgober/pack-io/blob/main/tests/collections.rs) | 18 | collection round-trip + canonical-encoding contract + adversarial |
| [tests/streaming.rs](https://github.com/jamesgober/pack-io/blob/main/tests/streaming.rs) | 11 | `IoEncoder` vs `Encoder` byte equivalence + round-trip + I/O error mapping |
| Doctests | 20 | every public-item example compiles and runs |

The collection determinism suite explicitly tests the load-bearing property: `HashMap` and `BTreeMap` over the same data encode identically, and `HashMap` insertion order is irrelevant to the encoded output.

### Examples

```bash
cargo run --example collections_tour --release  # Vec / HashMap / BTreeMap / sets, canonical encoding
cargo run --example streaming_io --release      # Event records to a tempfile via IoEncoder / IoDecoder
```

(Plus the v0.2 examples: `basic_roundtrip`, `primitive_tour`, `reuse_buffer`.)

## Breaking changes

1. **Hand-rolled `Serialize` / `Deserialize` impls**: make each method generic over the encoder / decoder instead of naming the concrete type (return types are unchanged and omitted here):
   - `fn serialize(&self, enc: &mut Encoder)` → `fn serialize<E: Encode + ?Sized>(&self, enc: &mut E)`
   - `fn deserialize(dec: &mut Decoder<'_>)` → `fn deserialize<D: Decode + ?Sized>(dec: &mut D)`
2. **`Encoder` no longer carries a `Config`** — `Config` is consumed only by the decode side now.
3. **Specialised `Vec<u8>` impls replaced by generic `Vec<T>` impls** — the wire format is identical; very-large-`Vec<u8>` decode is fractionally slower until the v0.6 optimisation pass restores a fast byte-slice path.

`pack_io::encode()` / `pack_io::decode()` / `Encoder::write()` / `Decoder::read()` are unaffected.

## Verification

The commands below were run on Windows x86_64 with Rust stable and 1.85 (MSRV). The same commands pass on Linux (WSL2 Ubuntu) and, via the configured CI matrix, on macOS:

```bash
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
cargo build --no-default-features              # no_std build
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo audit
cargo deny check

cargo +1.85 clippy --all-targets --all-features -- -D warnings
cargo +1.85 test --all-features
RUSTDOCFLAGS="-D warnings" cargo +1.85 doc --no-deps --all-features
```

All green. Test counts at this tag (stable, `--all-features`):

- **62** unit tests.
- **25** round-trip property tests.
- **21** determinism property tests.
- **20** adversarial decode tests.
- **18** collection tests (round-trip, canonical encoding, adversarial).
- **11** streaming tests (in-memory vs streaming byte equivalence, I/O error mapping).
- **20** doctests.
- **177** total, every one passing.

All five example programs run end-to-end and round-trip their values.

## What's next

- **0.4.0: `View<T>` zero-copy decode + `derive` macro.** `View<T>` will expose string and byte fields as borrows directly out of the input buffer, with no per-message `String` / `Vec<u8>` allocation. The `#[derive(pack_io::Serialize, pack_io::Deserialize)]` proc-macros will generate sound impls for any struct or enum, so user code no longer needs the hand-rolled `serialize` / `deserialize` body for the common case.

## Installation

```toml
[dependencies]
pack-io = "0.3"

# no_std build: use this line instead of the one above
# (a TOML table cannot contain the same key twice).
# pack-io = { version = "0.3", default-features = false }
```

MSRV: Rust 1.85 (2024 edition).

## Documentation

- [README](https://github.com/jamesgober/pack-io/blob/main/README.md)
- [API Reference](https://github.com/jamesgober/pack-io/blob/main/docs/API.md)
- [Wire Format Spec](https://github.com/jamesgober/pack-io/blob/main/docs/WIRE_FORMAT.md)
- [CHANGELOG](https://github.com/jamesgober/pack-io/blob/main/CHANGELOG.md)

---

**Full diff:** [`v0.2.0...v0.3.0`](https://github.com/jamesgober/pack-io/compare/v0.2.0...v0.3.0).
**Changelog:** [`CHANGELOG.md`](https://github.com/jamesgober/pack-io/blob/main/CHANGELOG.md#030---2026-05-28).