fastserial 0.1.2

Ultra-fast, zero-copy serialization/deserialization library for Rust with SIMD acceleration
# Architecture

> Internal design of fastserial — read this before touching any source file.

## Table of Contents

1. [Design Goals](#design-goals)
2. [Core Abstraction Model](#core-abstraction-model)
3. [Encode Pipeline](#encode-pipeline)
4. [Decode Pipeline](#decode-pipeline)
5. [SIMD Layer](#simd-layer)
6. [Proc-macro Codegen](#proc-macro-codegen)
7. [Memory Model](#memory-model)
8. [Format Plugin API](#format-plugin-api)
9. [Error Handling Strategy](#error-handling-strategy)
10. [`no_std` Compatibility](#no_std-compatibility)

---

## Design Goals

| Goal | Decision |
|------|----------|
| Zero heap allocation on decode path | `&'de str` / `&'de [u8]` lifetimes, arena option |
| No virtual dispatch in hot path | Trait bounds monomorphize, no `dyn Encoder` |
| SIMD without `unsafe` user code | All intrinsics hidden inside `simd/` module, safe wrapper surface |
| Compile-time field ordering | Proc-macro encodes field indices as `u16`, checked at build time |
| serde compatibility | `SerdeCompat<T>` newtype bridges to serde traits |
| Pluggable formats | `Format` trait: implement its methods once, get encode/decode for every derived type |

---

## Core Abstraction Model

```
User type (struct / enum)
    ▼ (proc-macro derives)
impl Encode for MyType   ←── WriteBuffer trait
impl Decode<'de> for MyType   ←── ReadBuffer<'de> struct
    │                              │
    ▼                              ▼
Format::encode_xxx()       Format::decode_xxx()
    │                              │
    ▼                              ▼
simd::write_str()          simd::scan_quote()
    │                              │
    ▼                              ▼
  &mut [u8] / Vec<u8>         &'de [u8]  (input slice, borrowed)
```

No closures, no function pointers, no `dyn` in the hot path. Everything resolves at monomorphization.

---

## Encode Pipeline

### Step 1 — Proc-macro generates specialized encode

Given:
```rust
#[derive(Encode)]
struct User<'a> {
    id:   u64,
    name: &'a str,
    age:  u8,
}
```

The proc-macro emits (approximately):
```rust
impl<'a> fastserial::Encode for User<'a> {
    #[inline(always)]
    fn encode<W: fastserial::io::WriteBuffer>(&self, w: &mut W) -> Result<(), fastserial::Error> {
        // Field names are byte literals — zero runtime cost
        w.write_bytes(b"{\"id\":")?;
        fastserial::codec::write_u64(self.id, w)?;
        w.write_bytes(b",\"name\":")?;
        fastserial::codec::write_str(self.name, w)?;
        w.write_bytes(b",\"age\":")?;
        fastserial::codec::write_u8(self.age, w)?;
        w.write_byte(b'}')?;
        Ok(())
    }

    // Schema fingerprint — checked at link time for cross-language compat
    const SCHEMA_HASH: u64 = 0xDEAD_BEEF_1234_5678;
}
```

Key points:
- `b"{\"id\":"` is a **compile-time constant** in `.rodata`, never allocated
- `write_u64` / `write_str` are `#[inline(always)]` — the compiler sees through them
- No match statements, no HashMap lookups, no dynamic field names
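
As a rough illustration of why the generated body needs no heap allocation, the sketch below formats an integer into a 20-byte stack buffer and appends it to the output. The functions `write_u64` and `encode_user` here are hypothetical stand-ins written against a plain `Vec<u8>`; they are not the crate's actual `codec` functions, and the `name` field is left out to keep string escaping out of the picture.

```rust
/// Hypothetical stand-in for `fastserial::codec::write_u64`: formats `v` as
/// ASCII decimal digits without allocating a temporary `String`.
fn write_u64(mut v: u64, out: &mut Vec<u8>) {
    // u64::MAX has 20 decimal digits, so 20 bytes of stack space suffice.
    let mut buf = [0u8; 20];
    let mut i = buf.len();
    // Fill the buffer from the right, least significant digit first.
    loop {
        i -= 1;
        buf[i] = b'0' + (v % 10) as u8;
        v /= 10;
        if v == 0 {
            break;
        }
    }
    out.extend_from_slice(&buf[i..]);
}

/// Hand-written equivalent of the generated body for the `id` and `age`
/// fields of `User` (assumption: `name` omitted for brevity).
fn encode_user(id: u64, age: u8, out: &mut Vec<u8>) {
    out.extend_from_slice(b"{\"id\":");
    write_u64(id, out);
    out.extend_from_slice(b",\"age\":");
    write_u64(u64::from(age), out);
    out.push(b'}');
}
```

The only heap activity is the growth of `out` itself, which matches the "1 allocation for the output `Vec`" budget listed under Memory Model.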

### Step 2 — WriteBuffer writes to output

`WriteBuffer` is a trait with three implementations:

| Implementation | When to use |
|----------------|-------------|
| `Vec<u8>` | Owned output, grows as needed |
| `&mut [u8]` (fixed) | Stack buffer, no allocation; returns a `BufferFull` error when the data does not fit |
| `io::Write` impl | For streaming to files/sockets |

The trait:
```rust
pub trait WriteBuffer {
    fn write_byte(&mut self, b: u8) -> Result<(), Error>;
    fn write_bytes(&mut self, bs: &[u8]) -> Result<(), Error>;
    // Default: reserve hint (no-op for Vec, checked for fixed)
    fn reserve(&mut self, _hint: usize) {}
}
```
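
A minimal sketch of how the growable and fixed-size implementations might look. The `FixedBuf` wrapper and the one-variant `Error` enum are assumptions made to keep the example self-contained; the crate's real fixed-buffer type is not shown in this document.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    BufferFull { needed: usize, available: usize },
}

pub trait WriteBuffer {
    fn write_byte(&mut self, b: u8) -> Result<(), Error>;
    fn write_bytes(&mut self, bs: &[u8]) -> Result<(), Error>;
    fn reserve(&mut self, _hint: usize) {}
}

// Growable output: writes cannot fail, the Vec reallocates as needed.
impl WriteBuffer for Vec<u8> {
    fn write_byte(&mut self, b: u8) -> Result<(), Error> {
        self.push(b);
        Ok(())
    }
    fn write_bytes(&mut self, bs: &[u8]) -> Result<(), Error> {
        self.extend_from_slice(bs);
        Ok(())
    }
}

/// Hypothetical fixed-capacity writer over a caller-provided slice.
pub struct FixedBuf<'a> {
    buf: &'a mut [u8],
    len: usize,
}

impl<'a> FixedBuf<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        Self { buf, len: 0 }
    }
    /// Bytes written so far.
    pub fn written(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

impl WriteBuffer for FixedBuf<'_> {
    fn write_byte(&mut self, b: u8) -> Result<(), Error> {
        self.write_bytes(&[b])
    }
    fn write_bytes(&mut self, bs: &[u8]) -> Result<(), Error> {
        let available = self.buf.len() - self.len;
        if bs.len() > available {
            // Nothing is written on failure, so the buffer stays consistent.
            return Err(Error::BufferFull { needed: bs.len(), available });
        }
        self.buf[self.len..self.len + bs.len()].copy_from_slice(bs);
        self.len += bs.len();
        Ok(())
    }
}
```

Because `encode` is generic over `W: WriteBuffer`, each of these writers gets its own monomorphized copy of the encoder, with no `dyn` dispatch involved.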

---

## Decode Pipeline

### Zero-copy contract

The lifetime `'de` on `Decode<'de>` means: "the decoded value may borrow from the input buffer." The caller must keep the input alive as long as the decoded value is used.

```rust
pub trait Decode<'de>: Sized {
    fn decode(r: &mut ReadBuffer<'de>) -> Result<Self, Error>;
}
```

`ReadBuffer<'de>` wraps `&'de [u8]` and tracks current position:
```rust
pub struct ReadBuffer<'de> {
    data: &'de [u8],
    pos:  usize,
}
```

When decoding a `&'de str`, we return a sub-slice of `data` — no copy, no allocation.
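
The borrowing behaviour can be sketched as follows. The method `read_plain_str` is a hypothetical helper that only handles strings without escape sequences (an escaped string cannot be returned as a plain sub-slice); it is not the crate's real string decoder.

```rust
pub struct ReadBuffer<'de> {
    data: &'de [u8],
    pos: usize,
}

impl<'de> ReadBuffer<'de> {
    pub fn new(data: &'de [u8]) -> Self {
        Self { data, pos: 0 }
    }

    /// Hypothetical helper: reads a JSON string with no escapes and returns
    /// it as a sub-slice of the input. Returns `None` on malformed input or
    /// when a backslash is found.
    pub fn read_plain_str(&mut self) -> Option<&'de str> {
        // Copy the `&'de [u8]` out of `self` so the result borrows from the
        // input buffer, not from the `ReadBuffer` itself.
        let data: &'de [u8] = self.data;
        if data.get(self.pos) != Some(&b'"') {
            return None;
        }
        let start = self.pos + 1;
        let rel = data[start..]
            .iter()
            .position(|&b| b == b'"' || b == b'\\')?;
        if data[start + rel] != b'"' {
            return None; // escapes would need an owned, unescaped copy
        }
        let s = std::str::from_utf8(&data[start..start + rel]).ok()?;
        self.pos = start + rel + 1;
        Some(s)
    }
}
```

The returned `&'de str` stays valid after the `ReadBuffer` is dropped, as long as the input bytes are still alive. That is the contract the `'de` lifetime encodes.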

### Decode dispatch

For JSON, the decode steps are:
1. Skip whitespace (SIMD: scan for the first non-whitespace byte, 32 bytes at a time with AVX2 or 16 with SSE4.2)
2. Peek at the first byte to determine the type (`"` → string, `{` → object, `[` → array, digit or `-` → number, `t`/`f` → bool, `n` → null)
3. Dispatch to specialized decoder
4. For objects: read field names as `&'de str`, use proc-macro generated match
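
Steps 1 to 3 can be sketched with a scalar whitespace skip followed by a single `match` on the next byte. The names `ValueKind` and `peek_kind` are illustrative assumptions, not part of the crate's API.

```rust
#[derive(Debug, PartialEq)]
enum ValueKind {
    String,
    Object,
    Array,
    Number,
    Bool,
    Null,
}

/// Scalar whitespace skip: returns the index of the first byte at or after
/// `pos` that is not JSON whitespace (or `data.len()` if there is none).
fn skip_whitespace(data: &[u8], mut pos: usize) -> usize {
    while let Some(b' ' | b'\t' | b'\n' | b'\r') = data.get(pos).copied() {
        pos += 1;
    }
    pos
}

/// Classifies the next value by its first byte and reports where it starts.
/// Returns `None` at end of input or on a byte that cannot start a value.
fn peek_kind(data: &[u8], pos: usize) -> Option<(ValueKind, usize)> {
    let pos = skip_whitespace(data, pos);
    let kind = match *data.get(pos)? {
        b'"' => ValueKind::String,
        b'{' => ValueKind::Object,
        b'[' => ValueKind::Array,
        b'-' | b'0'..=b'9' => ValueKind::Number,
        b't' | b'f' => ValueKind::Bool,
        b'n' => ValueKind::Null,
        _ => return None,
    };
    Some((kind, pos))
}
```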

Field matching for structs uses a **perfect hash** generated at compile time — no string comparisons in the hot path:
```rust
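// Generated by proc-macro (approximately). `decode` has no `self`, so each
// field is collected into a local `Option` and assembled after the loop.
match fastserial::phf::lookup(field_name_bytes) {
    0 => id   = Some(Decode::decode(r)?),
    1 => name = Some(Decode::decode(r)?),
    2 => age  = Some(Decode::decode(r)?),
    _ => r.skip_value()?,  // unknown field: skip its value
}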
```
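
The hash function used by the generated code is not shown in this document. As a toy illustration of the idea, the sketch below uses key length as a perfect hash for the key set `{"id", "name", "age"}`, which works only because the three keys have distinct lengths. It also performs one final equality check so that unknown keys of a matching length are rejected; whether the real implementation does this is an assumption.

```rust
/// Keys stored by slot, where slot = key length - 2.
const KEYS: [&[u8]; 3] = [b"id", b"age", b"name"];
/// Declaration-order field index for each slot (id = 0, name = 1, age = 2).
const FIELD_INDEX: [u16; 3] = [0, 2, 1];

/// Toy perfect-hash lookup: maps a JSON key to its field index, or `None`
/// for keys that are not part of the struct.
fn lookup(key: &[u8]) -> Option<u16> {
    // Keys shorter than 2 bytes cannot match any slot.
    let slot = key.len().checked_sub(2)?;
    // One comparison rejects unknown keys that happen to hash to a used slot.
    if *KEYS.get(slot)? == key {
        Some(FIELD_INDEX[slot])
    } else {
        None
    }
}
```

A real generator would search for a hash that is collision-free for whatever key set the struct declares, since distinct lengths cannot be relied on in general.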

---

## SIMD Layer

See [SIMD.md](SIMD.md) for full details. Summary:

```
Runtime CPU detection (once, at startup via std::arch::is_x86_feature_detected!)
    ├── AVX2 available  → avx2::dispatch()   [32-byte lanes]
    ├── SSE4.2 available → sse42::dispatch() [16-byte lanes]
    └── fallback         → scalar::dispatch() [byte-by-byte]
```

Operations accelerated by SIMD:
- `scan_for_quote_or_backslash` — find end of JSON string
- `skip_whitespace` — find first non-whitespace byte
- `validate_utf8_chunk` — fast UTF-8 validation
- `write_escaped_str` — escape special characters while copying to output

---

## Proc-macro Codegen

See [CODEGEN.md](CODEGEN.md) for full details.

The `fastserial-derive` crate is a separate crate because Rust requires procedural macros to live in their own crate (`proc-macro = true`). It:
1. Parses the input `TokenStream` using `syn`
2. Extracts field names, types, attributes
3. Emits `impl Encode` and `impl Decode<'de>` blocks
4. Computes `SCHEMA_HASH` from field names + types (deterministic, cross-language)
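
The hashing scheme behind `SCHEMA_HASH` is not specified here. One deterministic option, shown purely as an assumption, is 64-bit FNV-1a over the field names and type names in declaration order, evaluated in a `const fn` so that the result is a compile-time constant.

```rust
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
/// Separator fed after every string so ("ab", "c") and ("a", "bc") differ.
/// 0xff never occurs in valid UTF-8, so it cannot appear inside a name.
const SEP: &[u8] = &[0xff];

/// Continues an FNV-1a hash from `hash` over `bytes`.
const fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    let mut i = 0;
    while i < bytes.len() {
        hash ^= bytes[i] as u64;
        hash = hash.wrapping_mul(FNV_PRIME);
        i += 1;
    }
    hash
}

/// Hashes `(field name, type name)` pairs in declaration order.
const fn schema_hash(fields: &[(&str, &str)]) -> u64 {
    let mut hash = FNV_OFFSET;
    let mut i = 0;
    while i < fields.len() {
        hash = fnv1a(hash, fields[i].0.as_bytes());
        hash = fnv1a(hash, SEP);
        hash = fnv1a(hash, fields[i].1.as_bytes());
        hash = fnv1a(hash, SEP);
        i += 1;
    }
    hash
}

/// Evaluated at compile time for the `User` struct from the encode example.
const USER_SCHEMA_HASH: u64 =
    schema_hash(&[("id", "u64"), ("name", "&str"), ("age", "u8")]);
```

Hashing type names as strings makes the fingerprint sensitive to renames and aliases, so a cross-language scheme would likely need a canonical type vocabulary; that detail is outside what this sketch covers.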

Attributes supported:
```rust
#[derive(Encode, Decode)]
struct MyStruct {
    #[fastserial(rename = "userId")]   // JSON key override
    user_id: u64,

    #[fastserial(skip)]                // excluded from encode/decode
    _cache: Option<String>,

    #[fastserial(default)]             // use Default::default() if missing
    flags: u32,

    #[fastserial(flatten)]             // inline fields of nested struct
    metadata: Metadata,
}
```

---

## Memory Model

### Allocation budget per operation

| Operation | Allocations |
|-----------|-------------|
| `json::encode(&val)` | 1 (Vec for output) |
| `json::encode_into(&val, &mut buf)` | **0** |
| `json::decode::<T<'de>>(&bytes)` | **0** (all `&str` / `&[u8]`) |
| `json::decode::<T>(&bytes)` (owned) | 1 per `String` field |
| `binary::decode_mmap::<T>(file)` | **0** (mmap-backed) |

### Arena allocator support (optional feature `arena`)

```rust
use fastserial::arena::Arena;

let arena = Arena::with_capacity(4096);
let decoded: MyStruct<'_> = json::decode_in(&bytes, &arena).unwrap();
// All String fields allocated from arena — freed together
```

---

## Format Plugin API

To add a new format (e.g., CBOR, Avro):

```rust
pub trait Format {
    // Primitive write
    fn write_null(w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_bool(v: bool, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_u64(v: u64, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_i64(v: i64, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_f64(v: f64, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_str(v: &str, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_bytes(v: &[u8], w: &mut impl WriteBuffer) -> Result<(), Error>;

    // Structural write
    fn begin_object(n_fields: usize, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_field_key(key: &[u8], w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn end_object(w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn begin_array(len: usize, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn end_array(w: &mut impl WriteBuffer) -> Result<(), Error>;

    // Primitive read
    fn read_bool(r: &mut ReadBuffer<'_>) -> Result<bool, Error>;
    fn read_u64(r: &mut ReadBuffer<'_>) -> Result<u64, Error>;
    // ... etc
    fn read_str<'de>(r: &mut ReadBuffer<'de>) -> Result<&'de str, Error>;
}
```

The proc-macro calls these methods — implement `Format` and your format automatically supports all derived types.
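
To make the plugin idea concrete, here is a sketch against a reduced, hypothetical subset of the trait (four methods only) with a toy length-prefixed binary format. `TinyBinary`, `Point`, and `encode_point` are illustrative names; `encode_point` plays the role of the code a derive would generate.

```rust
#[derive(Debug)]
pub struct Error;

pub trait WriteBuffer {
    fn write_bytes(&mut self, bs: &[u8]) -> Result<(), Error>;
}

impl WriteBuffer for Vec<u8> {
    fn write_bytes(&mut self, bs: &[u8]) -> Result<(), Error> {
        self.extend_from_slice(bs);
        Ok(())
    }
}

/// Reduced subset of `Format`, enough to encode a flat struct of integers.
pub trait Format {
    fn write_u64(v: u64, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn begin_object(n_fields: usize, w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn write_field_key(key: &[u8], w: &mut impl WriteBuffer) -> Result<(), Error>;
    fn end_object(w: &mut impl WriteBuffer) -> Result<(), Error>;
}

/// Toy format: one-byte field count, one-byte key length, little-endian u64.
pub struct TinyBinary;

impl Format for TinyBinary {
    fn write_u64(v: u64, w: &mut impl WriteBuffer) -> Result<(), Error> {
        w.write_bytes(&v.to_le_bytes())
    }
    fn begin_object(n_fields: usize, w: &mut impl WriteBuffer) -> Result<(), Error> {
        if n_fields > 255 {
            return Err(Error);
        }
        w.write_bytes(&[n_fields as u8])
    }
    fn write_field_key(key: &[u8], w: &mut impl WriteBuffer) -> Result<(), Error> {
        if key.len() > 255 {
            return Err(Error);
        }
        w.write_bytes(&[key.len() as u8])?;
        w.write_bytes(key)
    }
    fn end_object(_w: &mut impl WriteBuffer) -> Result<(), Error> {
        Ok(()) // the field count up front makes a terminator unnecessary
    }
}

pub struct Point {
    pub x: u64,
    pub y: u64,
}

/// What derived code could look like when written against `Format`.
pub fn encode_point<F: Format, W: WriteBuffer>(p: &Point, w: &mut W) -> Result<(), Error> {
    F::begin_object(2, w)?;
    F::write_field_key(b"x", w)?;
    F::write_u64(p.x, w)?;
    F::write_field_key(b"y", w)?;
    F::write_u64(p.y, w)?;
    F::end_object(w)
}
```

Since every method is an associated function without `self`, a format carries no per-call state. That suits prefix-counted formats like this one; a format that needs separators between fields would have to derive them from the arguments or from the buffer.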

---

## Error Handling Strategy

```rust
#[non_exhaustive]
pub enum Error {
    UnexpectedEof,
    InvalidUtf8 { byte_offset: usize },
    UnexpectedByte { expected: &'static str, got: u8, offset: usize },
    NumberOverflow { type_name: &'static str },
    UnknownField { name_bytes: Vec<u8> },   // only in strict mode
    BufferFull { needed: usize, available: usize },
    Custom(Box<dyn std::error::Error + Send + Sync>),
}
```

Design decisions:
- `#[non_exhaustive]` — allows adding variants without semver break
- No `anyhow` / `thiserror` dependency — error type is owned
- Byte offsets included — useful for debugging large payloads
- `Custom` variant for format-specific errors

---

## `no_std` Compatibility

With `default-features = false`:
- `alloc` crate used instead of `std` (requires `extern crate alloc`)
- `Vec<u8>` output still works (alloc)
- `&mut [u8]` fixed-buffer output works without alloc
- SIMD: enabled only when `target_feature` is set at compile time (no runtime detection)
- Formats: JSON, binary, and MessagePack remain available; `io::Write` output is not supported (it requires `std`)

```toml
[dependencies]
fastserial = { version = "0.1", default-features = false, features = ["json", "binary"] }
```
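
Without runtime detection, the SIMD path has to be chosen from the features the compiler was told to target, for example by building with `RUSTFLAGS="-C target-feature=+avx2"`. A minimal sketch of such a compile-time gate, using a hypothetical `LANE_BYTES` constant:

```rust
/// Lane width in bytes, fixed at compile time from the enabled target
/// features. `cfg!` expands to a boolean literal, so the unused branches
/// are removed by constant evaluation and nothing is probed at runtime.
pub const LANE_BYTES: usize = if cfg!(target_feature = "avx2") {
    32
} else if cfg!(target_feature = "sse4.2") {
    16
} else {
    1 // scalar fallback: byte-by-byte
};
```

This uses only `core` features, so it compiles unchanged in a `no_std` build.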