kdb_codec 1.1.0

Kdb+ IPC codec library for handling kdb+ wire protocol data with Rust.
# Fuzzing Guide for kdb_codec

This document explains how to run fuzz tests on the kdb_codec library to discover security vulnerabilities.

## Prerequisites

### 1. Install Nightly Toolchain

```bash
rustup toolchain install nightly
```

### 2. Add the Required Target for Your Platform

If you're on Apple Silicon (M1/M2/M3):
```bash
rustup target add aarch64-apple-darwin --toolchain nightly
```

If you're on Intel Mac:
```bash
rustup target add x86_64-apple-darwin --toolchain nightly
```

If you're on Linux:
```bash
rustup target add x86_64-unknown-linux-gnu --toolchain nightly
```

### 3. Install cargo-fuzz

```bash
cargo install cargo-fuzz
```

Note: cargo-fuzz requires a nightly Rust toolchain with the appropriate target installed.

## Important: Working Directory

**All fuzzing commands must be run from the `kdb_codec` directory:**

```bash
cd kdb_codec
```

All examples below assume you are in the `kdb_codec` directory.

## Running Fuzz Tests

### 1. Fuzz `q_ipc_decode` (Deserialization)

This target feeds arbitrary byte sequences to the deserialization logic:

```bash
# From kdb_codec directory
cargo +nightly fuzz run fuzz_q_ipc_decode
```

**What it finds:**
- Panics from invalid type bytes
- Buffer overruns in list deserialization
- Integer overflows in size calculations
- UTF-8 validation issues
- Unbounded recursion
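
Two of these bug classes, integer overflows in size calculations and buffer overruns in list deserialization, usually share one root cause: a length read from the wire is trusted before it is compared with the bytes actually available. The sketch below shows the kind of check that makes such inputs return an error. It is illustrative only and is not kdb_codec's code; `DecodeError`, `checked_list_payload`, and the 4-byte little-endian count layout are assumptions made for the example.

```rust
/// Hypothetical error type for this sketch (not part of kdb_codec).
#[derive(Debug, PartialEq)]
enum DecodeError {
    Truncated,
    SizeOverflow,
}

/// Reads a 4-byte little-endian element count from the front of `buf`, then
/// returns the slice holding `count * elem_size` payload bytes.
/// The layout is an assumption for illustration, not the library's parser.
fn checked_list_payload(buf: &[u8], elem_size: usize) -> Result<&[u8], DecodeError> {
    // Refuse to read the count unless all four bytes are present.
    let header = buf.get(..4).ok_or(DecodeError::Truncated)?;
    let count = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;

    // checked_mul / checked_add report wrap-around instead of silently wrapping.
    let byte_len = count
        .checked_mul(elem_size)
        .ok_or(DecodeError::SizeOverflow)?;
    let end = byte_len.checked_add(4).ok_or(DecodeError::SizeOverflow)?;

    // `get` returns None when the declared payload runs past the buffer.
    buf.get(4..end).ok_or(DecodeError::Truncated)
}
```

A fuzzer-generated input with a count of two 4-byte elements but only three payload bytes yields `Err(DecodeError::Truncated)` here rather than an out-of-bounds panic.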

### 2. Fuzz `decompress_sync` (Decompression)

This specifically targets the decompression algorithm:

```bash
# From kdb_codec directory
cargo +nightly fuzz run fuzz_decompress
```

**What it finds:**
- Out-of-bounds reads in decompression loop
- Invalid back-references
- Decompression bombs
- Size field validation issues
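
Size field validation and decompression bombs both come down to checking the declared uncompressed length before anything is allocated. The following sketch assumes the compressed payload starts with a 4-byte little-endian length that includes the 8-byte message header. `MAX_MESSAGE_SIZE`, `SizeError`, and `validated_payload_len` are hypothetical names chosen for this example, not kdb_codec APIs, and the 256 MiB cap is an arbitrary placeholder.

```rust
/// Hypothetical upper bound on a decompressed payload (an assumption for this sketch).
const MAX_MESSAGE_SIZE: usize = 256 * 1024 * 1024;
/// Length of the IPC message header, which the declared size is assumed to include.
const HEADER_LEN: usize = 8;

/// Hypothetical error type for this sketch (not part of kdb_codec).
#[derive(Debug, PartialEq)]
enum SizeError {
    Truncated,
    TooSmall,
    TooLarge,
}

/// Returns the number of payload bytes to allocate, or an error if the
/// declared size cannot be trusted.
fn validated_payload_len(compressed: &[u8]) -> Result<usize, SizeError> {
    let field = compressed.get(..4).ok_or(SizeError::Truncated)?;
    let declared = u32::from_le_bytes([field[0], field[1], field[2], field[3]]) as usize;

    // A declared size below the header length would underflow `declared - HEADER_LEN`.
    let payload = declared
        .checked_sub(HEADER_LEN)
        .ok_or(SizeError::TooSmall)?;

    // Reject sizes that would let a few input bytes force a huge allocation.
    if payload > MAX_MESSAGE_SIZE {
        return Err(SizeError::TooLarge);
    }
    Ok(payload)
}
```

Running the check before the decompression loop means a declared size of 7 or of `0xffffffff` is rejected up front instead of surfacing as an underflow panic or an out-of-memory failure.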

### 3. Fuzz Complete Codec Decode Path

This tests the entire codec decoding pipeline:

```bash
# From kdb_codec directory
cargo +nightly fuzz run fuzz_codec_decode
```

**What it finds:**
- Header validation bypasses
- Message size handling issues
- Integration bugs between components

## Running with Memory Limit

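To keep the operating system from killing the fuzzer when an input triggers a huge allocation, set an RSS limit. libFuzzer reports any input that pushes memory use past the limit as a failure: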

```bash
# Limit to 2GB RSS
cargo +nightly fuzz run fuzz_q_ipc_decode -- -rss_limit_mb=2048
```

## Running with Time Limit

```bash
# Run for 1 hour
cargo +nightly fuzz run fuzz_q_ipc_decode -- -max_total_time=3600
```

## Checking for Specific Issues

### Test for Slow Inputs (Hangs)

```bash
# Timeout after 10 seconds per input
cargo +nightly fuzz run fuzz_q_ipc_decode -- -timeout=10
```

### Test with a Seed Corpus

The fuzzer mutates the inputs already in its corpus, so seeding it with valid messages helps it reach deeper code paths sooner:

```bash
# Add valid kdb+ messages to the corpus
mkdir -p fuzz/corpus/fuzz_q_ipc_decode
# Add some valid serialized K objects here
cargo +nightly fuzz run fuzz_q_ipc_decode
```
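
One way to produce a valid seed is to write the bytes of a small message by hand. The sketch below builds the IPC encoding of a 32-bit int atom (8-byte header, type byte `0xfa`, 4-byte value) and writes it into a corpus directory. The helper names and the `seed_int_atom` file name are assumptions for this example. Whether `fuzz_q_ipc_decode` expects the full message or only the bytes after the header depends on the harness, so drop the first 8 bytes if it takes the payload alone.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds an IPC message for a 32-bit int atom: 8-byte header, type byte, 4-byte value.
fn int_atom_message(value: i32) -> Vec<u8> {
    let mut msg = vec![
        0x01, // little-endian encoding
        0x00, // message type: async
        0x00, // not compressed
        0x00, // reserved
    ];
    // Total message length, header included: 8 + 1 + 4 bytes.
    let total_len: u32 = 8 + 1 + 4;
    msg.extend_from_slice(&total_len.to_le_bytes());
    msg.push(0xfa); // type -6: int atom
    msg.extend_from_slice(&value.to_le_bytes());
    msg
}

/// Writes one seed file into `corpus_dir`, creating the directory if needed.
/// The file name is arbitrary; the fuzzer reads every file in the directory.
fn write_seed(corpus_dir: &Path) -> io::Result<PathBuf> {
    fs::create_dir_all(corpus_dir)?;
    let path = corpus_dir.join("seed_int_atom");
    fs::write(&path, int_atom_message(1))?;
    Ok(path)
}
```

Calling `write_seed(Path::new("fuzz/corpus/fuzz_q_ipc_decode"))` from the `kdb_codec` directory places the seed where the command above picks it up.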

## Analyzing Crashes

When the fuzzer finds a crash, it saves the input to `fuzz/artifacts/`:

```bash
# Reproduce a crash
cargo +nightly fuzz run fuzz_q_ipc_decode fuzz/artifacts/fuzz_q_ipc_decode/crash-xyz
```
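
Once a crash is fixed, its artifact is worth keeping as a regression input. A minimal sketch of a replay helper follows, assuming the decoder can be called through a closure that takes a byte slice. `replay_artifacts` is a hypothetical helper, not part of kdb_codec or cargo-fuzz. It only catches unwinding panics, so aborts, stack overflows, and sanitizer findings still need the `cargo fuzz run` reproduction above.

```rust
use std::fs;
use std::io;
use std::panic::{self, AssertUnwindSafe};
use std::path::{Path, PathBuf};

/// Runs `decode` on every file in `dir` and returns the paths whose contents
/// made it panic, sorted for stable output.
fn replay_artifacts<F>(dir: &Path, decode: F) -> io::Result<Vec<PathBuf>>
where
    F: Fn(&[u8]),
{
    let mut failures = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if !path.is_file() {
            continue;
        }
        let bytes = fs::read(&path)?;
        // catch_unwind turns a panic into an Err so the loop can keep going.
        let outcome = panic::catch_unwind(AssertUnwindSafe(|| decode(&bytes[..])));
        if outcome.is_err() {
            failures.push(path);
        }
    }
    failures.sort();
    Ok(failures)
}
```

In a regression test, the closure would call the crate's decode entry point, and the test would assert that the returned list is empty.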

## Minimizing Crash Cases

Reduce a crashing input to its minimal form:

```bash
cargo +nightly fuzz tmin fuzz_q_ipc_decode fuzz/artifacts/fuzz_q_ipc_decode/crash-xyz
```

## Coverage Report

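Generate coverage data to see which code paths the current corpus exercises. The command produces raw profile data rather than a rendered report; a tool such as `llvm-cov` can turn it into one: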

```bash
cargo +nightly fuzz coverage fuzz_q_ipc_decode
```

## Continuous Fuzzing

For continuous integration, run fuzzing for a fixed time:

```bash
#!/bin/bash
# fuzz.sh - Run all fuzzers for 10 minutes each
set -euo pipefail  # Stop at the first failing fuzzer so CI reports it

FUZZ_TIME=600  # 10 minutes

echo "Fuzzing q_ipc_decode..."
cargo +nightly fuzz run fuzz_q_ipc_decode -- -max_total_time="$FUZZ_TIME"

echo "Fuzzing decompress..."
cargo +nightly fuzz run fuzz_decompress -- -max_total_time="$FUZZ_TIME"

echo "Fuzzing codec decode..."
cargo +nightly fuzz run fuzz_codec_decode -- -max_total_time="$FUZZ_TIME"

echo "Fuzzing complete!"
```

## Expected Issues (Before Fixes)

Based on the security review, fuzzing is likely to find:

1. **Panics in `decompress_sync`**
   - Invalid compressed data
   - Size field < 8
   - Out-of-bounds reads

2. **Panics in deserialization**
   - Missing null terminators in symbols
   - Invalid UTF-8 sequences
   - Stack overflow from deep nesting

3. **Memory exhaustion**
   - Large size fields causing huge allocations
   - Decompression bombs

4. **Out-of-bounds access**
   - Invalid back-references in decompression
   - Buffer overruns in list construction

## After Implementing Fixes

After fixing the security issues, the following should hold:
- Malformed input returns an error instead of panicking
- Memory use is bounded by enforced size limits
- Every buffer access is bounds-checked

At that point, run fuzzing for longer periods (hours or days) in CI.
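
As an illustration of returning errors instead of overflowing the stack on deep nesting, the sketch below threads a depth counter through a recursive walk. It uses a toy format invented for this example, not the kdb+ wire format: byte `0x00` starts a list, followed by a one-byte item count and that many values, and any other byte is a one-byte atom. `MAX_DEPTH`, `NestError`, and `skip_value` are hypothetical names, and the limit of 64 is arbitrary.

```rust
/// Hypothetical nesting cap for this sketch; a real limit would be chosen by the library.
const MAX_DEPTH: usize = 64;

/// Hypothetical error type for this sketch (not part of kdb_codec).
#[derive(Debug, PartialEq)]
enum NestError {
    Truncated,
    TooDeep,
}

/// Walks one value of the toy format and returns the number of bytes it occupies.
/// Toy format (an assumption): 0x00 = list, then a 1-byte count, then the items;
/// any other byte is an atom occupying a single byte.
fn skip_value(buf: &[u8], depth: usize) -> Result<usize, NestError> {
    // Bail out before recursing further than the cap allows.
    if depth > MAX_DEPTH {
        return Err(NestError::TooDeep);
    }
    let tag = *buf.first().ok_or(NestError::Truncated)?;
    if tag != 0x00 {
        return Ok(1);
    }
    let count = *buf.get(1).ok_or(NestError::Truncated)? as usize;
    let mut used = 2;
    for _ in 0..count {
        // `used` never exceeds `buf.len()`, so this slice cannot panic.
        used += skip_value(&buf[used..], depth + 1)?;
    }
    Ok(used)
}
```

An input made of 100 nested single-item lists returns `Err(NestError::TooDeep)` rather than growing the stack with attacker-controlled depth.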

## Integration with CI

Add to `.github/workflows/fuzzing.yml`:

```yaml
name: Fuzzing

on:
  schedule:
    - cron: '0 2 * * *'  # Run nightly
  workflow_dispatch:

jobs:
  fuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@nightly
      - name: Install cargo-fuzz
        run: cargo install cargo-fuzz
      - name: Run fuzzing
        run: |
          cd kdb_codec
          cargo +nightly fuzz run fuzz_q_ipc_decode -- -max_total_time=3600
      - name: Upload artifacts
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: fuzz-artifacts
          path: kdb_codec/fuzz/artifacts/
```

## Resources

- [cargo-fuzz book](https://rust-fuzz.github.io/book/cargo-fuzz.html)
- [libFuzzer documentation](https://llvm.org/docs/LibFuzzer.html)
- [Rust Fuzz Trophy Case](https://github.com/rust-fuzz/trophy-case)