protocrap 0.3.0

A small, efficient, and flexible protobuf implementation
# Contributing to Protocrap

Thanks for your interest in contributing! This document covers the essentials for getting started.

## Development Setup

See [CLAUDE.md](CLAUDE.md) for build commands, architecture overview, and code structure.

**Quick start:**
```bash
# Build and test with Bazel (preferred)
./bazelisk.sh build //...
./bazelisk.sh test //...

# Or with Cargo
cargo build --features codegen
cargo test
```

## Why Bazel?

Cargo works fine for building and basic testing, but Bazel is preferred for full development because:

1. **Conformance tests** - These require building the official protobuf conformance runner (C++), which Bazel handles seamlessly
2. **Bootstrap verification** - Bazel builds two codegen versions (local and crates.io) and verifies identical output
3. **Proto compilation** - Bazel integrates protoc for generating test descriptors
4. **Hermetic builds** - Consistent builds across machines

Don't worry if you're new to Bazel - `./bazelisk.sh` downloads the right version automatically.

## Running Tests

Before submitting a PR, ensure:

1. **Conformance tests pass**: `bazel test //conformance:conformance_test`
   - Expected: 2685 passes, 102 expected failures
2. **Unit tests pass**: `cargo test`
3. **Code compiles for no_std**: `cargo build -p no-std-test --target thumbv7m-none-eabi`

## Code Generation and Bootstrap

Protocrap is self-hosting: `src/descriptor.pc.rs` is generated by protocrap-codegen, which itself depends on protocrap.

### Two Dependencies

The codegen depends on protocrap in two ways:

1. **API dependency** - Codegen uses protocrap as a library (parsing descriptors, reflection, etc.). This is like any other code using protocrap.

2. **Table format agreement** - Codegen produces static tables, and the protocrap lib interprets them at runtime. Both sides must agree on the table format. This is a bidirectional logical contract, not a code dependency.
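
To make the second point concrete, here is a minimal sketch of a table contract. Every name in it (`FieldEntry`, `KIND_VARINT`, `PERSON_TABLE`, `lookup`, `kind_name`) is hypothetical and chosen for illustration; protocrap's real table layout is not shown in this document and will differ.

```rust
/// Hypothetical table entry. This is NOT protocrap's real layout; it only
/// illustrates that both sides must share one definition of the format.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct FieldEntry {
    pub number: u16, // protobuf field number
    pub offset: u16, // byte offset of the field inside the message struct
    pub kind: u8,    // how the runtime should decode the field
}

// Kind codes are part of the contract: codegen writes them, the runtime reads them.
pub const KIND_VARINT: u8 = 0;
pub const KIND_BYTES: u8 = 1;

// "Codegen" side: the kind of static table a generated file might contain.
pub static PERSON_TABLE: [FieldEntry; 2] = [
    FieldEntry { number: 1, offset: 0, kind: KIND_BYTES },
    FieldEntry { number: 2, offset: 8, kind: KIND_VARINT },
];

// "Runtime" side: looks up the entry for a field number while parsing.
pub fn lookup(table: &[FieldEntry], number: u16) -> Option<&FieldEntry> {
    table.iter().find(|entry| entry.number == number)
}

// "Runtime" side: interprets the kind code that codegen wrote.
pub fn kind_name(entry: &FieldEntry) -> &'static str {
    match entry.kind {
        KIND_VARINT => "varint",
        KIND_BYTES => "bytes",
        _ => "unknown",
    }
}
```

If one side renumbered the kind codes or reordered the struct members without the other, the code would still compile but the runtime would misread every table. That is why the agreement is a logical contract and not something the type checker enforces across the two crates.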

### What Bootstrap Tests

The bootstrap build uses the **published protocrap from crates.io** to build codegen. The `bootstrap_matches_test` verifies that this produces output identical to the local build. In effect, it is a partial check that the current protocrap API remains compatible with the most recently published version.

```bash
# Regenerate descriptor.pc.rs (uses crates.io protocrap)
bazel run //:regen_descriptor

# Verify bootstrap and local codegen match
bazel test //:bootstrap_matches_test
```
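
What "identical output" means can be sketched as a strict text comparison of the two generated files. The helper below is hypothetical and not part of the repository; the real `bootstrap_matches_test` is a Bazel target whose implementation is not shown here.

```rust
/// Returns the 1-based line number of the first difference between two
/// generated files, or `None` if they are identical.
///
/// Splitting on '\n' (instead of using `str::lines`) keeps a missing
/// trailing newline visible as a difference.
pub fn first_mismatch(local: &str, bootstrap: &str) -> Option<usize> {
    let mut local_lines = local.split('\n');
    let mut bootstrap_lines = bootstrap.split('\n');
    let mut line_no = 1;
    loop {
        match (local_lines.next(), bootstrap_lines.next()) {
            // Both inputs ended at the same point: no difference found.
            (None, None) => return None,
            // Same line on both sides: keep going.
            (a, b) if a == b => line_no += 1,
            // Different content, or one input ended early.
            _ => return Some(line_no),
        }
    }
}
```

Reporting the first differing line, and not only a pass/fail result, makes a failed comparison easier to trace back to the codegen change that caused it.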

### Three Scenarios

**1. descriptor.proto is updated**

Generated code embeds descriptor data as proto struct literals (not serialized bytes). If descriptor.proto adds new fields, the previously generated types have no struct members for them, so those types cannot describe the new schema. To avoid this, the codegen uses `DescriptorPool` (dynamic reflection) to construct descriptor data, so it works with any descriptor.proto version.

→ Update the protobuf bazel dependency, run `bazel run //:regen_descriptor`

**2. Codegen output changes (e.g., adding an accessor method)**

These are changes to generated code that don't affect the table format. They typically need to be backwards compatible so that user code still compiles with the new version. Breaking changes should be rare and reserved for major version updates.

→ Run `bazel run //:regen_descriptor` to update `descriptor.pc.rs`

**3. Table format changes (between codegen and protocrap lib)**

If you change the table format, you need a new `descriptor.pc.rs` with matching tables. The old crates.io protocrap remains API-compatible with codegen, so it can build a new compiler that produces the new table format. This bootstrap compiler generates `descriptor.pc.rs` that works with your new protocrap lib.

→ Run `bazel run //:regen_descriptor` - the bootstrap process handles this automatically

## Intentional Limitations

The following are **by design** and should not be "fixed":

- Unknown fields are discarded (no round-trip preservation)
- Proto2 extensions are silently dropped
- Maps are decoded as repeated key-value pairs
- Up to 127 optional fields per message
- Field numbers limited to 1-2047

See CLAUDE.md for the full list and rationale.
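
The two numeric limits can be expressed as a simple check. The sketch below is hypothetical (`check_limits`, `tag`, and `varint_len` are illustrative names, not protocrap APIs). It also shows one arithmetic observation about the field-number limit: with the standard protobuf tag encoding, field numbers up to 2047 are exactly those whose tag fits in two varint bytes. This is offered as arithmetic only, not as the project's stated rationale, which lives in CLAUDE.md.

```rust
// Limits quoted from the list above.
pub const MAX_FIELD_NUMBER: u32 = 2047;
pub const MAX_OPTIONAL_FIELDS: usize = 127;

/// Standard protobuf tag: field number in the upper bits, wire type in the low 3 bits.
pub fn tag(field_number: u32, wire_type: u32) -> u32 {
    (field_number << 3) | (wire_type & 0x7)
}

/// Number of bytes needed to varint-encode `value` (7 payload bits per byte).
pub fn varint_len(mut value: u32) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Hypothetical validator: rejects a message shape that exceeds either limit.
pub fn check_limits(field_numbers: &[u32], optional_fields: usize) -> Result<(), String> {
    if optional_fields > MAX_OPTIONAL_FIELDS {
        return Err(format!(
            "{} optional fields exceeds the limit of {}",
            optional_fields, MAX_OPTIONAL_FIELDS
        ));
    }
    for &number in field_numbers {
        if number == 0 || number > MAX_FIELD_NUMBER {
            return Err(format!(
                "field number {} is outside 1-{}",
                number, MAX_FIELD_NUMBER
            ));
        }
    }
    Ok(())
}
```

For example, `varint_len(tag(2047, 5))` is 2 while `varint_len(tag(2048, 0))` is 3, so 2047 is the last field number whose tag stays within two bytes.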

## Pull Request Guidelines

1. **Keep changes focused** - one feature or fix per PR
2. **Don't over-engineer** - minimal changes to solve the problem
3. **All tests must pass** - conformance, unit, and no_std build
4. **Update documentation** if changing public API

## Fallible Allocation

All arena operations return `Result`. When adding code that allocates, propagate the error with `?` rather than leaving it unchecked:

```rust
// Correct - propagate errors
let s = String::from_str(value, arena)?;
msg.set_name(value, &mut arena)?;

// Wrong - `s` is an unchecked `Result`, so an allocation failure goes unhandled
let s = String::from_str(value, arena);
```
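
The same propagation pattern, as a self-contained sketch that compiles on its own. `ToyArena`, `AllocError`, `Person`, and `build_person` are hypothetical stand-ins invented for this example; they are not protocrap types, and the real arena API may differ.

```rust
/// Hypothetical stand-in for an arena with a fixed byte budget.
pub struct ToyArena {
    pub remaining: usize,
}

/// Hypothetical error returned when the budget is exhausted.
#[derive(Debug, PartialEq)]
pub struct AllocError;

impl ToyArena {
    /// Copies `s` if the budget allows it; otherwise reports the failure
    /// to the caller instead of panicking.
    pub fn alloc_str(&mut self, s: &str) -> Result<String, AllocError> {
        if s.len() > self.remaining {
            return Err(AllocError);
        }
        self.remaining -= s.len();
        Ok(s.to_owned())
    }
}

pub struct Person {
    pub name: String,
    pub email: String,
}

/// Every allocating call uses `?`, so the first failure aborts construction
/// and reaches the caller as an `Err`.
pub fn build_person(name: &str, email: &str, arena: &mut ToyArena) -> Result<Person, AllocError> {
    let name = arena.alloc_str(name)?;
    let email = arena.alloc_str(email)?;
    Ok(Person { name, email })
}
```

Because `build_person` itself returns `Result`, its callers face the same choice: handle the error or propagate it with `?`. The failure never disappears silently.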

## Questions?

Open an issue for questions about the codebase or proposed changes.