sacp-cbor 0.12.0

# sacp-cbor

Strict validation of canonical CBOR bytes + **zero-copy querying** + **canonical encoding** + **structural patching** (map/array edits) + optional **serde** + optional **SHA-256**.

This crate is intentionally **not** a general-purpose CBOR implementation. It enforces a **small, deterministic CBOR profile** designed for stable hashing, signatures, and safe interop.

---

## What you get

### Core capabilities

- **Validate** that an input is a *single, canonical* CBOR item under a strict profile (`validate_canonical`).
- Wrap validated bytes as `CanonicalCborRef<'a>` for **zero-copy querying** (`at`, `root`, `MapRef`, `ArrayRef`, `CborValueRef`).
- Optionally **decode** into Rust types via serde `from_slice` (`serde` + `alloc`).
- **Encode canonical CBOR** directly (`Encoder`, `MapEncoder`, `ArrayEncoder`) (`alloc`).
- Build canonical bytes with the **fallible** `cbor_bytes!` macro (`alloc`).
- **Patch/edit** canonical bytes without decoding the whole structure (`Editor`) (`alloc`).
- Optional:
  - **serde** conversion utilities (`serde`).
  - **SHA-256** helpers for canonical bytes / canonical-encoded values (`sha2`).

### Design constraints (important)

This crate enforces a strict “canonical profile”:

- **Single item** only (no trailing bytes).
- **Definite-length** only (indefinite lengths forbidden).
- **Map keys must be UTF-8 text strings**, and maps must be in **canonical order** (see below).
- **Integers**
  - “Safe” integers only: `[-(2^53-1), +(2^53-1)]`.
  - Larger magnitude integers must use **CBOR bignum tags**:
    - tag `2` (positive bignum)
    - tag `3` (negative bignum)
  - Bignum magnitudes must be canonical (non-empty, no leading zero) and must be **outside** the safe integer range.
- **Floats**
  - Only **float64** encoding is accepted/emitted.
  - **Negative zero** is forbidden.
  - **NaN** must use a single canonical NaN bit pattern.
- **Simple values**
  - Only `false`, `true`, and `null` are supported (plus float64, encoded under major type 7).
  - Other simple values are rejected.

If you need tags beyond bignums, indefinite lengths, non-text map keys, half/float32 encodings, etc., this crate is the wrong tool.

---

## Feature flags

This crate is `no_std` unless the `std` feature is enabled.

| Feature | Enables | Notes |
|---|---|---|
| `std` | `std::error::Error` for `CborError` | Otherwise `no_std` |
| `alloc` | Owned types + encoding + editor + macros | Required for `CanonicalCbor`, `Encoder`, `Editor`, `cbor_bytes!` |
| `serde` | serde integration (`to_vec`, `from_slice`, etc.) | Requires `alloc` in practice; enables owned decoding via `from_slice` |
| `sha2` | SHA-256 helpers | Uses `sha2` crate |
| `simdutf8` | Faster UTF-8 validation | Optional SIMD validation, same semantics |
| `unsafe` | Unchecked UTF-8 for canonical-trusted reads | Uses `unsafe` only for canonical-validated inputs |

### Recommended dependency configs

**Default Rust (std + alloc):**
```toml
[dependencies]
sacp-cbor = "0.12"
```

**`no_std` + `alloc`:**
```toml
[dependencies]
sacp-cbor = { version = "0.12", default-features = false, features = ["alloc"] }
```

**`no_std` + `alloc` + serde + sha2:**
```toml
[dependencies]
sacp-cbor = { version = "0.12", default-features = false, features = ["alloc", "serde", "sha2"] }
```

> In Rust code the crate name is typically `sacp_cbor` (hyphen becomes underscore).

---

## Canonical profile rules

### Canonical map ordering (text keys only)

Maps must be sorted by the **encoded CBOR bytes of the key**, using:

1. **Encoded length** ascending (shorter encoded key bytes come first)
2. If equal length, **lexicographic** order of the encoded bytes

Because keys are text strings, the encoded key is:

- a text header (1/2/3/5/9 bytes depending on string length), followed by
- UTF-8 bytes of the key

For keys shorter than 24 bytes, the header is always 1 byte, so the order is effectively:

- shorter key first, then
- lexicographic order of UTF-8 bytes

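Note: at key lengths 24, 256, 65536, … the header grows. The encoded length still increases strictly with the key length, so the same shorter-key-first rule applies to keys of any length; only the header bytes differ.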
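
A minimal sketch of this ordering rule, using only the standard library. `encode_text_key` and `canonical_key_cmp` are illustrative names, not part of this crate's API; the crate's internal comparison may be implemented differently.

```rust
use std::cmp::Ordering;

/// Encodes `key` as a definite-length CBOR text string (major type 3).
fn encode_text_key(key: &str) -> Vec<u8> {
    let len = key.len() as u64;
    let mut out = Vec::with_capacity(key.len() + 9);
    match len {
        // Lengths 0..=23 fit in the initial byte.
        0..=23 => out.push(0x60 | len as u8),
        24..=0xff => out.extend_from_slice(&[0x78, len as u8]),
        0x100..=0xffff => {
            out.push(0x79);
            out.extend_from_slice(&(len as u16).to_be_bytes());
        }
        0x1_0000..=0xffff_ffff => {
            out.push(0x7a);
            out.extend_from_slice(&(len as u32).to_be_bytes());
        }
        _ => {
            out.push(0x7b);
            out.extend_from_slice(&len.to_be_bytes());
        }
    }
    out.extend_from_slice(key.as_bytes());
    out
}

/// Orders two keys by encoded length first, then by the encoded bytes.
fn canonical_key_cmp(a: &str, b: &str) -> Ordering {
    let (ea, eb) = (encode_text_key(a), encode_text_key(b));
    ea.len().cmp(&eb.len()).then_with(|| ea.cmp(&eb))
}
```

Sorting `["z", "aa", "b", "a"]` with `canonical_key_cmp` gives `["a", "b", "z", "aa"]`: the one-byte keys come first, and `"aa"` sorts last because its encoding is longer.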

### Safe integer range

The safe integer range is:

- `MIN_SAFE_INTEGER = -(2^53 - 1)`
- `MAX_SAFE_INTEGER = +(2^53 - 1)`

Constants are exported:

- `MAX_SAFE_INTEGER: u64`
- `MAX_SAFE_INTEGER_I64: i64`
- `MIN_SAFE_INTEGER: i64`

Integers outside that range must be encoded as bignum (tag 2 or 3), and bignums are *required* to be outside safe range (i.e., you cannot represent a safe integer using a bignum).
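
The split between plain integers and bignums can be sketched as follows. This is a standalone illustration: `MAX_SAFE`, `IntForm`, `classify`, and `minimal_be_bytes` are hypothetical names that mirror the rule above, not the crate's own types.

```rust
/// Largest magnitude allowed for a plain CBOR integer in this profile: 2^53 - 1.
const MAX_SAFE: i128 = (1 << 53) - 1;

#[derive(Debug, PartialEq)]
enum IntForm {
    /// Inside the safe range: encode as a plain integer (major type 0 or 1).
    Safe(i64),
    /// Outside the safe range: must be encoded as a tag 2 / tag 3 bignum.
    Bignum { negative: bool },
}

/// Decides which of the two forms a value must use.
fn classify(v: i128) -> IntForm {
    if (-MAX_SAFE..=MAX_SAFE).contains(&v) {
        IntForm::Safe(v as i64)
    } else {
        IntForm::Bignum { negative: v < 0 }
    }
}

/// Big-endian bytes of `m` with leading zero bytes stripped, i.e. the
/// "no leading zero" shape required of a bignum magnitude.
/// Zero yields an empty vector, which never reaches a bignum because 0 is safe.
fn minimal_be_bytes(m: u128) -> Vec<u8> {
    let bytes = m.to_be_bytes();
    let first = bytes.iter().position(|&b| b != 0).unwrap_or(bytes.len());
    bytes[first..].to_vec()
}
```

For example, `2^53 - 1` classifies as `Safe`, while `2^53` requires a bignum whose minimal magnitude is the seven bytes `20 00 00 00 00 00 00`.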

### Float64 rules

- Only float64 encoding is allowed.
- `-0.0` is rejected.
- NaN must be canonicalized.
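
A rough sketch of these float rules on raw `f64` bits. The canonical NaN pattern is not stated in this README, so `CANONICAL_NAN_BITS` below (quiet NaN with a zero payload) is an assumption, and `canonical_f64_bits` is a hypothetical helper rather than the crate's `F64Bits` API.

```rust
/// ASSUMPTION: quiet NaN with zero payload. The crate's actual canonical
/// NaN bit pattern is not documented here and may differ.
const CANONICAL_NAN_BITS: u64 = 0x7ff8_0000_0000_0000;

/// Bit pattern of -0.0 (only the sign bit set).
const NEGATIVE_ZERO_BITS: u64 = 0x8000_0000_0000_0000;

/// Returns the bits to encode for `v`, or an error for negative zero.
fn canonical_f64_bits(v: f64) -> Result<u64, &'static str> {
    if v.is_nan() {
        // Every NaN, whatever its sign or payload, collapses to one pattern.
        return Ok(CANONICAL_NAN_BITS);
    }
    let bits = v.to_bits();
    if bits == NEGATIVE_ZERO_BITS {
        return Err("negative zero is forbidden");
    }
    Ok(bits)
}
```

Comparing bits rather than values matters here: `-0.0 == 0.0` is `true` for `f64`, so a value comparison cannot tell the two zeros apart.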

---

## Complexity model used in this README

- `n` = input byte length
- `d` = nesting depth
- `m` = number of entries in a map
- `a` = number of items in an array
- `k` = number of query keys in a multi-key operation
- “bytes scanned” means the implementation may need to walk CBOR structure boundaries using a value-end walker; this is proportional to the size of the traversed portion.

Where relevant, time complexity is **worst-case** unless noted.

---

## Quick start

### 1) Validate canonical bytes (no allocation required)

```rust
use sacp_cbor::{validate_canonical, DecodeLimits};

fn main() -> Result<(), sacp_cbor::CborError> {
  let input: &[u8] = &[0xa1, 0x61, 0x61, 0x01]; // canonical encoding of {"a": 1}

  // Choose limits (protects you from deep nesting / huge containers / etc.)
  let limits = DecodeLimits::for_bytes(input.len());

  // Validates: canonical, single item, strict profile
  let canon = validate_canonical(input, limits)?;

  // From here on you can do zero-copy queries:
  println!("validated {} bytes", canon.len());
  Ok(())
}
```

**Complexity**

- Time: `O(n)`
- Space: `O(d)` stack

  - **Without `alloc`**, validation uses a fixed inline stack sized for the default depth; extremely deep inputs can fail even if you raise `max_depth`.

### 2) Zero-copy query into a validated document

```rust
use sacp_cbor::{path, validate_canonical, DecodeLimits};

fn main() -> Result<(), sacp_cbor::CborError> {
  // Canonical encoding of {"user": {"id": 7}}
  let bytes: &[u8] = &[0xa1, 0x64, b'u', b's', b'e', b'r', 0xa1, 0x62, b'i', b'd', 0x07];

  let canon = validate_canonical(bytes, DecodeLimits::for_bytes(bytes.len()))?;

  // Navigate: root -> ["user"] -> ["id"]
  if let Some(id_ref) = canon.at(path!("user", "id"))? {
    let id = id_ref.integer()?.as_i64(); // Option<i64>, None if big integer
    println!("user.id: {id:?}");
  }

  Ok(())
}
```

**Complexity**

- `at(path)` time is proportional to what must be scanned in maps/arrays along the path:

  - Worst-case: `O(bytes scanned)`, which approaches `O(n)` for pathological paths
  - Typical: far less, because lookups in shallow maps exit early
- Space: `O(1)`

---

## Limits and safety

### `DecodeLimits`

`DecodeLimits` is a public struct you pass to validation and decoding:

```rust
pub struct DecodeLimits {
  pub max_input_bytes: usize,
  pub max_depth: usize,
  pub max_total_items: usize,
  pub max_array_len: usize,
  pub max_map_len: usize,
  pub max_bytes_len: usize,
  pub max_text_len: usize,
}
```

Use `DecodeLimits::for_bytes(max_message_bytes)` for a reasonable baseline:

- `max_depth = 256`
- `max_total_items = max_message_bytes`
- `max_array_len/max_map_len = min(max_message_bytes, 1<<16)`
- `max_bytes_len/max_text_len = max_message_bytes`

**Why limits matter**

- Prevents “CBOR bombs” (huge containers, deeply nested data).
- Controls worst-case time and memory for validation and decoding.
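
The baseline derivation listed above can be mirrored in a few lines. `SketchLimits` and `baseline_for_bytes` are stand-ins written for this illustration; the sketch leaves out `max_input_bytes` because its baseline value is not listed above.

```rust
/// Local mirror of the derived limit fields (not the crate's `DecodeLimits`).
#[derive(Debug, Clone, Copy, PartialEq)]
struct SketchLimits {
    max_depth: usize,
    max_total_items: usize,
    max_array_len: usize,
    max_map_len: usize,
    max_bytes_len: usize,
    max_text_len: usize,
}

/// Derives the documented baseline from a message size budget.
fn baseline_for_bytes(max_message_bytes: usize) -> SketchLimits {
    // Containers are capped at 65536 entries, or fewer for small budgets.
    let container_cap = max_message_bytes.min(1 << 16);
    SketchLimits {
        max_depth: 256,
        max_total_items: max_message_bytes,
        max_array_len: container_cap,
        max_map_len: container_cap,
        max_bytes_len: max_message_bytes,
        max_text_len: max_message_bytes,
    }
}
```

With a 1 MB budget the container caps are 65536; with a 100-byte budget they shrink to 100.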

### `CborLimits`

If you need two distinct policies (e.g., “message” vs “state”):

```rust
use sacp_cbor::CborLimits;

let limits = CborLimits::new(1_000_000, 16_384)?;
let msg = limits.message_limits();
let state = limits.state_limits();
```

---

## Zero-copy query API

All query APIs operate on **validated canonical bytes** (via `CanonicalCborRef` / `CanonicalCbor`) and return lightweight views (`CborValueRef`) into the underlying buffer.

### `CanonicalCborRef<'a>`

How you obtain it:

- returned by `validate_canonical(&[u8], DecodeLimits)`

Key methods:

- `as_bytes() -> &'a [u8]` (`O(1)`)
- `len() -> usize` (`O(1)`)
- `is_empty() -> bool` (`O(1)`)
- `bytes_eq(other) -> bool` (`O(n)` compare)
- `root() -> CborValueRef<'a>` (`O(1)`)
- `at(path: &[PathElem]) -> Result<Option<CborValueRef>, CborError>`

  - Time: `O(bytes scanned)`
  - Space: `O(1)`

Optional:

- `sha256() -> [u8; 32]` (`sha2`) — `O(n)`
- `to_owned() -> Result<CanonicalCbor, CborError>` (`alloc`) — `O(n)` copy + alloc
- `editor()/edit(...)` (`alloc`) — see “Editing”

### `CanonicalCbor` (owned, `alloc`)

How you obtain it:

- `CanonicalCbor::from_slice(bytes, limits)` validates + copies
- or from an `Encoder` (`into_canonical()`)
- or from an `Editor::apply()`

Key methods:

- `as_bytes() -> &[u8]` (`O(1)`)
- `into_bytes() -> Vec<u8>` (`O(1)` move)
- `bytes_eq(&other) -> bool` (`O(n)`)
- `root()/at(...)` same as `CanonicalCborRef`
- `sha256()` (`sha2`) — `O(n)`
- `edit(...)` (`alloc`) — see “Editing”

### `PathElem` and `path!`

```rust
use sacp_cbor::{PathElem, path};

let p1: &[PathElem] = path!("a", "b", 0, "c"); // keys and indices
let p2: &[PathElem] = &[PathElem::Key("a"), PathElem::Index(0)];
```

- `PathElem::Key(&str)`
- `PathElem::Index(usize)`

**Complexity**

- Path construction is compile-time for literals; runtime cost is trivial.
- Query traversal cost depends on containers traversed.

### `CborValueRef<'a>`

`CborValueRef` is a view into a contiguous CBOR value within a canonical buffer.

Key methods (behavior + complexity):

- `as_bytes() -> &'a [u8]` — `O(1)`
- `offset() -> usize` — `O(1)` (byte offset in the original buffer)
- `len() -> usize` — `O(1)`
- `is_empty() -> bool` — `O(1)`

Type/category inspection:

- `kind() -> Result<CborKind, CborError>`

  - Time: `O(1)` for header; may read small tag headers
- `is_null() -> bool` — `O(1)`

Container access:

- `map() -> Result<MapRef<'a>, CborError>`

  - Errors: `ExpectedMap` if not a map, or `MalformedCanonical` if corrupt
- `array() -> Result<ArrayRef<'a>, CborError>`

  - Errors: `ExpectedArray`, `MalformedCanonical`
- `get_key(&str) -> Result<Option<CborValueRef>, CborError>` (map lookup)
- `get_index(usize) -> Result<Option<CborValueRef>, CborError>` (array lookup)
- `at(path) -> Result<Option<CborValueRef>, CborError>` (path traversal)

Scalar decoding (zero-copy where possible):

- `integer() -> Result<CborIntegerRef<'a>, CborError>`

  - Returns `Safe(i64)` or `Big(BigIntRef)`
  - Time: `O(1)` + reads magnitude bytes for bigints
  - Errors: `ExpectedInteger`, `MalformedCanonical`
- `text() -> Result<&'a str, CborError>`

  - Time: `O(len)` due to UTF-8 validation
- `bytes() -> Result<&'a [u8], CborError>`

  - Time: `O(1)`
- `bool() -> Result<bool, CborError>` — `O(1)`
- `float64() -> Result<f64, CborError>` — `O(1)`

### `MapRef<'a>`

Obtain via `CborValueRef::map()?`.

Map APIs assume:

- keys are **text**, and
- map is **canonical key-sorted**

Key methods:

- `len()`, `is_empty()` — `O(1)`

Single key lookup:

- `get(key: &str) -> Result<Option<CborValueRef>, CborError>`

  - Time: `O(bytes scanned in map until match or early-exit)`
  - Early-exit: once map key > query key (canonical order), returns `None`
  - Errors: `MalformedCanonical`, or `LengthOverflow` if query key is absurdly large

- `require(key) -> Result<CborValueRef, CborError>`

  - Same as `get`, but returns `MissingKey` if not found
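
The early-exit behaviour can be illustrated on a plain slice of already-sorted entries. This is a simplified model, not the crate's implementation: `canonical_cmp` and `get_with_early_exit` are hypothetical, and the comparison works on raw UTF-8 keys, which gives the same order as comparing encoded keys because equal-length keys share the same header.

```rust
use std::cmp::Ordering;

/// Canonical key order: shorter key first, then bytewise.
fn canonical_cmp(a: &str, b: &str) -> Ordering {
    a.len().cmp(&b.len()).then_with(|| a.as_bytes().cmp(b.as_bytes()))
}

/// Scans canonically sorted `entries` for `query`.
/// Returns the value (if found) and how many entries were inspected.
fn get_with_early_exit<'a, V>(
    entries: &'a [(&'a str, V)],
    query: &str,
) -> (Option<&'a V>, usize) {
    let mut inspected = 0;
    for (key, value) in entries {
        inspected += 1;
        match canonical_cmp(key, query) {
            Ordering::Equal => return (Some(value), inspected),
            // Every later key is larger still, so the query cannot be present.
            Ordering::Greater => return (None, inspected),
            Ordering::Less => {}
        }
    }
    (None, inspected)
}
```

In a map with keys `a`, `b`, `z`, `aa`, a lookup for `c` stops at `z` after inspecting three entries and never touches `aa`.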

Multi-key lookup:

- `get_many_sorted<const N: usize>(keys: [&str; N]) -> Result<[Option<CborValueRef>; N], CborError>`
- `require_many_sorted<const N: usize>(keys: [&str; N]) -> Result<[CborValueRef; N], CborError>`

These functions:

- validate key sizes
- internally sort an index array by canonical key encoding
- scan the map once (merge-like scan)

**Complexity**

- Time: `O(k log k * L + bytes scanned in map)`
  where `k = N`, `L` = average key length used in comparisons.
- Space: `O(k)` (small fixed arrays)
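
A simplified model of the merge-like scan, again on a plain sorted slice. `get_many_sketch` is a hypothetical function written for this README; duplicate or oversized query keys, which the crate may treat as errors, are not modelled.

```rust
use std::cmp::Ordering;

/// Canonical key order: shorter key first, then bytewise.
fn canonical_cmp(a: &str, b: &str) -> Ordering {
    a.len().cmp(&b.len()).then_with(|| a.as_bytes().cmp(b.as_bytes()))
}

/// Looks up all `keys` in one forward pass over canonically sorted `entries`.
/// Results are returned in the caller's original key order.
fn get_many_sketch<'a, V>(
    entries: &'a [(&'a str, V)],
    keys: &[&str],
) -> Vec<Option<&'a V>> {
    // Sort indices, not the keys, so results can be written back in place.
    let mut order: Vec<usize> = (0..keys.len()).collect();
    order.sort_by(|&i, &j| canonical_cmp(keys[i], keys[j]));

    let mut out: Vec<Option<&'a V>> = vec![None; keys.len()];
    let mut pos = 0; // The map cursor only ever moves forward.
    for &slot in &order {
        while pos < entries.len()
            && canonical_cmp(entries[pos].0, keys[slot]) == Ordering::Less
        {
            pos += 1;
        }
        if pos < entries.len() && entries[pos].0 == keys[slot] {
            out[slot] = Some(&entries[pos].1);
        }
    }
    out
}
```

The index sort accounts for the `k log k` term; the single forward cursor is why the map is scanned at most once regardless of `k`.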

Dynamic multi-key lookup (`alloc`):

- `get_many(keys: &[&str]) -> Result<Vec<Option<CborValueRef>>, CborError>`
- `require_many(keys: &[&str]) -> Result<Vec<CborValueRef>, CborError>`
- `get_many_into(keys, out)` (writes into caller-provided slice)

**Complexity**

- Time: `O(k log k * L + bytes scanned in map)`
- Space: `O(k)` for sorting indices (unless you provide your own pre-sorted list and use `extras_sorted` patterns)

Iteration:

- `iter() -> impl Iterator<Item = Result<(&str, CborValueRef), CborError>>`

  - Full iteration: `O(bytes in map)`

Extras (fields not in a set of “used keys”):

- `extras_sorted(used_keys: &[&str]) -> Result<impl Iterator<...>, CborError>`

  - Requires `used_keys` to be **strictly increasing** in canonical key order (validated)
  - Time: `O(bytes in map + k)`
  - Space: `O(1)`

`alloc` helpers:

- `extras_sorted_vec(used_keys) -> Result<Vec<(&str, CborValueRef)>, CborError>`
- `extras_vec(used_keys) -> Result<Vec<(&str, CborValueRef)>, CborError>`

  - `extras_vec` sorts your keys internally (allocates)
  - Time: `O(k log k * L + bytes in map)`
  - Space: `O(k)` + output vec

### `ArrayRef<'a>`

Obtain via `CborValueRef::array()?`.

- `len()`, `is_empty()` — `O(1)`
- `get(index) -> Result<Option<CborValueRef>, CborError>`

  - Time: `O(bytes scanned up to index)` (because it walks item boundaries)
  - Space: `O(1)`
- `iter() -> impl Iterator<Item = Result<CborValueRef, CborError>>`

  - Full iteration: `O(bytes in array)`
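
To make the "walks item boundaries" cost concrete, here is a minimal value-end walker over definite-length CBOR. It is a sketch only: `read_header`, `value_end`, and `array_get` are hypothetical names, the recursion has no depth limit, and none of the canonical-profile checks are performed.

```rust
/// Reads the header at `pos`: (major type, argument, header length in bytes).
fn read_header(buf: &[u8], pos: usize) -> Option<(u8, u64, usize)> {
    let first = *buf.get(pos)?;
    let (major, info) = (first >> 5, first & 0x1f);
    let extra = match info {
        0..=23 => return Some((major, u64::from(info), 1)),
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return None, // reserved or indefinite length
    };
    let arg = buf
        .get(pos + 1..pos + 1 + extra)?
        .iter()
        .fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    Some((major, arg, 1 + extra))
}

/// Returns the offset one past the end of the item starting at `pos`.
fn value_end(buf: &[u8], pos: usize) -> Option<usize> {
    let (major, arg, hdr) = read_header(buf, pos)?;
    let mut end = pos + hdr;
    match major {
        // Integers, simple values, floats: the header is the whole item.
        0 | 1 | 7 => {}
        // Byte/text strings: the payload follows the header.
        2 | 3 => {
            if arg > (buf.len() - end) as u64 {
                return None;
            }
            end += arg as usize;
        }
        // Arrays hold `arg` items, maps hold `arg` key/value pairs.
        4 | 5 => {
            let items = if major == 4 { arg } else { arg.checked_mul(2)? };
            for _ in 0..items {
                end = value_end(buf, end)?;
            }
        }
        // Tags wrap exactly one item.
        _ => {
            end = value_end(buf, end)?;
        }
    }
    Some(end)
}

/// Returns the bytes of element `index` of the array starting at `pos`.
fn array_get(buf: &[u8], pos: usize, index: u64) -> Option<&[u8]> {
    let (major, len, hdr) = read_header(buf, pos)?;
    if major != 4 || index >= len {
        return None;
    }
    let mut start = pos + hdr;
    for _ in 0..index {
        start = value_end(buf, start)?; // skip every earlier element
    }
    let end = value_end(buf, start)?;
    Some(&buf[start..end])
}
```

Reaching element `i` requires skipping all `i` earlier elements byte by byte, which is why `get` costs `O(bytes scanned up to index)` rather than `O(1)`.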

---

### `CborInteger` / `BigInt` / `F64Bits`

- `CborInteger::safe(i64) -> Result<CborInteger, CborError>`
- `CborInteger::big(negative, magnitude: Vec<u8>) -> Result<CborInteger, CborError>`
- `BigInt::new(negative, magnitude: Vec<u8>) -> Result<BigInt, CborError>`

  - magnitude must be canonical and outside safe range
- `F64Bits::new(bits: u64) -> Result<F64Bits, CborError>`
- `F64Bits::try_from_f64(f64) -> Result<F64Bits, CborError>`

  - canonicalizes NaN and rejects -0.0

---

## Canonical encoding API (`alloc`)

If you want to produce canonical CBOR bytes directly, use `Encoder`.

### `Encoder`

Create:

- `Encoder::new()`
- `Encoder::with_capacity(usize)`

Extract:

- `into_vec() -> Vec<u8>` (not wrapped/validated)
- `into_canonical() -> CanonicalCbor` (assumes you used encoder correctly)
- `as_bytes() -> &[u8]` (current buffer)

Write scalars:

- `null()`, `bool(bool)`
- `int(i64) -> Result<(), CborError>` (safe range enforced)
- `bignum(negative, magnitude: &[u8]) -> Result<(), CborError>` (canonical + outside safe range enforced)
- `bytes(&[u8])`, `text(&str)`
- `float(F64Bits)`

Write composites:

- `array(len, |&mut ArrayEncoder| ...)`
- `map(len, |&mut MapEncoder| ...)`

Raw splice:

- `raw_cbor(CanonicalCborRef)` (copies bytes as-is into output)
- `raw_value_ref(CborValueRef)` (copies bytes as-is into output)

**Key rule:** When emitting maps via `Encoder::map`, you must insert entries in **canonical key order** using `MapEncoder::entry`. The encoder enforces this and will error if you violate it.

**Complexity**

- Encoding operations are proportional to the bytes written:

  - Time: `O(output_bytes)`
  - Space: output buffer + small stack
- `map` ordering checks compare encoded key bytes:

  - Additional time: `O(total key bytes)` across all entries

### `MapEncoder::entry`

Signature:

```rust
fn entry<F>(&mut self, key: &str, f: F) -> Result<(), CborError>
where
    F: FnOnce(&mut Encoder) -> Result<(), CborError>;
```

Properties:

- Key must be text (`&str`), always.
- Enforces:

  - **no duplicate keys**
  - **strict canonical order**
- On any error inside the closure `f`, the partially-written entry is rolled back (buffer truncated).

Errors you may see:

- `DuplicateMapKey`
- `NonCanonicalMapOrder`
- `MapLenMismatch` (if you write too many/few entries overall)
- plus anything your closure emits

**Complexity**

- Per entry: `O(key_len + value_bytes)` + ordering compare `O(key_len)`
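
The order check and rollback described above can be sketched with a toy writer. `SketchMapWriter` and `EntryError` are made up for this illustration (error names borrowed from the list above); the writer only handles keys shorter than 24 bytes and lets the closure append raw bytes, whereas the real `MapEncoder` hands the closure an `Encoder`.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq)]
enum EntryError {
    DuplicateMapKey,
    NonCanonicalMapOrder,
    KeyTooLongForSketch,
    Value(&'static str),
}

#[derive(Default)]
struct SketchMapWriter {
    buf: Vec<u8>,
    prev_key: Option<String>,
}

impl SketchMapWriter {
    fn entry<F>(&mut self, key: &str, f: F) -> Result<(), EntryError>
    where
        F: FnOnce(&mut Vec<u8>) -> Result<(), &'static str>,
    {
        if key.len() >= 24 {
            return Err(EntryError::KeyTooLongForSketch);
        }
        // Each key must be strictly greater than the previous one.
        if let Some(prev) = &self.prev_key {
            let ord = prev
                .len()
                .cmp(&key.len())
                .then_with(|| prev.as_bytes().cmp(key.as_bytes()));
            match ord {
                Ordering::Equal => return Err(EntryError::DuplicateMapKey),
                Ordering::Greater => return Err(EntryError::NonCanonicalMapOrder),
                Ordering::Less => {}
            }
        }
        let checkpoint = self.buf.len();
        self.buf.push(0x60 | key.len() as u8); // text header, length < 24
        self.buf.extend_from_slice(key.as_bytes());
        if let Err(e) = f(&mut self.buf) {
            // Roll back the key and any partially written value.
            self.buf.truncate(checkpoint);
            return Err(EntryError::Value(e));
        }
        self.prev_key = Some(key.to_owned());
        Ok(())
    }
}
```

Because the checkpoint is taken before the key is written, a failing closure leaves the buffer exactly as it was, and the previous key is still the reference point for the next ordering check.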

### `ArrayEncoder`

You must write exactly `len` items; otherwise:

- `ArrayLenMismatch`

**Complexity**

- `O(total written bytes)`

---

## Macros (`alloc`)

### `cbor_bytes!` — build canonical bytes directly (fallible)

- Produces `Result<CanonicalCbor, CborError>`
- Uses `Encoder` internally
- Sorts map keys at compile time (no runtime buffering)
- Map keys must be identifiers or string literals

Example (keys can be written in any order):

```rust
use sacp_cbor::cbor_bytes;

let bytes = cbor_bytes!({
    "z": 3,
    "a": 1,
    "b": 2,
})?;
```

Splicing existing canonical fragments (still copied into output, but no decoding/re-encoding):

```rust
use sacp_cbor::{cbor_bytes, validate_canonical, DecodeLimits};

let existing: &[u8] = /* canonical CBOR */;
let canon = validate_canonical(existing, DecodeLimits::for_bytes(existing.len()))?;

let out = cbor_bytes!([canon, 1, 2, 3])?; // array whose first element is the existing item
```

**Complexity**

- Time: `O(output_bytes)`
- Space: output buffer
- Map order enforcement: same as `Encoder`/`MapEncoder`

---

## Editing / patching canonical bytes (`alloc`)

The editor applies a set of mutations to an existing canonical document and emits new canonical bytes.

### High-level semantics

- The input must be canonical (you start from `CanonicalCborRef` or `CanonicalCbor`).
- Operations are specified by a **non-empty path** (`&[PathElem]`).

  - You cannot “replace the root value” via an empty path.
- Map edits can insert/delete keys; arrays support structural edits via splices.
- Array indices in edit paths are interpreted against the **original** array (before edits).
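
The "original indices" rule is easiest to see on a toy array. The model below is a plain-Rust illustration with hypothetical names (`ArrayEdit`, `apply_array_edits`); it covers only replace and delete on a flat array of integers, not the editor's splice machinery.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy)]
enum ArrayEdit {
    Replace(i64),
    Delete,
}

/// Applies edits keyed by index into the ORIGINAL array, in one pass.
fn apply_array_edits(original: &[i64], edits: &BTreeMap<usize, ArrayEdit>) -> Vec<i64> {
    let mut out = Vec::with_capacity(original.len());
    for (index, &item) in original.iter().enumerate() {
        match edits.get(&index) {
            Some(ArrayEdit::Replace(new)) => out.push(*new),
            Some(ArrayEdit::Delete) => {}
            None => out.push(item), // untouched items are copied forward
        }
    }
    out
}
```

Deleting indices 0 and 1 of `[10, 20, 30]` leaves `[30]`. Under sequential semantics, where the second delete would see the already-shifted array, the result would be `[20]` instead.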

### Getting an editor

```rust
use sacp_cbor::{validate_canonical, DecodeLimits, path};

let bytes: &[u8] = /* canonical */;
let canon = validate_canonical(bytes, DecodeLimits::for_bytes(bytes.len()))?;

let edited = canon.edit(|ed| {
    ed.set(path!("user", "name"), "alice")?;
    ed.delete_if_present(path!("legacy"))?;
    Ok(())
})?;
```

Or with owned bytes:

```rust
use sacp_cbor::{CanonicalCbor, DecodeLimits, path};

let owned = CanonicalCbor::from_slice(/*...*/, DecodeLimits::for_bytes(/*...*/))?;
let updated = owned.edit(|ed| {
    ed.replace(path!("counter"), 42i64)?;
    Ok(())
})?;
```

### `EditOptions`

```rust
use sacp_cbor::EditOptions;

ed.options_mut().create_missing_maps = true;
```

- `create_missing_maps: bool`

  - If `true`, missing **map** keys along the path may be created as new (empty or partially filled) maps.
  - This only creates **maps**, not arrays, and only when the editor can prove the needed structure.

### `Editor` operations

All return `Result<(), CborError>`.

Set operations:

- `set(path, value)` → Upsert semantics (arrays: replace element)
- `insert(path, value)` → InsertOnly (maps: error if key exists; arrays: insert before index)
- `replace(path, value)` → ReplaceOnly (maps: error if missing; arrays: replace element)
- `set_raw(path, CborValueRef)` → splice a raw value reference from the source document
- `set_encoded(path, |enc| { ... })` → compute the new value by encoding exactly one CBOR item

Delete operations:

- `delete(path)` → must exist (arrays: index must be in bounds)
- `delete_if_present(path)` → no error if missing (arrays: ignore out-of-bounds)

Array splices:

- `splice(array_path, pos, delete)` → returns a builder to insert values at `pos`
- `push(array_path, value)` / `push_encoded(array_path, |enc| ...)` → append to end

Finalize:

- `apply(self) -> Result<CanonicalCbor, CborError>`

### Supported value types for edits (`EditEncode`)

The editor accepts any `T: EditEncode` for `set/insert/replace`. `EditEncode` is sealed; only the
types listed below are supported.

Implemented out of the box:

- `bool`, `()`
- `&str`, `String`
- `&[u8]`, `Vec<u8>`
- `f32`, `f64`, `F64Bits`
- `i64`, `u64`, `i128`, `u128` (bignum encoding when outside safe range)
- `CanonicalCborRef`, `CanonicalCbor`, `&CanonicalCbor`

**Complexity**

- Converting `T` into an edit value usually means encoding a single CBOR item:

  - Time: `O(encoded_bytes_of_value)`
  - Space: may allocate a `Vec<u8>` for the encoded item unless you pass a `CanonicalCborRef`/`&CanonicalCbor`.

### Editor limitations (must-read)

- **No empty path**: attempting to edit the root directly yields `InvalidQuery`.
- **Array indices are relative to the original array** (before edits).
- **Splice constraints**:

  - Splice delete ranges must be in bounds.
  - Splices must not overlap; overlapping splices or edits inside deleted ranges yield `PatchConflict`.
- **Patch conflicts**:

  - Two operations that overlap (e.g., set `["a"]` and also set `["a","b"]`) yield `PatchConflict`.
- **Missing key semantics in maps**:

  - `replace` on a missing key → `MissingKey`
  - `delete` on a missing key → `MissingKey`
  - `delete_if_present` on missing key → OK
  - nested edits on missing keys:

    - if `create_missing_maps = true`, the editor may create missing maps
    - otherwise → `MissingKey`
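
One way to picture the overlap rule for plain set/replace/delete operations: two paths conflict when one is a prefix of the other (identical paths included). The sketch below encodes that reading with a local `Elem` type and a hypothetical `paths_conflict` helper; it is an interpretation of the rule above, and splice-range conflicts are not modelled.

```rust
/// Local stand-in for a path element (not the crate's `PathElem`).
#[derive(Debug, Clone, PartialEq)]
enum Elem<'a> {
    Key(&'a str),
    Index(usize),
}

/// True when one path is a prefix of the other, i.e. one edit would
/// rewrite a subtree that the other edit also reaches into.
fn paths_conflict(a: &[Elem], b: &[Elem]) -> bool {
    let shared = a.len().min(b.len());
    a[..shared] == b[..shared]
}
```

Under this reading, `["a"]` conflicts with `["a", "b"]`, while the siblings `["a", "b"]` and `["a", "c"]` can be edited in the same batch.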

### Editor performance / complexity

Let:

- `n` = input size in bytes
- `p` = number of patch operations (terminals)
- `u` = number of distinct modified keys within a specific map node

Applying an editor:

- Worst-case time: `O(n + Σ(u log u))`

  - It walks/rewrites the whole document once (`O(n)`)
  - For each patched map, it sorts the modified keys (`O(u log u)`)
- Space:

  - Output buffer: `O(output_bytes)`
  - Patch tree: `O(p)` nodes + key storage
- No full decode of the input is performed; values are copied forward unchanged unless touched.

---

## Serde integration (`serde` + `alloc`)

### Convert Rust types ↔ canonical CBOR bytes

- `to_vec<T: Serialize>(&T) -> Result<Vec<u8>, CborError>`
- `from_slice<T: DeserializeOwned>(bytes, limits) -> Result<T, CborError>`

```rust
use serde::{Serialize, Deserialize};
use sacp_cbor::{to_vec, from_slice, DecodeLimits};

#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct Msg {
  typ: String,
  n: i64,
}

let msg = Msg { typ: "hi".into(), n: 5 };
let bytes = to_vec(&msg)?;

let decoded: Msg = from_slice(&bytes, DecodeLimits::for_bytes(bytes.len()))?;
assert_eq!(decoded, msg);
```

### Borrowed deserialization helpers

- `from_slice_borrowed<T: Deserialize>(bytes, limits) -> Result<T, CborError>`

### Serde limitations (important)

- Map keys must serialize as **text** (`&str`, `String`, `char`, etc.). Non-string keys fail with `MapKeyMustBeText`.
- Integer support via serde is limited to what serde exposes:

  - Very large bignums (more than 128 bits) cannot be losslessly represented through serde numeric primitives.
- Schema mismatches return `ErrorCode::SerdeError` (offset 0); structural parse errors preserve offsets when available.

---

## Hashing (`sha2`)

- `CanonicalCborRef::sha256() -> [u8; 32]`
- `CanonicalCbor::sha256() -> [u8; 32]`

**Complexity**

- Time: `O(n)` for bytes
- Space: `O(1)`

---

## Errors

### `CborError`

```rust
pub struct CborError {
  pub code: ErrorCode,
  pub offset: usize,
}
```

- `code`: machine-readable category
- `offset`: byte position in the input (or 0 for some logical/query errors)

### `ErrorCode` (high-level grouping)

- Limits / structure:

  - `InvalidLimits`, `MessageLenLimitExceeded`, `DepthLimitExceeded`, `TotalItemsLimitExceeded`,
    `ArrayLenLimitExceeded`, `MapLenLimitExceeded`, `BytesLenLimitExceeded`, `TextLenLimitExceeded`
- Canonical encoding violations:

  - `NonCanonicalEncoding`, `IndefiniteLengthForbidden`, `ReservedAdditionalInfo`, `TrailingBytes`
- Map/set rules:

  - `MapKeyMustBeText`, `DuplicateMapKey`, `NonCanonicalMapOrder`, `NonCanonicalSetOrder`
- Integers / tags:

  - `IntegerOutsideSafeRange`, `ForbiddenOrMalformedTag`, `BignumNotCanonical`, `BignumMustBeOutsideSafeRange`
- Floats:

  - `NegativeZeroForbidden`, `NonCanonicalNaN`
- Type expectation errors (query/edit):

  - `ExpectedMap`, `ExpectedArray`, `ExpectedInteger`, `ExpectedText`, `ExpectedBytes`,
    `ExpectedBool`, `ExpectedFloat`
- Editing:

  - `PatchConflict`, `IndexOutOfBounds`, `InvalidQuery`, `MissingKey`
- serde:

  - `SerdeError`
- Catch-alls:

  - `MalformedCanonical`, `UnexpectedEof`, `LengthOverflow`, `AllocationFailed`

---

## Public API index (with properties and complexity)

This section is intentionally exhaustive for day-to-day use. For full signatures, rely on rustdoc.

### Validation & limits

- `validate(bytes, limits) -> Result<(), CborError>`

  - Validates canonical + single item.
  - Time: `O(n)`, Space: `O(d)`

- `validate_canonical(bytes, limits) -> Result<CanonicalCborRef, CborError>`

  - Same as `validate`, but returns a typed wrapper.
  - Time: `O(n)`, Space: `O(d)`

- `DecodeLimits::for_bytes(max_message_bytes) -> DecodeLimits`

  - Convenience baseline limits.

- `CborLimits::new(max_message_bytes, max_state_bytes) -> Result<CborLimits, CborError>`

  - Enforces `max_state_bytes <= max_message_bytes`.

- `CborLimits::{message_limits,state_limits}() -> DecodeLimits`

  - Derives `DecodeLimits` for each budget.

### Typed decode/encode

- `decode(bytes, limits) -> Result<T, CborError>`
- `decode_canonical(canon_ref) -> Result<T, CborError>`
- `decode_canonical_owned(&canon) -> Result<T, CborError>` (`alloc`)
- `encode_to_vec(&value) -> Result<Vec<u8>, CborError>` (`alloc`)
- `encode_to_canonical(&value) -> Result<CanonicalCbor, CborError>` (`alloc`)
- `encode_into(&mut Encoder, &value) -> Result<(), CborError>` (`alloc`)

Common trait coverage for derive-driven models includes:

- fixed byte arrays: `[u8; N]` (CBOR byte strings with exact-length decode checks)
- ordered sets: `BTreeSet<T>` (`alloc`; canonical deterministic order, strict order validation on decode)
- canonical wrappers: `CanonicalCborRef<'a>` and `CanonicalCbor` (`alloc`)

### Bytes wrappers

- `CanonicalCborRef<'a>` (borrowed)

  - `as_bytes/len/is_empty/root` — `O(1)`
  - `bytes_eq` — `O(n)`
  - `at(path)` — `O(bytes scanned)`
  - `sha256` (`sha2`) — `O(n)`
  - `to_owned` (`alloc`) — `O(n)` alloc+copy
  - `editor/edit` (`alloc`) — see editing

- `CanonicalCbor` (`alloc`, owned)

  - `from_slice(bytes, limits)` — validates then copies (`O(n)`)
  - `as_bytes/into_bytes` — `O(1)`
  - query/edit methods same as `CanonicalCborRef`

### Query types

- `PathElem`: `Key(&str)` / `Index(usize)`

- `path!()` macro: builds `&[PathElem]` slice

- `CborValueRef<'a>`

  - scalar reads: mostly `O(1)` (text is `O(len)`)
  - container queries: `O(bytes scanned)`

- `MapRef<'a>`

  - `get/require`: `O(bytes scanned until match/early-exit)`
  - multi-key lookups: `O(k log k * L + bytes scanned)`, where `L` is the average key length
  - iter/extras: `O(bytes in map)` (+ optional key sorting costs)

- `ArrayRef<'a>`

  - `get`: `O(bytes scanned up to index)`
  - `iter`: `O(bytes in array)`

### Encoding (`alloc`)

- `Encoder`

  - streaming canonical CBOR output
  - maps require canonical key order; enforced

- `ArrayEncoder`, `MapEncoder`

  - enforce arity + map canonical ordering

### Editing (`alloc`)

- `Editor`

  - set/insert/replace/delete semantics with conflict detection
  - array indices refer to the original array; cannot edit root via empty path
  - Time: `O(n + Σ(u log u))` worst-case

### Macros (`alloc`)

- `cbor_bytes!` → `Result<CanonicalCbor, CborError>`

  - map keys are sorted at compile time, so they may be written in any order

### Serde (`serde` + `alloc`)

- `to_vec`, `from_slice`, `from_slice_borrowed`
- `from_canonical_bytes_ref`, `from_canonical_bytes` (for already-validated canonical bytes)
- numeric bignums are limited to `i128/u128` roundtrips through serde

---

## When to use what

- **You already have CBOR bytes and need fast reads:**
  `validate_canonical` → `CanonicalCborRef` → `at/get/iter`

- **You need to *emit* canonical CBOR efficiently:**
  `Encoder` / `cbor_bytes!`
  (with `Encoder`, write map entries in canonical key order; `cbor_bytes!` sorts keys for you)

- **You need to patch existing canonical bytes without decoding everything:**
  `CanonicalCborRef::edit` / `CanonicalCbor::edit`

- **You need serde:**
  `to_vec/from_slice` (or `from_slice_borrowed` when you want borrows)
- **You already validated canonical bytes and want struct decode:**
  `from_canonical_bytes_ref` / `from_canonical_bytes`

---

## Notes for maintainers / auditors

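- `unsafe` code is forbidden (`#![forbid(unsafe_code)]`) unless the optional `unsafe` feature is enabled; that feature only skips UTF-8 re-validation on already-validated canonical input.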
- The validator is intentionally strict and rejects many CBOR features by design.
- All offset-bearing errors aim to point at the byte position where the violation is detected (serde conversions generally return offset 0).

---

## Benchmarks

A separate benchmark workspace lives under `benchmarks/` and runs cross-crate CBOR benchmarks
with shared datasets. See `benchmarks/README.md` for setup and usage.