revision 0.25.0

A serialization and deserialization implementation which allows for schema-evolution.
<br>

<!-- <p align="center">
    <a href="https://github.com/surrealdb/revision#gh-dark-mode-only" target="_blank">
        <img width="200" src="/img/white/logo.svg" alt="Revision Logo">
    </a>
    <a href="https://github.com/surrealdb/revision#gh-light-mode-only" target="_blank">
        <img width="200" src="/img/black/logo.svg" alt="Revision Logo">
    </a>
</p> -->

<p align="center">A framework for revision-tolerant serialization and deserialization, with support for schema evolution over time, allowing for easy revisioning of structs and enums for data storage requirements which need to support backwards compatibility, but where the design of the data format evolves over time.</p>

<br>

<p align="center">
    <a href="https://github.com/surrealdb/revision"><img src="https://img.shields.io/badge/status-beta-ff00bb.svg?style=flat-square"></a>
    &nbsp;
    <a href="https://docs.rs/revision/"><img src="https://img.shields.io/docsrs/revision?style=flat-square"></a>
    &nbsp;
    <a href="https://crates.io/crates/revision"><img src="https://img.shields.io/crates/v/revision?style=flat-square"></a>
    &nbsp;
    <a href="https://github.com/surrealdb/revision"><img src="https://img.shields.io/badge/license-Apache_License_2.0-00bfff.svg?style=flat-square"></a>
</p>

## Information

`Revision` is a framework for revision-tolerant serialization and deserialization with support for schema evolution over time. It allows for easy revisioning of structs and enums for data storage requirements which need to support backwards compatibility, but where the design of the data structures evolves over time. Revision enables data that was serialized at older revisions to be seamlessly deserialized and converted into the latest data structures. It uses [bincode](https://crates.io/crates/bincode) for serialization and deserialization.

The `Revisioned` trait is automatically implemented for the following types: `u8`, `u16`, `u32`, `u64`, `u128`, `usize`, `i8`, `i16`, `i32`, `i64`, `i128`, `isize`, `f32`, `f64`, `char`, `String`, `Vec<T>`, arrays of up to 32 elements, `Option<T>`, `Box<T>`, `Bound<T>`, `Wrapping<T>`, `Reverse<T>`, `(A, B)`, `(A, B, C)`, `(A, B, C, D)`, `(A, B, C, D, E)`, `Duration`, `HashMap<K, V>`, `BTreeMap<K, V>`, `HashSet<T>`, `BTreeSet<T>`, `BinaryHeap<T>`, `Result<T, E>`, `Cow<'_, T>`, `Decimal`, `regex::Regex`, `uuid::Uuid`, `chrono::Duration`, `chrono::DateTime<Utc>`, `geo::Point`, `geo::LineString`, `geo::Polygon`, `geo::MultiPoint`, `geo::MultiLineString`, `geo::MultiPolygon`, and `ordered_float::NotNan`.

## Feature Flags

Revision supports the following feature flags:

- **`specialised-vectors`** (default): Enables specialised implementations for certain vector types that provide serialisation and deserialisation performance improvements.
- **`fixed-width-encoding`**: Uses fixed-width encoding for integers instead of variable-length encoding. By default, Revision uses variable-length encoding, which is more space-efficient for small values but adds overhead for large values. With this feature enabled, all integers use their full size (2 bytes for `u16`/`i16`, 4 bytes for `u32`/`i32`, 8 bytes for `u64`/`i64`, 16 bytes for `u128`/`i128`), providing predictable serialization sizes and improved serialisation and deserialisation performance.
- **`skip`** (disabled by default): Enables `SkipRevisioned` / `SkipCheckRevisioned`, `skip_slice` / `skip_check_slice` (plus `skip_reader` / `skip_check_reader` aliases), slice fast paths, and matching derive output (`#[revisioned(..., skip = false)]` opts out per type). Library crates should forward `skip = ["revision/skip"]` and document `features = ["skip"]` for dependents; see **Skipping encoded values** below.

### Integer Encoding Trade-offs

**Variable-length encoding (default)**:
- Small values (0-250) use only 1 byte
- More compact for typical workloads with mostly small values
- Variable serialization size based on value magnitude
- Slight overhead for very large values

**Fixed-width encoding (`fixed-width-encoding` feature)**:
- Predictable, constant serialization size per type
- No branching or size checks during encoding/decoding
- Less compact for small values
- More efficient for workloads with large values
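
The size trade-off can be made concrete with a small sketch. It assumes a bincode-style marker-byte scheme (values up to 250 in a single byte, larger values prefixed by a marker byte); the exact wire format Revision uses is not spelled out in this section, so the marker values and byte order below are illustrative assumptions, and both function names are hypothetical.

```rust
/// Variable-length encoding of a `u64` (illustrative scheme, not necessarily
/// Revision's exact wire format): values 0..=250 take one byte; larger values
/// take a marker byte followed by the smallest little-endian integer that fits.
fn encode_varint_u64(value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    if value <= 250 {
        out.push(value as u8);
    } else if value <= u16::MAX as u64 {
        out.push(251); // assumed marker: a u16 follows
        out.extend_from_slice(&(value as u16).to_le_bytes());
    } else if value <= u32::MAX as u64 {
        out.push(252); // assumed marker: a u32 follows
        out.extend_from_slice(&(value as u32).to_le_bytes());
    } else {
        out.push(253); // assumed marker: a u64 follows
        out.extend_from_slice(&value.to_le_bytes());
    }
    out
}

/// Fixed-width encoding: always the full 8 bytes of the `u64`.
fn encode_fixed_u64(value: u64) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}
```

Under these assumptions the value `7` takes 1 byte as a varint and 8 bytes fixed-width, while `u64::MAX` takes 9 bytes as a varint (marker plus payload) against 8 fixed-width, which is the "slight overhead for very large values" listed above.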

### Benchmarking

To compare variable-length vs fixed-width encoding performance:

```bash
# Benchmark with default variable-length encoding
cargo bench --bench varint_comparison

# Benchmark with fixed-width encoding
cargo bench --bench varint_comparison --features fixed-width-encoding
```

The `varint_comparison` benchmark tests serialization and deserialization performance across different data distributions (small values, large values, and mixed distributions) for all integer types.

## Inspiration

This code takes inspiration from the [Versionize](https://github.com/firecracker-microvm/versionize) library developed for [Amazon Firecracker](https://github.com/firecracker-microvm/firecracker) snapshot-restore development previews.

## Revision in action

```rust
use revision::Error;
use revision::revisioned;

// The test structure is at revision 3.
#[revisioned(revision = 3)]
#[derive(Debug, PartialEq)]
pub struct TestStruct {
    a: u32,
    #[revision(start = 2, end = 3, convert_fn = "convert_b")]
    b: u8,
    #[revision(start = 3)]
    c: u64,
    #[revision(start = 3, default_fn = "default_c")]
    d: String,
}

impl TestStruct {
    // Used to set the default value for a newly added field.
    fn default_c(_revision: u16) -> Result<String, Error> {
        Ok("test_string".to_owned())
    }
    // Used to convert the field from an old revision to the latest revision
    fn convert_b(&mut self, _revision: u16, value: u8) -> Result<(), Error> {
        self.c = value as u64;
        Ok(())
    }
}

// The test structure is at revision 3.
#[revisioned(revision = 3)]
#[derive(Debug, PartialEq)]
pub enum TestEnum {
    #[revision(end = 2, convert_fn = "upgrade_zero")]
    Zero,
    #[revision(end = 2, convert_fn = "upgrade_one")]
    One(u32),
    #[revision(start = 2)]
    Two(u64),
    #[revision(start = 2)]
    Three {
        a: i64,
        #[revision(end = 3, convert_fn = "upgrade_three_b")]
        b: f32,
        #[revision(start = 2)]
        c: rust_decimal::Decimal,
        #[revision(start = 3)]
        d: String,
    },
}

impl TestEnum {
    // Used to convert an old enum variant into a new variant.
    fn upgrade_zero(_: TestEnumZeroFields, _revision: u16) -> Result<TestEnum, Error> {
        Ok(Self::Two(0))
    }
    // Used to convert an old enum variant into a new variant.
    fn upgrade_one(f: TestEnumOneFields, _revision: u16) -> Result<TestEnum, Error> {
        Ok(Self::Two(f.0 as u64))
    }
    // Used to convert the field from an old revision to the latest revision
    fn upgrade_three_b(
        res: &mut TestEnumThreeFields,
        _revision: u16,
        value: f32,
    ) -> Result<(), Error> {
        res.c = value.into();
        Ok(())
    }
}
```

## Skipping encoded values

Use the **`skip`** feature when you handle revisioned bytes but only need to extract certain fields from the binary data, without deserializing full structs or maps into memory.

### Extracting one field from a struct

A `#[revisioned]` struct is laid out as **struct revision (`u16`)**, then **fields in source order**. Read only what you need and call `SkipRevisioned::skip_revisioned` on `&mut reader` for the rest (or use `skip_slice::<T>` to skip a whole nested value in one go when you have a sub-slice).

```rust
use revision::{DeserializeRevisioned, Error, SkipRevisioned, revisioned, to_vec};

#[revisioned(revision = 1)]
struct Row {
    // Large field we do not want to allocate when we only need `id`.
    blob: Vec<u8>,
    id: u64,
}

fn read_row_id_only(mut reader: &[u8]) -> Result<u64, Error> {
    let _struct_revision = u16::deserialize_revisioned(&mut reader)?;
    <Vec<u8> as SkipRevisioned>::skip_revisioned(&mut reader)?;
    u64::deserialize_revisioned(&mut reader)
}

let row = Row {
    blob: vec![1, 2, 3],
    id: 42,
};
let bytes = to_vec(&row).unwrap();
assert_eq!(read_row_id_only(&bytes).unwrap(), 42);
```

### Extracting one entry from a `BTreeMap`

Maps are encoded as **length (`usize`)**, then **key / value** pairs in sorted key order. Typical pattern: deserialize each key, compare, deserialize the value you care about, otherwise skip the value with the appropriate `skip_revisioned` call.

```rust
use revision::{DeserializeRevisioned, Error, SkipRevisioned, revisioned, to_vec};
use std::collections::BTreeMap;

#[revisioned(revision = 1)]
struct Config {
    values: BTreeMap<String, u64>,
}

fn get_u64(mut reader: &[u8], wanted: &str) -> Result<u64, Error> {
    let _struct_revision = u16::deserialize_revisioned(&mut reader)?;
    let n = usize::deserialize_revisioned(&mut reader)?;
    for _ in 0..n {
        let key = String::deserialize_revisioned(&mut reader)?;
        if key == wanted {
            return u64::deserialize_revisioned(&mut reader);
        }
        <u64 as SkipRevisioned>::skip_revisioned(&mut reader)?;
    }
    Err(Error::Deserialize(format!("missing key `{wanted}`")))
}

let cfg = Config {
    values: BTreeMap::from([
        ("noise".into(), 0),
        ("answer".into(), 99),
    ]),
};
let bytes = to_vec(&cfg).unwrap();
assert_eq!(get_u64(&bytes, "answer").unwrap(), 99);
```

For **map values that are themselves `#[revisioned]` enums or structs**, deserialize the discriminant / nested revision as you would when fully deserializing, and call `MyValue::skip_revisioned` on entries you discard (see `benches/skip_mixed_btreemap_nested.rs`).

Use **`skip_check_*`** when you want validation that matches stricter deserialize checks (e.g. UTF-8 for `String`). Disable skip for a type with `#[revisioned(revision = N, skip = false)]`.

## Walking encoded values

`WalkRevisioned` is a higher-level companion to `SkipRevisioned`: it lets a caller progress **element-by-element** through revisioned bytes, deciding per-element whether to **decode**, **skip**, or **walk into** further structure — without rewriting the byte-arithmetic by hand each time. The trait sits between `DeserializeRevisioned` (decode the entire value) and `SkipRevisioned` (consume the whole encoding).

The derive macro emits `WalkRevisioned` for every `#[revisioned(...)]` type by default (controlled by the same flag as `deserialize`). Opt out per type with `#[revisioned(revision = N, walk = false)]`.

For each `#[revisioned(...)]` type the derive emits a per-type walker (`<TypeName>Walker<'r, R>`) with named per-field / per-variant methods. This is in addition to the generic `StructWalker` / `EnumWalker` / `MapWalker` / `SeqWalker` types that hand-written `WalkRevisioned` impls can return.

### Walking a struct

```rust
use revision::{WalkRevisioned, revisioned, to_vec};

#[revisioned(revision = 1)]
struct Row {
    blob: Vec<u8>,
    id: u64,
}

fn read_row_id_only(mut reader: &[u8]) -> Result<u64, revision::Error> {
    let mut walker = Row::walk_revisioned(&mut reader)?;
    walker.skip_blob()?;
    walker.decode_id()
}
```

### Walking a map

`BTreeMap<K, V>` returns a `MapWalker` whose `next_entry` borrows one key/value pair at a time. Decode the key, then decode, skip, or walk the value before moving on:

```rust
use revision::{MapWalker, WalkRevisioned, to_vec};
use std::collections::BTreeMap;

let mut map: BTreeMap<String, u64> = BTreeMap::new();
map.insert("noise".into(), 0);
map.insert("answer".into(), 99);
let bytes = to_vec(&map).unwrap();

let mut reader = bytes.as_slice();
let mut walker: MapWalker<String, u64, _> = <BTreeMap<String, u64>>::walk_revisioned(&mut reader)?;
let mut found = None;
while let Some(mut entry) = walker.next_entry() {
    let k = entry.decode_key()?;
    if k == "answer" {
        found = Some(entry.decode_value()?);
    } else {
        entry.skip_value()?;
    }
}
assert_eq!(found, Some(99));
```

### Walking an enum

For each variant, the derive emits an `into_<variant>` consuming method that descends into the variant's payload (for unit and single-field tuple variants), and a per-revision `walk_revisioned_variant_name(wire_rev, disc)` lookup:

```rust
use revision::{WalkRevisioned, revisioned, to_vec};

#[revisioned(revision = 1)]
#[derive(Debug, PartialEq)]
enum Shape {
    Square(u32),
    Rectangle { w: u32, h: u32 },
    Circle(u32),
}

let bytes = to_vec(&Shape::Circle(7)).unwrap();
let mut reader = bytes.as_slice();
let walker = Shape::walk_revisioned(&mut reader)?;
if walker.is_circle() {
    let inner = walker.into_circle()?;
    let radius = inner.decode()?;
    assert_eq!(radius, 7);
}
```

### Walking across revisions

`WalkRevisioned` honours the same cross-revision contract as `DeserializeRevisioned`: any wire revision in `1..=current` is accepted, and the walker presents the **latest schema** view. The walker repr has up to four arms depending on the type:

- **Wire** (the fast path) is used when the wire revision matches the current schema, and for any older revision of a type that does **not** use `convert_fn`. Per-field methods branch on `wire_rev` against the field's `start` annotation: fields added after the wire revision are synthesised via `Default::default()` (or the user-supplied `default_fn`); no allocations.
- **IndexedBorrowed** (struct walker only) holds a borrowed slice over an `optimised` + `indexed_struct` payload. Per-field methods jump via the offset table in O(1); no allocations.
- **OptimisedBorrowed** (enum walker only) holds a borrowed slice over an `optimised` enum's variant body. Per-variant accessors read directly from the slice.
- **ConvertedOwned** is used when the wire revision differs from the current schema *and* the type has at least one `convert_fn`. The walker internally calls `Self::deserialize_revisioned` (which honours `convert_fn`), re-encodes the result at the current revision into an owned `Vec<u8>`, and then byte-walks those new bytes. The user-facing API is identical; the cost is a single `Vec<u8>` allocation plus the deserialize/serialize roundtrip.

The walker's repr is selected at construction; per-method code paths do not branch beyond a single match on the internal repr.

```rust
use revision::{WalkRevisioned, revisioned, to_vec};

#[revisioned(revision = 1)]
struct ShapeV1 {
    kind: u8,
}

#[revisioned(revision = 2)]
struct Shape {
    kind: u8,
    #[revision(start = 2)]
    flags: u8,
}

let bytes = to_vec(&ShapeV1 { kind: 3 }).unwrap();
let mut r = bytes.as_slice();
let mut walker = Shape::walk_revisioned(&mut r)?;
let kind = walker.decode_kind()?;   // exists at all revisions
let flags = walker.decode_flags()?; // synthesised default at wire rev 1
assert_eq!((kind, flags), (3, 0));
```

### Performance characteristics

| Path | Cost |
| --- | --- |
| Wire rev = current | identical to the current-rev hot path; per-field methods inline |
| Wire rev < current, type without `convert_fn` | one extra branch per field; allocation-free |
| Wire rev < current, type with `convert_fn` | `deserialize + serialize + walk`; rare in practice |

### Zero-copy peeking

When a walker visits a value whose wire format is `usize len || raw bytes` — a string, a `Vec<u8>`, a `PathBuf`, or any newtype wrapping one — the caller usually wants to compare those bytes against a needle, hash them, or stream them somewhere. Decoding the value just to throw the owned `String` / `Vec<u8>` / `Bytes` away is pure overhead.

Two small traits unlock zero-copy peeking on those payloads:

| Trait | Implemented for | Purpose |
| --- | --- | --- |
| [`BorrowedReader`] | `&[u8]`, [`SliceReader`] | A `Read` whose buffer is addressable, so a slice of upcoming bytes can be borrowed without copying. |
| [`LengthPrefixedBytes`] | `String`, `&str`, `Box<str>`, `Arc<str>`, `Cow<'_, str>`, `Vec<u8>`, `Vec<i8>`, `PathBuf`, `bytes::Bytes` (feature-gated), and downstream newtypes | Marker: this type's `SerializeRevisioned` writes exactly `usize len || raw bytes`. Does **not** apply to derived `#[revisioned(...)]` types — they prepend a `u16` revision header. |

When **both** are satisfied, walkers expose the following methods:

| Walker | Method | Reader bound | Element bound |
| --- | --- | --- | --- |
| [`LeafWalker<T>`] | [`with_bytes`] | `BorrowedReader` | `T: LengthPrefixedBytes` |
| [`MapWalker<K, V>`] | [`find_bytes`] | `BorrowedReader` | `K: LengthPrefixedBytes` |
| [`MapEntry<K, V>`] | [`with_key_bytes`] | `BorrowedReader` | `K: LengthPrefixedBytes` |
| [`MapEntry<K, V>`] | [`with_value_bytes`] | `BorrowedReader` | `V: LengthPrefixedBytes` |
| [`SeqItem<T>`] | [`with_bytes`] | `BorrowedReader` | `T: LengthPrefixedBytes` |

[`BorrowedReader`]: crate::BorrowedReader
[`LengthPrefixedBytes`]: crate::LengthPrefixedBytes
[`LeafWalker<T>`]: crate::LeafWalker
[`MapWalker<K, V>`]: crate::MapWalker
[`MapEntry<K, V>`]: crate::MapEntry
[`SeqItem<T>`]: crate::SeqItem
[`with_bytes`]: crate::LeafWalker::with_bytes
[`find_bytes`]: crate::MapWalker::find_bytes
[`with_key_bytes`]: crate::MapEntry::with_key_bytes
[`with_value_bytes`]: crate::MapEntry::with_value_bytes
[`DeserializeRevisioned`]: crate::DeserializeRevisioned
[`SkipRevisioned`]: crate::SkipRevisioned
[`Revisioned`]: crate::Revisioned
[`MapWalker::find`]: crate::MapWalker::find
[`LeafWalker`]: crate::LeafWalker
[`MapWalker`]: crate::MapWalker
[`next_entry`]: crate::MapWalker::next_entry
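
The borrowing idea behind these methods can be sketched without the crate. The snippet below is a minimal illustration, assuming a length prefix that fits in a single byte (the default variable-length encoding stores values 0-250 in one byte). `peek_len_prefixed` is a hypothetical helper and not part of `revision`'s API.

```rust
/// Borrow the payload of a `len || raw bytes` value from a slice-backed
/// reader without copying, and advance the reader past it.
///
/// Assumption: the length prefix is a single byte (lengths 0..=250).
/// Returns `None` on truncated input or a longer (multi-byte) prefix,
/// leaving the reader unchanged in both cases.
fn peek_len_prefixed<'a>(reader: &mut &'a [u8]) -> Option<&'a [u8]> {
    // Copy the shared slice reference out so the result can live for `'a`.
    let data: &'a [u8] = *reader;
    let (&len, rest) = data.split_first()?;
    if len > 250 {
        return None; // multi-byte length prefixes are out of scope here
    }
    let len = len as usize;
    if rest.len() < len {
        return None; // truncated payload
    }
    let (payload, tail) = rest.split_at(len);
    *reader = tail; // advance past the value
    Some(payload)
}
```

The returned slice points into the original buffer, so a caller can compare or hash it without allocating an owned `String` or `Vec<u8>`. A streaming reader cannot offer this because it has no addressable buffer to borrow from.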

#### Worked example: matching a map key by raw bytes

`MapWalker::find_bytes` is the direct analogue of `find`, but the predicate sees the key's wire bytes instead of a decoded `K`:

```rust
use std::collections::BTreeMap;
use revision::{MapWalker, WalkRevisioned, to_vec};

let mut table = BTreeMap::new();
table.insert("alpha".to_string(), 1u32);
table.insert("delta".to_string(), 2);
table.insert("zeta".to_string(), 3);
let bytes = to_vec(&table).unwrap();

let mut r = bytes.as_slice();
let walker: MapWalker<String, u32, _> =
    <BTreeMap<String, u32>>::walk_revisioned(&mut r).unwrap();

// Compare keys as `&[u8]`; no `String` is allocated per visit.
let value = walker
    .find_bytes(|k| k.cmp(b"delta".as_slice()))
    .unwrap()
    .map(|leaf| leaf.decode())
    .transpose()
    .unwrap();

assert_eq!(value, Some(2));
```

#### Worked example: peeking a single key during streaming iteration

`MapEntry::with_key_bytes` is the per-entry counterpart. Use it when iterating with `next_entry` and you want to decide what to do with the value based on the key's bytes:

```rust
use std::collections::BTreeMap;
use revision::{MapWalker, WalkRevisioned, to_vec};

let mut table = BTreeMap::new();
table.insert("alpha".to_string(), 1u32);
table.insert("beta".to_string(), 2);
table.insert("gamma".to_string(), 3);
let bytes = to_vec(&table).unwrap();

let mut r = bytes.as_slice();
let mut walker: MapWalker<String, u32, _> =
    <BTreeMap<String, u32>>::walk_revisioned(&mut r).unwrap();

let mut beta = None;
while let Some(mut entry) = walker.next_entry() {
    let is_target = entry.with_key_bytes(|k| k == b"beta").unwrap();
    if is_target {
        beta = Some(entry.decode_value().unwrap());
    } else {
        entry.skip_value().unwrap();
    }
}
assert_eq!(beta, Some(2));
```

#### Worked example: filtering a map by value bytes

`MapEntry::with_value_bytes` mirrors `with_key_bytes` for the value slot. Useful when the key has already been handled (decoded or skipped) and the caller wants to filter based on the value's raw bytes:

```rust
use std::collections::BTreeMap;
use revision::{MapWalker, WalkRevisioned, to_vec};

let mut table: BTreeMap<String, Vec<u8>> = BTreeMap::new();
table.insert("a".into(), b"first-value".to_vec());
table.insert("b".into(), b"target-value".to_vec());
let bytes = to_vec(&table).unwrap();

let mut r = bytes.as_slice();
let mut walker: MapWalker<String, Vec<u8>, _> =
    <BTreeMap<String, Vec<u8>>>::walk_revisioned(&mut r).unwrap();

let mut hits = 0;
while let Some(mut entry) = walker.next_entry() {
    entry.skip_key().unwrap();
    if entry.with_value_bytes(|raw| raw.starts_with(b"target")).unwrap() {
        hits += 1;
    }
}
assert_eq!(hits, 1);
```

#### Worked example: scanning a sequence of strings

`SeqItem::with_bytes` lets a scan over `Vec<String>` (or any `SeqWalker` whose item type implements `LengthPrefixedBytes`) compare items as raw bytes without paying for a per-item allocation:

```rust
use revision::{SeqWalker, WalkRevisioned, to_vec};

let v = vec!["alpha".to_string(), "beta".into(), "gamma".into()];
let bytes = to_vec(&v).unwrap();

let mut r = bytes.as_slice();
let mut walker: SeqWalker<String, _> =
    <Vec<String>>::walk_revisioned(&mut r).unwrap();

let mut found = false;
while let Some(item) = walker.next_item() {
    if item.with_bytes(|s| s == b"beta").unwrap() {
        found = true;
    }
}
assert!(found);
```

#### When zero-copy peeking does **not** apply

- The reader is a streaming source (`std::fs::File`, `TcpStream`, …). `BorrowedReader` is only implemented for slice-backed readers.
- The element type is a derived `#[revisioned(...)]` type. Its wire format includes a `u16` revision header followed by the body, not bare length-prefixed bytes; use `decode` / `walk` and let the walker read past the header.
- The element is a primitive numeric (`u32`, `f64`, …) or a fixed-size array. There is no length prefix; the wire bytes are the value bytes. Use `decode` directly.

### Limitations

- **Untrusted inputs:** Wire lengths are `usize` length prefixes like everywhere else in `revision`; they bound how much is read, skipped, or materialised. Walkers add **no** extra caps or validation — same trust model as [`DeserializeRevisioned`] / [`SkipRevisioned`].
- **[`MapWalker::find`] / [`find_bytes`]:** On a match you only get a [`LeafWalker`] for that entry's value. The method consumes the [`MapWalker`]; you cannot resume [`next_entry`] on it. Key–value pairs that sort after the match remain on the underlying reader for other callers, not for the same walker instance (by design). Both methods assume **wire visit order matches sorted-map encoding** (as when serialising `BTreeMap`). Using an ordering predicate on bytes produced from unsorted maps (`HashMap` insertion order, …) can match incorrectly or discard the tail under `Ordering::Greater`.
- **[`LengthPrefixedBytes`] on custom types:** The marker must match the type's real `SerializeRevisioned` layout (`usize len || raw bytes`). A wrong impl breaks [`with_bytes`] / [`find_bytes`] and related paths — it is an explicit contract, not something the library can detect (same class of risk as any incorrect [`Revisioned`] impl).

- The derive emits two flavours of nested walk per field. `walk_<field>(&mut self)` borrows the parent walker so the caller can keep reading siblings after the sub-walker is dropped. `into_walk_<field>(self)` consumes the parent and hands the reader to the sub-walker for the original `'r`, trading sibling access for a longer-lived sub-walker. Both error with `Error::Conversion` on the `ConvertedOwned` repr (older revs of `convert_fn`-bearing types); callers that hit that path should `decode_<field>` instead.
- `into_<variant>` is currently emitted for unit variants and single-field tuple variants. Multi-field tuple variants and struct variants are reachable via `discriminant()` + `decode_<field>` on the underlying bytes.
- `Vec<T>` uses `specialised-vectors` bulk encoding for several element types when that Cargo feature is enabled (the default): primitives, `bool`, and — if the optional `uuid` / `rust_decimal` crate features are also enabled — `uuid::Uuid` and `rust_decimal::Decimal` (see `try_specialized!` in `src/implementations/vecs.rs`). `Vec<T>::walk_revisioned` rejects each such `T` with [`Error::Deserialize`] **before** reading the sequence length, leaving the reader unchanged — use [`DeserializeRevisioned`] or [`SkipRevisioned`] instead. With `specialised-vectors` disabled, every `Vec<T>` uses per-element layout and is safe to walk. `HashSet<T>`, `BTreeSet<T>`, `BinaryHeap<T>`, and the `imbl` collections always use per-element framing, so they are walkable regardless of element type.
- [`MapEntry`] methods enforce key/value ordering in every build: calling `decode_value` before `decode_key` / `skip_key`, or repeating `decode_key`, returns [`Error::Deserialize`] without advancing the reader when the check fails before I/O.
- [`SeqItem::walk`], [`MapEntry::walk_value`], and [`StructWalker::walk`] advance counters (`remaining`, `position`) only after `walk_revisioned` succeeds, so a failed nested walk does not desynchronise the parent walker from the byte stream.
- A type using `convert_fn` requires both `serialize = true` and `deserialize = true` for `walk` to be derivable (the default). The derive errors at compile time if `walk = true` is combined with either disabled, since the `ConvertedOwned` cross-revision path needs to deserialize at the wire revision and re-serialize at the current revision. Set `walk = false` on such a type if you don't need walker support.
- `Cow<'_, T>` is treated as opaque by the walker. Its `Walker` is a `LeafWalker<T::Owned>`, so `decode()` returns `T::Owned` (e.g. `String` for `Cow<'_, str>`), not a `Cow`. Use `DeserializeRevisioned` if you need a `Cow` back, or descend through `T::Owned::walk_revisioned` directly.
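
The sorted-order assumption behind `find` / `find_bytes` noted in the limitations above can be illustrated with a self-contained sketch. `find_sorted` is a hypothetical stand-in that scans in-memory pairs rather than wire bytes; it only mirrors the described behaviour of stopping at the first key for which the predicate returns `Ordering::Greater`.

```rust
use std::cmp::Ordering;

/// Scan key/value pairs in visit order with an ordering predicate.
/// `Less` keeps scanning, `Equal` is a match, and `Greater` stops early,
/// because in a sorted encoding no later key can match.
fn find_sorted(
    entries: &[(&[u8], u32)],
    mut pred: impl FnMut(&[u8]) -> Ordering,
) -> Option<u32> {
    for &(key, value) in entries {
        match pred(key) {
            Ordering::Less => continue,
            Ordering::Equal => return Some(value),
            Ordering::Greater => return None, // the tail is discarded
        }
    }
    None
}
```

With keys visited as `alpha`, `delta`, `zeta`, a predicate of `|k| k.cmp("delta".as_bytes())` finds the entry. If the same pairs are visited as `zeta`, `alpha`, `delta`, the first key already compares `Greater` and the scan gives up even though the key is present, which is why an ordering predicate should only be used on bytes produced from a sorted map.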

## Optimised wire format

`revision` 0.23 introduces an opt-in **optimised** wire format that
trades the default varint+sequential layout for a more compact tagged
envelope with O(1) skip and optional O(1)/O(log n) random access.
Types declare which revisions use it via the **history syntax**:

```rust,ignore
#[revisioned(
    revision(1),                            // legacy layout
    revision(2, optimised),                 // tagged envelope
    revision(3, optimised, indexed_struct), // tagged envelope with offset prologue
)]
struct Wide { /* fields */ }
```

Legacy `#[revisioned(revision = N)]` keeps working — it is normalised
internally to `revision(1), revision(2), ..., revision(N)` all-legacy.
The parser distinguishes the two by peeking the next token after the
`revision` keyword (`=` for legacy, `(` for the new function-call
form).

### History semantics

- Revisions are strict-append. Numbers must run `1..=N` with no gaps
  and no duplicates; the parser errors at the call site otherwise.
- Mixing `revision = N` with `revision(N)` on the same type is a
  compile error.
- Encoding-specific attributes (`indexed_struct` and the per-field
  `indexed_map` / `indexed_seq` / `indexed_set` markers) require the
  `optimised` flag on the same revision entry.

### Wire layout (per-entry)

A type's outer envelope still begins with the `u16` revision varint.
Under `optimised` the body that follows is:

```text
struct:  u32_le payload_length || [optional u32_le; field_count] || fields
enum:    u8 tag                || payload per size class
```

The 1-byte enum tag packs the variant id (bits 0..=4) with a size
class (bits 5..=6):

| size class | bits | payload format |
| --- | --- | --- |
| Inline   | `0b00` | (nothing — tag is the whole encoding) |
| Fixed    | `0b01` | static byte count from `#[revision(size = "fixed(N)")]` |
| Varlen   | `0b10` | `u32_le length || body` |
| Reserved | `0b11` | decode error: `InvalidOptimisedTag` |

Every variant of an optimised enum must declare its size class via
`#[revision(size = "inline" | "fixed(N)" | "varlen")]`. Variant id is
the existing `CalcDiscriminant` output validated to fit in 5 bits;
optimised enums may have at most 32 variants alive at any revision.
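
The tag layout above can be sketched as plain bit arithmetic. This is an illustration rather than the crate's code: `SizeClass`, `pack_tag`, and `unpack_tag` are hypothetical names, and bit 7 is simply ignored because the text does not say how it is used.

```rust
/// Size class stored in bits 5..=6 of an optimised enum tag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SizeClass {
    Inline, // 0b00
    Fixed,  // 0b01
    Varlen, // 0b10
}

/// Pack a variant id (bits 0..=4) and a size class (bits 5..=6) into one byte.
/// Returns `None` if the id does not fit in 5 bits.
fn pack_tag(variant_id: u8, class: SizeClass) -> Option<u8> {
    if variant_id > 0b1_1111 {
        return None;
    }
    let class_bits: u8 = match class {
        SizeClass::Inline => 0b00,
        SizeClass::Fixed => 0b01,
        SizeClass::Varlen => 0b10,
    };
    Some(variant_id | (class_bits << 5))
}

/// Split a tag byte back into (variant id, size class).
/// The reserved class `0b11` is rejected, mirroring `InvalidOptimisedTag`.
/// Assumption: bit 7 is ignored in this sketch.
fn unpack_tag(tag: u8) -> Option<(u8, SizeClass)> {
    let class = match (tag >> 5) & 0b11 {
        0b00 => SizeClass::Inline,
        0b01 => SizeClass::Fixed,
        0b10 => SizeClass::Varlen,
        _ => return None,
    };
    Some((tag & 0b1_1111, class))
}
```

Five bits for the variant id is where the limit of 32 live variants comes from: ids 0 through 31 fit, and 32 does not.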

### Indexed prologues

`indexed_struct` prepends `[u32_le; field_count]` to the payload
so a walker can jump to any field in O(1). The encoder buffers fields
into a scratch `Vec<u8>` to learn each field's offset, then emits the
prologue and body in a single pass. Indexed encoding for individual
map/seq/set fields uses the per-field attributes
`#[revision(indexed_map)]` / `#[revision(indexed_seq)]` /
`#[revision(indexed_set)]` instead — the type-level `map = "indexed"`
and `seq = "indexed"` forms are rejected at parse time with a
diagnostic pointing at the per-field variant.

`OFFSET_TABLE_MIN_LEN = 8` is the minimum entry count that triggers
the prologue; below it the encoder falls back to a sequential body
and the walker falls back to a linear scan.
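
The buffer-then-emit approach and the O(1) field jump can be sketched as follows. This is a simplified model, not the crate's encoder: fields are taken as already-serialised byte strings, the minimum-length threshold is ignored, and two details the text does not confirm are assumed and marked in the comments (offsets are measured from the start of the field region, and `payload_length` covers the prologue plus the fields). All names are hypothetical.

```rust
/// Encode field bodies as
/// `u32_le payload_length || [u32_le; field_count] || fields`.
/// Assumptions: offsets are relative to the start of the field region,
/// and `payload_length` counts the prologue plus the fields.
fn encode_indexed(fields: &[&[u8]]) -> Vec<u8> {
    let mut body = Vec::new(); // scratch buffer used to learn each offset
    let mut offsets = Vec::with_capacity(fields.len());
    for field in fields {
        offsets.push(body.len() as u32);
        body.extend_from_slice(field);
    }
    let payload_len = (offsets.len() * 4 + body.len()) as u32;
    let mut out = Vec::with_capacity(4 + payload_len as usize);
    out.extend_from_slice(&payload_len.to_le_bytes());
    for offset in &offsets {
        out.extend_from_slice(&offset.to_le_bytes());
    }
    out.extend_from_slice(&body);
    out
}

/// Read a little-endian `u32` at byte position `at`, if in bounds.
fn read_u32_le(bytes: &[u8], at: usize) -> Option<u32> {
    let raw = bytes.get(at..at + 4)?;
    Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Jump straight to field `index` via the offset table, without scanning
/// the fields that precede it.
fn field_at(encoded: &[u8], field_count: usize, index: usize) -> Option<&[u8]> {
    if index >= field_count {
        return None;
    }
    let payload_len = read_u32_le(encoded, 0)? as usize;
    let payload = encoded.get(4..4 + payload_len)?;
    let body = payload.get(field_count * 4..)?;
    let start = read_u32_le(payload, index * 4)? as usize;
    let end = if index + 1 < field_count {
        read_u32_le(payload, (index + 1) * 4)? as usize
    } else {
        body.len()
    };
    body.get(start..end)
}
```

Looking up a field costs two table reads and one slice operation regardless of how many fields come before it, at the price of 4 bytes of prologue per field.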

### Validation

Indexed compounds validate their prologue eagerly on walker
construction:

- Offsets are strictly monotonic
- Every offset is in-range for the payload
- Indexed-map keys are strictly ascending (byte compare)

Corrupt payloads surface as typed `Error` variants — `InvalidOptimisedTag`,
`OptimisedOffsetOutOfRange { offset, payload_len }`,
`OptimisedOffsetsNonMonotonic`, `OptimisedKeyRegionNotAscending`,
`OptimisedSubReaderOverrun` — never as panics. `Error` is
`#[non_exhaustive]` so future variants do not break exhaustive matches.
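
A minimal sketch of the first two checks, returning typed errors instead of panicking. `OffsetError` and `validate_offsets` are hypothetical names that only mirror the variants listed above, and treating an offset equal to the payload length as out of range is an assumption.

```rust
/// Hypothetical error type mirroring two of the variants named above.
#[derive(Debug, PartialEq, Eq)]
enum OffsetError {
    OutOfRange { offset: u32, payload_len: u32 },
    NonMonotonic,
}

/// Check an offset table eagerly: every offset must lie within the payload,
/// and each offset must be strictly greater than the one before it.
/// Assumption: an offset equal to `payload_len` counts as out of range.
fn validate_offsets(offsets: &[u32], payload_len: u32) -> Result<(), OffsetError> {
    let mut previous: Option<u32> = None;
    for &offset in offsets {
        if offset >= payload_len {
            return Err(OffsetError::OutOfRange { offset, payload_len });
        }
        if let Some(prev) = previous {
            if offset <= prev {
                return Err(OffsetError::NonMonotonic);
            }
        }
        previous = Some(offset);
    }
    Ok(())
}
```

Running the check once at construction means later per-field jumps can index the payload without re-validating each offset.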

### Runtime requirement: BorrowedReader

The indexed walkers (`IndexedStructWalker`, `IndexedMapWalker`,
`IndexedSeqWalker`) borrow from a `&[u8]` payload. To carve that
payload out of a streaming `Read` source they require
`BorrowedReader`. `&[u8]` and `SliceReader` implement it; pure
streaming readers (file, socket) fall through to a materialised path
that allocates.
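To illustrate why a slice-backed reader can hand out borrowed payloads while a pure stream cannot, here is a minimal sketch. The real `BorrowedReader` signature is not shown in this section, so the `CarveSlice` trait below is a hypothetical stand-in, not the crate's API.

```rust
/// Hypothetical stand-in for a "borrowing" reader: it can hand out a
/// sub-slice that lives as long as the underlying buffer (`'a`).
trait CarveSlice<'a> {
    /// Take the next `len` bytes as a borrowed slice and advance past them.
    fn carve(&mut self, len: usize) -> Option<&'a [u8]>;
}

impl<'a> CarveSlice<'a> for &'a [u8] {
    fn carve(&mut self, len: usize) -> Option<&'a [u8]> {
        // Copy the shared reference out so the result keeps lifetime `'a`
        // rather than the shorter lifetime of `&mut self`.
        let whole: &'a [u8] = *self;
        if len > whole.len() {
            return None;
        }
        let (head, tail) = whole.split_at(len);
        *self = tail;
        Some(head)
    }
}
```

A file or socket has no backing buffer with lifetime `'a` to borrow from, which is why such sources need the allocating path described above.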

### Backward compatibility

| scenario | result |
| --- | --- |
| new code reads old rev-N legacy data | ✓ legacy decode arm |
| new code reads new rev-M optimised data | ✓ optimised decode arm |
| mixed legacy/optimised records on disk | ✓ per-record dispatch on embedded `u16` revision |
| old code reads new rev-M optimised data | ✗ fails on unknown revision (compatibility is forward-only; accepted trade-off) |
| in-memory shape across revisions | ✓ every decoder for every revision produces the same shape |
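The per-record dispatch in the table can be pictured as a match on the decoded revision number. A minimal sketch, assuming revision 1 is the legacy layout and revision 2 the optimised one; `Arm`, `UnknownRevision`, `select_arm`, and `select_arm_old` are hypothetical names, and decoding the `u16` itself is left out.

```rust
/// Which decode arm handles a record.
#[derive(Debug, PartialEq, Eq)]
enum Arm {
    Legacy,
    Optimised,
}

/// Hypothetical error for a revision the running code has no arm for.
#[derive(Debug, PartialEq, Eq)]
struct UnknownRevision(u16);

/// New code: one decode arm per known revision.
fn select_arm(revision: u16) -> Result<Arm, UnknownRevision> {
    match revision {
        1 => Ok(Arm::Legacy),
        2 => Ok(Arm::Optimised),
        other => Err(UnknownRevision(other)),
    }
}

/// Old code, built before revision 2 existed: only the legacy arm is known.
fn select_arm_old(revision: u16) -> Result<Arm, UnknownRevision> {
    match revision {
        1 => Ok(Arm::Legacy),
        other => Err(UnknownRevision(other)),
    }
}
```

Mixed records work because the choice is made per record; the last table row holds as long as both arms produce the same in-memory type.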

### Worked example: migrating a struct from legacy to optimised

A type that started life as a single legacy revision and is now being
opted into the optimised encoding for new writes:

```rust,ignore
// Before — single legacy revision:
#[revisioned(revision = 1)]
struct Profile {
    id: u32,
    handle: String,
    bio: String,
}

// After — two revisions, the new one uses optimised:
#[revisioned(
    revision(1),                                        // existing on-disk data
    revision(2, optimised, indexed_struct),
)]
struct Profile {
    id: u32,
    handle: String,
    bio: String,
}
```

What changes:

- Existing rev-1 bytes on disk continue to decode through the
  `revision(1)` arm — the macro normalises both the legacy
  `revision = 1` form and the explicit `revision(1)` form to the same
  internal legacy entry, so no on-disk migration is needed.
- All new writes serialise at rev 2: `u16 2 | u32_le payload_length |
  [u32_le; 3] offset prologue | id | handle | bio`. Reading those new
  bytes is automatic — the macro emits one decode arm per history
  entry.
- A walker constructed from any rev-1 or rev-2 byte stream exposes
  the same per-field methods (`decode_id`, `decode_handle`,
  `decode_bio`). Skip is O(1) on rev-2 (one `u32_le` read + advance)
  regardless of how big `bio` got.
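The O(1) skip on rev-2 data amounts to reading the length prefix and jumping past the payload. A minimal sketch, assuming the input is positioned at the `u32_le payload_length` field (the revision header already consumed); `skip_payload` is a hypothetical helper, not a crate function.

```rust
/// Skip one length-prefixed payload and return the bytes that follow it.
/// Assumption: `input` starts at the `u32_le payload_length` field.
fn skip_payload(input: &[u8]) -> Option<&[u8]> {
    let len_bytes = input.get(..4)?;
    let len = u32::from_le_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;
    // One bounds-checked jump; the payload bytes themselves are never read.
    input.get(4..)?.get(len..)
}
```

A truncated record yields `None` rather than a panic, and the cost does not depend on the payload size.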

### Indexed-map / indexed-seq / indexed-set fields

For `BTreeMap` / `Vec` / `BTreeSet`-shaped fields that benefit from
O(log n) key lookup or random-access metadata on the wire, opt the
field into the indexed encoding via one of the three per-field
attributes:

```rust,ignore
use std::collections::{BTreeMap, BTreeSet};
use revision::prelude::*;

#[revisioned(revision(1, optimised))]
struct Doc {
    id: u32,
    #[revision(indexed_map)]
    fields: BTreeMap<String, Value>,   // walker can binary-search keys
    summary: String,                    // default optimised serialisation
    #[revision(indexed_seq)]
    tags: Vec<String>,                  // offset-table seq
    #[revision(indexed_set)]
    roles: BTreeSet<String>,            // sorted-bytes set; membership via walker
}
```

Each per-field attribute routes through its trait:

| Attribute | Trait | Implemented for |
| --- | --- | --- |
| `indexed_map` | [`IndexedMapEncoded`] | `BTreeMap`, `HashMap`, `imbl::OrdMap`, `imbl::HashMap` |
| `indexed_seq` | [`IndexedSeqEncoded`] | `Vec`, `imbl::Vector` |
| `indexed_set` | [`IndexedSetEncoded`] | `BTreeSet`, `HashSet`, `imbl::OrdSet`, `imbl::HashSet` |

Custom container types can implement the relevant trait to participate.
Hash-based containers (`HashMap`, `HashSet`) sort entries by serialised
key bytes on encode so the wire layout is binary-searchable on read.

At most one of these attributes may be set per field — the macro
errors at compile time if you declare more than one.

[`IndexedMapEncoded`]: crate::optimised::indexed::IndexedMapEncoded
[`IndexedSeqEncoded`]: crate::optimised::indexed::IndexedSeqEncoded
[`IndexedSetEncoded`]: crate::optimised::indexed::IndexedSetEncoded

Hand-rolled `SerializeRevisioned` impls can call the free helpers
directly:

```rust,ignore
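use std::collections::BTreeMap;
use revision::optimised::indexed::{serialize_indexed_map, IndexedMapWalker};

// Writer side: encode an example map into the indexed wire layout.
let mut my_map: BTreeMap<String, u32> = BTreeMap::new();
my_map.insert("alpha".to_string(), 1);
my_map.insert("bravo".to_string(), 2);
let mut bytes = Vec::new();
serialize_indexed_map(&my_map, &mut bytes).unwrap();

// Reader side: binary-search a key without allocating the map.
let w: IndexedMapWalker<String, u32> =
    IndexedMapWalker::from_payload(&bytes).unwrap();
let target = "bravo".as_bytes();
// Outer `unwrap` covers a decode error, inner `unwrap` a missing key.
let value = w.find_value_bytes(|k| k.cmp(target)).unwrap().unwrap();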
```

Note: the encoder sorts entries by their **serialised key bytes** before
writing, which can differ from K-order when the key's
`SerializeRevisioned` emits a length prefix that varies across keys (as
`String` does). Round-trip is preserved because `BTreeMap`'s
`DeserializeRevisioned` re-inserts entries into K-order anyway.
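The difference between byte order and K-order is easy to see with a toy encoding. The sketch below assumes a one-byte length prefix followed by the UTF-8 bytes; the crate's actual `String` encoding may use a different prefix, and `encode_key`, `wire_order`, and `contains` are hypothetical helpers.

```rust
/// Toy key encoding: one length byte followed by the UTF-8 bytes.
/// Assumption: the crate's real `String` encoding may use a different prefix.
fn encode_key(key: &str) -> Vec<u8> {
    assert!(key.len() < 256, "toy encoding only supports short keys");
    let mut out = vec![key.len() as u8];
    out.extend_from_slice(key.as_bytes());
    out
}

/// Sort keys by their encoded bytes, as an indexed-map encoder would.
fn wire_order(keys: &[&str]) -> Vec<Vec<u8>> {
    let mut encoded: Vec<Vec<u8>> = keys.iter().map(|k| encode_key(k)).collect();
    encoded.sort();
    encoded
}

/// Binary-search the sorted wire keys for `key`, comparing encoded bytes.
fn contains(sorted: &[Vec<u8>], key: &str) -> bool {
    let needle = encode_key(key);
    sorted
        .binary_search_by(|probe| probe.as_slice().cmp(needle.as_slice()))
        .is_ok()
}
```

In K-order `"aa"` sorts before `"b"`, but in this toy wire order `"b"` (prefix `1`) comes first because the shorter length byte wins. Binary search still works as long as the probe is compared in the same encoded form the entries were sorted in.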

### Worked example: an enum under the optimised tag

Tag size class tells the codec how to read each variant's payload.
Inline variants are one byte total on the wire; varlen variants
carry a `u32_le` length so skip is O(1).

```rust,ignore
#[revisioned(revision(1, optimised))]
enum Event {
    #[revision(size = "inline")]
    Heartbeat,
    #[revision(size = "fixed(16)")]
    Uuid(uuid::Uuid),                 // exactly 16 bytes on the wire
    #[revision(size = "varlen")]
    Message(String),                  // u32_le length + bytes
}

// Skim variants without materialising the payload:
let event = Event::Message("hello".to_string());
let bytes = revision::to_vec(&event).unwrap();
let mut r: &[u8] = &bytes;
let walker = Event::walk_revisioned(&mut r).unwrap();

if walker.is_heartbeat() {
    // No-op; the tag was 1 byte total.
} else if walker.is_message() {
    // Reads the u32_le length, then the body.
    let text = walker.decode_message().unwrap();
    // ...
}
```

The `decode_<variant>` accessor works on every walker representation
(Wire, OptimisedBorrowed, ConvertedOwned). It is the recommended path
for SurrealDB-style filters that peek at the variant before deciding
whether to decode it fully.

### Limitations (current iteration)

- **Walker on optimised enums** exposes `discriminant()`,
  `is_<variant>()`, and `decode_<variant>(self)` directly. The
  consuming `into_<variant>` accessor, which returns a borrowed
  sub-walker, returns an error on the `OptimisedBorrowed` and
  `ConvertedOwned` paths because of the `Walker<'r, R>` GAT lifetime
  trap. Use `<variant>_view(self) -> VariantView<'r, T>` instead to get
  the variant payload bytes (borrowed from the source in the common
  `OptimisedBorrowed` case), then construct your own walker from
  `view.as_bytes()` if needed.
- The type-level `map = "indexed"` / `seq = "indexed"` attributes are
  rejected at parse time — they're impossible to implement soundly
  without specialisation (the macro can't tell `BTreeMap` from any
  other field type). Use the per-field `#[revision(indexed_map)]` /
  `#[revision(indexed_seq)]` / `#[revision(indexed_set)]` attributes
  instead. They work today.
- `fixed(N)` requires the variant body to serialise to exactly `N`
  bytes under `SerializeRevisioned`. Use `[u8; N]`, `Uuid`,
  fixed-width primitives under `fixed-width-encoding`, and similar
  types. Varint-encoded primitives have variable length and won't
  match. The macro emits a `debug_assert_eq!` in the encode arm to
  catch declared-vs-actual size mismatches; like any debug assertion,
  that check only runs in builds with debug assertions enabled.
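The `fixed(N)` restriction follows from how variable-length integers behave. The sketch below uses a generic LEB128-style varint as a stand-in (an assumption; the crate's own integer encoding may differ in detail), and `varint` / `check_fixed` are hypothetical helpers that mimic the declared-vs-actual size comparison.

```rust
/// Generic LEB128-style unsigned varint: 7 payload bits per byte, high bit
/// set on every byte except the last. A stand-in, not the crate's encoder.
fn varint(mut value: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Does the serialised body match the byte count declared in `fixed(N)`?
fn check_fixed(body: &[u8], declared: usize) -> bool {
    body.len() == declared
}
```

A `u32` written with `to_le_bytes` is always 4 bytes, so it satisfies `fixed(4)` for every value, whereas the varint form is 1 byte for `1` and 2 bytes for `300` and cannot satisfy any single `N`.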

### Attribute spelling convention

The optimised wire format adds several attributes; they follow two
shapes depending on what they declare:

- **Opt-in flags** are bare keywords because they're booleans —
  presence means "yes", absence means "no". Currently:
  `optimised` and `indexed_struct` at the revision level
  (inside `#[revisioned(revision(N, ...))]`); `indexed_map`,
  `indexed_seq`, `indexed_set`, `fixed`, `specialised` at the
  field level (inside `#[revision(...)]` on a field). Mixing
  two indexed-* markers for one field is a compile error.
- **Parameterised options** use `key = "value"` pairs because the
  value carries information beyond on/off: `size = "inline" |
  "fixed(N)" | "varlen"` on optimised-enum variants picks one
  of three classes (with an embedded byte count for `fixed`);
  `start = N`, `end = N`, `convert_fn = "..."`,
  `default_fn = "..."`, `fields_name = "..."` likewise take a
  parameter.

This split mirrors how Rust's own `#[cfg(...)]` works: `cfg(test)`
is a flag, `cfg(target_os = "linux")` is a configuration value.