zerovec 0.10.1

Zero-copy vector backed by a byte array
Documentation
# zerovec [![crates.io]https://img.shields.io/crates/v/zerovec]https://crates.io/crates/zerovec

<!-- cargo-rdme start -->

Zero-copy vector abstractions for arbitrary types, backed by byte slices.

`zerovec` enables a far wider range of types — beyond just `&[u8]` and `&str` — to participate in
zero-copy deserialization from byte slices. It is `serde` compatible and comes equipped with
proc macros

Clients upgrading to `zerovec` benefit from zero heap allocations when deserializing
read-only data.

This crate has four main types:

- [`ZeroVec<'a, T>`] and [`ZeroSlice<T>`](ZeroSlice) for fixed-width types like `u32`
- [`VarZeroVec<'a, T>`] and [`VarZeroSlice<T>`](ZeroSlice) for variable-width types like `str`
- [`ZeroMap<'a, K, V>`] to map from `K` to `V`
- [`ZeroMap2d<'a, K0, K1, V>`] to map from the pair `(K0, K1)` to `V`

The first two are intended as close-to-drop-in replacements for `Vec<T>` in Serde structs. The third and fourth are
intended as a replacement for `HashMap` or [`LiteMap`](docs.rs/litemap). When used with Serde derives, **be sure to apply
`#[serde(borrow)]` to these types**, same as one would for [`Cow<'a, T>`].

[`ZeroVec<'a, T>`], [`VarZeroVec<'a, T>`], [`ZeroMap<'a, K, V>`], and [`ZeroMap2d<'a, K0, K1, V>`] all behave like
[`Cow<'a, T>`] in that they abstract over either borrowed or owned data. When performing deserialization
from human-readable formats (like `json` and `xml`), typically these types will allocate and fully own their data, whereas if deserializing
from binary formats like `bincode` and `postcard`, these types will borrow data directly from the buffer being deserialized from,
avoiding allocations and only performing validity checks. As such, this crate can be pretty fast (see [below](#Performance) for more information)
on deserialization.

See [the design doc](https://github.com/unicode-org/icu4x/blob/main/utils/zerovec/design_doc.md) for details on how this crate
works under the hood.

## Cargo features

This crate has several optional Cargo features:
 -  `serde`: Allows serializing and deserializing `zerovec`'s abstractions via [`serde`]https://docs.rs/serde
 -   `yoke`: Enables implementations of `Yokeable` from the [`yoke`]https://docs.rs/yoke/ crate, which is also useful
             in situations involving a lot of zero-copy deserialization.
 - `derive`: Makes it easier to use custom types in these collections by providing the `#[make_ule]` and
    `#[make_varule]` proc macros, which generate appropriate [`ULE`]https://docs.rs/zerovec/latest/zerovec/ule/trait.ULE.html and
    [`VarULE`]https://docs.rs/zerovec/latest/zerovec/ule/trait.VarULE.html-conformant types for a given "normal" type.
 - `std`: Enabled `std::Error` implementations for error types. This crate is by default `no_std` with a dependency on `alloc`.

[`ZeroVec<'a, T>`]: ZeroVec
[`VarZeroVec<'a, T>`]: VarZeroVec
[`ZeroMap<'a, K, V>`]: ZeroMap
[`ZeroMap2d<'a, K0, K1, V>`]: ZeroMap2d
[`Cow<'a, T>`]: alloc::borrow::Cow

## Examples

Serialize and deserialize a struct with ZeroVec and VarZeroVec with Bincode:

```rust
use zerovec::{VarZeroVec, ZeroVec};

// This example requires the "serde" feature
#[derive(serde::Serialize, serde::Deserialize)]
pub struct DataStruct<'data> {
    #[serde(borrow)]
    nums: ZeroVec<'data, u32>,
    #[serde(borrow)]
    chars: ZeroVec<'data, char>,
    #[serde(borrow)]
    strs: VarZeroVec<'data, str>,
}

let data = DataStruct {
    nums: ZeroVec::from_slice_or_alloc(&[211, 281, 421, 461]),
    chars: ZeroVec::alloc_from_slice(&['ö', '冇', 'म']),
    strs: VarZeroVec::from(&["hello", "world"]),
};
let bincode_bytes =
    bincode::serialize(&data).expect("Serialization should be successful");
assert_eq!(bincode_bytes.len(), 67);

let deserialized: DataStruct = bincode::deserialize(&bincode_bytes)
    .expect("Deserialization should be successful");
assert_eq!(deserialized.nums.first(), Some(211));
assert_eq!(deserialized.chars.get(1), Some('冇'));
assert_eq!(deserialized.strs.get(1), Some("world"));
// The deserialization will not have allocated anything
assert!(!deserialized.nums.is_owned());
```

Use custom types inside of ZeroVec:

```rust
use zerovec::{ZeroVec, VarZeroVec, ZeroMap};
use std::borrow::Cow;
use zerovec::ule::encode_varule_to_box;

// custom fixed-size ULE type for ZeroVec
#[zerovec::make_ule(DateULE)]
#[derive(Copy, Clone, PartialEq, Eq, Ord, PartialOrd, serde::Serialize, serde::Deserialize)]
struct Date {
    y: u64,
    m: u8,
    d: u8
}

// custom variable sized VarULE type for VarZeroVec
#[zerovec::make_varule(PersonULE)]
#[zerovec::derive(Serialize, Deserialize)] // add Serde impls to PersonULE
#[derive(Clone, PartialEq, Eq, Ord, PartialOrd, serde::Serialize, serde::Deserialize)]
struct Person<'a> {
    birthday: Date,
    favorite_character: char,
    #[serde(borrow)]
    name: Cow<'a, str>,
}

#[derive(serde::Serialize, serde::Deserialize)]
struct Data<'a> {
    #[serde(borrow)]
    important_dates: ZeroVec<'a, Date>,
    // note: VarZeroVec always must reference the ULE type directly
    #[serde(borrow)]
    important_people: VarZeroVec<'a, PersonULE>,
    #[serde(borrow)]
    birthdays_to_people: ZeroMap<'a, Date, PersonULE>
}


let person1 = Person {
    birthday: Date { y: 1990, m: 9, d: 7},
    favorite_character: 'π',
    name: Cow::from("Kate")
};
let person2 = Person {
    birthday: Date { y: 1960, m: 5, d: 25},
    favorite_character: '冇',
    name: Cow::from("Jesse")
};

let important_dates = ZeroVec::alloc_from_slice(&[Date { y: 1943, m: 3, d: 20}, Date { y: 1976, m: 8, d: 2}, Date { y: 1998, m: 2, d: 15}]);
let important_people = VarZeroVec::from(&[&person1, &person2]);
let mut birthdays_to_people: ZeroMap<Date, PersonULE> = ZeroMap::new();
// `.insert_var_v()` is slightly more convenient over `.insert()` for custom ULE types
birthdays_to_people.insert_var_v(&person1.birthday, &person1);
birthdays_to_people.insert_var_v(&person2.birthday, &person2);

let data = Data { important_dates, important_people, birthdays_to_people };

let bincode_bytes = bincode::serialize(&data)
    .expect("Serialization should be successful");
assert_eq!(bincode_bytes.len(), 168);

let deserialized: Data = bincode::deserialize(&bincode_bytes)
    .expect("Deserialization should be successful");

assert_eq!(deserialized.important_dates.get(0).unwrap().y, 1943);
assert_eq!(&deserialized.important_people.get(1).unwrap().name, "Jesse");
assert_eq!(&deserialized.important_people.get(0).unwrap().name, "Kate");
assert_eq!(&deserialized.birthdays_to_people.get(&person1.birthday).unwrap().name, "Kate");

} // feature = serde and derive
```

## Performance

`zerovec` is designed for fast deserialization from byte buffers with zero memory allocations
while minimizing performance regressions for common vector operations.

Benchmark results on x86_64:

| Operation | `Vec<T>` | `zerovec` |
|---|---|---|
| Deserialize vec of 100 `u32` | 233.18 ns | 14.120 ns |
| Compute sum of vec of 100 `u32` (read every element) | 8.7472 ns | 10.775 ns |
| Binary search vec of 1000 `u32` 50 times | 442.80 ns | 472.51 ns |
| Deserialize vec of 100 strings | 7.3740 μs\* | 1.4495 μs |
| Count chars in vec of 100 strings (read every element) | 747.50 ns | 955.28 ns |
| Binary search vec of 500 strings 10 times | 466.09 ns | 790.33 ns |

\* *This result is reported for `Vec<String>`. However, Serde also supports deserializing to the partially-zero-copy `Vec<&str>`; this gives 1.8420 μs, much faster than `Vec<String>` but a bit slower than `zerovec`.*

| Operation | `HashMap<K,V>`  | `LiteMap<K,V>` | `ZeroMap<K,V>` |
|---|---|---|---|
| Deserialize a small map | 2.72 μs | 1.28 μs | 480 ns |
| Deserialize a large map | 50.5 ms | 18.3 ms | 3.74 ms |
| Look up from a small deserialized map | 49 ns | 42 ns | 54 ns |
| Look up from a large deserialized map | 51 ns | 155 ns | 213 ns |

Small = 16 elements, large = 131,072 elements. Maps contain `<String, String>`.

The benches used to generate the above table can be found in the `benches` directory in the project repository.
`zeromap` benches are named by convention, e.g. `zeromap/deserialize/small`, `zeromap/lookup/large`. The type
is appended for baseline comparisons, e.g. `zeromap/lookup/small/hashmap`.

<!-- cargo-rdme end -->

## More Information

For more information on development, authorship, contributing etc. please visit [`ICU4X home page`](https://github.com/unicode-org/icu4x).