# 📦 lencode
[](https://crates.io/crates/lencode)
[](https://docs.rs/lencode)
[](https://github.com/sam0x17/lencode/actions/workflows/ci.yaml)
[](https://github.com/sam0x17/lencode/actions/workflows/big-endian.yml)
[](https://opensource.org/licenses/MIT)
Compact, fast binary encoding with varints, optional deduplication, and opportunistic zstd compression for bytes and strings. `no_std` by default, with an opt‑in `std` feature.
## Highlights
- Fast varints: efficient for small and large integers
- Optional deduplication: replace repeats with compact IDs for supported types
- Bytes/strings compression: flagged header + zstd when smaller; high‑entropy data is detected and skipped automatically
- Bulk encoding: `Vec<T>` of fixed‑size types (e.g. `[u8; 32]`) are encoded/decoded via bulk `memcpy`, not per‑element
- no_std + alloc: works without `std` (uses `zstd-safe`)
- Derive macros: `#[derive(Encode, Decode)]` for your types, `#[derive(Pack)]` for dedupe/bulk types
- Solana support: feature `solana` adds v2/v3 SDK types
- Big-endian ready: CI runs tests on s390x
## Install
```toml
[dependencies]
lencode = "0.1"
# With standard library types (e.g., Cow)
lencode = { version = "0.1", features = ["std"] }
# With Solana type support (implies std)
lencode = { version = "0.1", features = ["solana"] }
```
## Quick start
### Derive and round‑trip
```rust
use lencode::prelude::*;
#[derive(Encode, Decode, PartialEq, Debug)]
struct Point { x: u64, y: u64 }
let p = Point { x: 3, y: 5 };
let mut buf = Vec::new();
encode(&p, &mut buf)?;
let q: Point = decode(&mut Cursor::new(&buf))?;
assert_eq!(p, q);
```
### Collections and primitives
```rust
use lencode::prelude::*;
let values: Vec<u128> = (0..10).collect();
let mut buf = Vec::new();
encode(&values, &mut buf)?;
let rt: Vec<u128> = decode(&mut Cursor::new(&buf))?;
assert_eq!(values, rt);
```
### Deduplication (optional)
To benefit from deduplication for your own types, implement `Pack` and the marker traits and pass the encoder/decoder via `encode_ext`/`decode_ext`.
```rust
use lencode::prelude::*;
#[derive(Clone, Copy, PartialEq, Eq, Debug, Hash)]
struct MyId(u32);
impl Pack for MyId {
fn pack(&self, w: &mut impl Write) -> Result<usize> { self.0.pack(w) }
fn unpack(r: &mut impl Read) -> Result<Self> { Ok(Self(u32::unpack(r)?)) }
}
impl DedupeEncodeable for MyId {}
impl DedupeDecodeable for MyId {}
let vals = vec![MyId(42), MyId(7), MyId(42), MyId(7), MyId(42)];
// Encode with deduplication enabled
let mut enc_ctx = EncoderContext::with_dedupe();
let mut buf = Vec::new();
encode_ext(&vals, &mut buf, Some(&mut enc_ctx))?;
// Decode with deduplication enabled
let mut dec_ctx = DecoderContext::with_dedupe();
let roundtrip: Vec<MyId> = decode_ext(&mut Cursor::new(&buf), Some(&mut dec_ctx))?;
assert_eq!(roundtrip, vals);
```
### Compact bytes and strings
`&[u8]`, `Vec<u8]`, `VecDeque<u8]`, `&str`, and `String` use a compact flagged header: `varint((payload_len << 1) | flag) + payload`.
- `flag = 0` → raw bytes/UTF‑8
- `flag = 1` → zstd frame (original size stored inside the frame)
The encoder picks whichever is smaller per value. High‑entropy data (random bytes, encrypted content) is detected via a fast entropy check and skips compression entirely.
### Bulk encoding for fixed‑size types
`Vec<T>` where `T` has a fixed‑size wire representation (e.g. `[u8; 32]`, or `#[repr(transparent)]` newtypes over byte arrays) is encoded and decoded via bulk `memcpy` rather than per‑element iteration. This is handled automatically through `Encode::encode_slice` / `Decode::decode_vec` and their `Pack` counterparts `Pack::pack_slice` / `Pack::unpack_vec`.
Custom `Pack` types can opt in by overriding `pack_slice` and `unpack_vec`, or by using `#[derive(Pack)]` on a `#[repr(transparent)]` single‑field struct, which generates the bulk overrides automatically:
```rust
use lencode::prelude::*;
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, Pack)]
struct MyPubkey([u8; 32]);
impl DedupeEncodeable for MyPubkey {}
impl DedupeDecodeable for MyPubkey {}
```
### Incremental diff encoding
`DiffEncoder`/`DiffDecoder` provide stateful delta encoding for keyed byte blobs. When the same key is re‑encoded, only the diff is emitted. Two strategies are tried automatically and the smaller output is picked:
- **RLE patches** — run‑length‑encoded list of changed regions (fast, best for sparse changes)
- **XOR + zstd** — XOR old and new blobs, then zstd‑compress the result (best for scattered changes)
Supported byte types: `Vec<u8>`, `&[u8]`, `[u8; N]`, `VecDeque<u8>`. Set a key via `set_key()` on the diff encoder/decoder before each `encode_ext`/`decode_ext` call to opt in.
```rust
use lencode::prelude::*;
use lencode::context::{EncoderContext, DecoderContext};
use lencode::diff::{DiffEncoder, DiffDecoder};
let key = 1u64;
let mut enc_ctx = EncoderContext { dedupe: None, diff: Some(DiffEncoder::new()) };
let mut dec_ctx = DecoderContext { dedupe: None, diff: Some(DiffDecoder::new()) };
// First encode (full blob)
let data1: Vec<u8> = vec![0xAA; 2048];
let mut buf = Vec::new();
enc_ctx.diff.as_mut().unwrap().set_key(key);
dec_ctx.diff.as_mut().unwrap().set_key(key);
data1.encode_ext(&mut buf, Some(&mut enc_ctx)).unwrap();
// Second encode (only the diff is written)
let mut data2 = data1.clone();
data2[100] = 0xFF;
buf.clear();
enc_ctx.diff.as_mut().unwrap().set_key(key);
dec_ctx.diff.as_mut().unwrap().set_key(key);
data2.encode_ext(&mut buf, Some(&mut enc_ctx)).unwrap();
assert!(buf.len() < data2.len() / 2); // diff is much smaller
let mut cursor = Cursor::new(&buf[..]);
let result: Vec<u8> = Vec::decode_ext(&mut cursor, Some(&mut dec_ctx)).unwrap();
assert_eq!(result, data2);
```
### Writer pre‑allocation
The `Write` trait provides a `reserve(additional)` hint. Growable writers like `VecWriter` use this to pre‑allocate capacity before encoding large collections, reducing intermediate reallocations.
## Supported types
- Primitives: all ints, `bool`, `f32`, `f64`
- Arrays: `[T; N]`
- Option: `Option<T>`
- Bytes/strings: `&[u8]`, `Vec<u8]`, `VecDeque<u8]`, `&str`, `String`
- Collections (alloc): `Vec<T>`, `BTreeMap<K,V>`, `BTreeSet<V>`, `VecDeque<T>`, `LinkedList<T>`, `BinaryHeap<T>`
- Tuples: `(T1,)` … up to 11 elements
- `std` feature: adds support for `std::borrow::Cow<'_, T>`
- `solana` feature: `Pubkey`, `Signature`, `Hash`, messages (legacy/v0), and related v2/v3 types
Note: `HashMap`/`HashSet` are not implemented.
## Cargo features
- `default`: core + `no_std` (uses `alloc`)
- `std`: enables `std` adapters and `Cow`
- `solana`: Solana SDK v2 + Agave v3 types (implies `std`)
## Big‑endian and portability
- Varints are decoded efficiently on little‑endian and portably on big‑endian
- CI runs tests on `s390x-unknown-linux-gnu` using `cross`
- `Pack` always uses a stable little‑endian layout
## Benchmarks
```bash
# Full suite
cargo bench --all-features
# Compare against borsh/bincode
cargo bench --bench roundup --features std
# Diff encoder (RLE vs XOR+zstd strategies)
cargo bench --bench diff_bench --features std
# Solana‑specific
cargo bench --bench solana_bench --features solana
```
## Errors
Errors use `lencode::io::Error` and map to `std::io::Error` under `std`.
```rust
use lencode::prelude::*;
use lencode::io::Error;
let mut buf = Vec::new();
match encode(&123u64, &mut buf) {
Ok(n) => eprintln!("wrote {n} bytes"),
Err(Error::WriterOutOfSpace) => eprintln!("buffer too small"),
Err(Error::ReaderOutOfData) => eprintln!("unexpected EOF"),
Err(Error::InvalidData) => eprintln!("corrupted data"),
Err(e) => eprintln!("other error: {e}"),
}
```
## Examples
- `examples/size_comparison.rs`: space savings on repeated Solana pubkeys
- `examples/versioned_tx_compression.rs`: end‑to‑end on Solana versioned transactions
Run with `--features solana`.
## License
MIT