This module enables RLP encoding of high-level objects.
RLP (recursive length prefix) is a common algorithm for encoding variable-length binary data. It is used to encode data before storing it on disk or transmitting it over the network.
§Theory
Encoding
Primary RLP can only deal with the “item” type, which is defined as:
- a byte string, or
- a list of items.
Some examples are:
- byte string: b'\x00\xff'
- empty list: vec![]
- list of bytes: vec![vec![0u8], vec![1u8, 3u8]]
- list of combinations: vec![vec![], vec![0u8], vec![vec![0]]]
The encoded result is always a byte string (sequence of u8).
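The item type can be modelled as a small recursive enum. The sketch below is illustrative only: `Item` and `example_items` are names invented here, not types exported by this module, and the bare `vec![]` in the last example is read as an empty list (it could equally be an empty byte string).

```rust
/// An RLP item: either a byte string or a list of further items.
/// `Item` is a hypothetical name used only in this sketch.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Item {
    Bytes(Vec<u8>),
    List(Vec<Item>),
}

/// Builds the four example items listed above.
pub fn example_items() -> Vec<Item> {
    use Item::{Bytes, List};
    vec![
        // b'\x00\xff'
        Bytes(vec![0x00, 0xff]),
        // empty list
        List(vec![]),
        // list of bytes: [[0], [1, 3]]
        List(vec![Bytes(vec![0]), Bytes(vec![1, 3])]),
        // list of combinations: [[], [0], [[0]]]
        // (assumption: the leading `vec![]` is an empty list)
        List(vec![List(vec![]), Bytes(vec![0]), List(vec![Bytes(vec![0])])]),
    ]
}
```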
Encoding algorithm
Given an item x as input, we define rlp_encode as the following algorithm:
Let concat be a function that joins the given bytes into a single byte sequence.
Where len(x) appears as an argument of concat, it is written as a big-endian integer
without leading zero bytes, and len(len(x)) is the number of bytes this takes.
- If x is a single byte and 0x00 <= x <= 0x7F, then rlp_encode(x) = x.
- Otherwise, if x is a byte string, let len(x) be the length of x in bytes and define the encoding as follows:
  - If 0 <= len(x) < 0x38 (note that the empty byte string fulfills this requirement), then
    rlp_encode(x) = concat(0x80 + len(x), x)
    In this case the first byte is in the range [0x80; 0xB7].
  - If 0x38 <= len(x) <= 0xFFFFFFFF, then
    rlp_encode(x) = concat(0xB7 + len(len(x)), len(x), x)
    In this case the first byte is in the range [0xB8; 0xBF].
  - For longer strings the encoding is undefined.
- Otherwise, if x is a list, let s = concat(map(rlp_encode, x)) be the concatenation of the RLP encodings of all its items.
  - If 0 <= len(s) < 0x38 (note that the empty list matches), then
    rlp_encode(x) = concat(0xC0 + len(s), s)
    In this case the first byte is in the range [0xC0; 0xF7].
  - If 0x38 <= len(s) <= 0xFFFFFFFF, then
    rlp_encode(x) = concat(0xF7 + len(len(s)), len(s), s)
    In this case the first byte is in the range [0xF8; 0xFF].
  - For longer lists the encoding is undefined.
See more in Ethereum wiki.
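The algorithm above translates almost line by line into Rust. This is a minimal standard-library sketch, not the code this crate or fastrlp actually uses; `Item`, `length_bytes`, `prefix` and `rlp_encode` are hypothetical names introduced for illustration.

```rust
/// An RLP item (hypothetical type for this sketch).
pub enum Item {
    Bytes(Vec<u8>),
    List(Vec<Item>),
}

/// Big-endian bytes of `len` with leading zero bytes stripped.
fn length_bytes(len: usize) -> Vec<u8> {
    let be = (len as u64).to_be_bytes();
    let first = be.iter().position(|&b| b != 0).unwrap_or(be.len());
    be[first..].to_vec()
}

/// Prefix for a payload of `len` bytes.
/// `base` is 0x80 for byte strings and 0xC0 for lists.
fn prefix(base: u8, len: usize) -> Vec<u8> {
    // The text leaves longer payloads undefined, so the sketch rejects them.
    assert!(len as u64 <= 0xFFFF_FFFF, "payload too long");
    if len < 0x38 {
        // Short form: the length is folded into the first byte.
        vec![base + len as u8]
    } else {
        // Long form: first byte is 0xB7 or 0xF7 plus the size of the length.
        let len_be = length_bytes(len);
        let mut out = vec![base + 0x37 + len_be.len() as u8];
        out.extend_from_slice(&len_be);
        out
    }
}

/// Encodes an item following the three cases of the algorithm.
pub fn rlp_encode(item: &Item) -> Vec<u8> {
    match item {
        // A single byte in 0x00..=0x7F is its own encoding.
        Item::Bytes(b) if b.len() == 1 && b[0] <= 0x7F => b.clone(),
        Item::Bytes(b) => {
            let mut out = prefix(0x80, b.len());
            out.extend_from_slice(b);
            out
        }
        Item::List(items) => {
            // s = concat(map(rlp_encode, x))
            let s: Vec<u8> = items.iter().flat_map(|i| rlp_encode(i)).collect();
            let mut out = prefix(0xC0, s.len());
            out.extend_from_slice(&s);
            out
        }
    }
}
```

For instance, `rlp_encode(&Item::Bytes(b"foo".to_vec()))` yields the bytes 0x83 0x66 0x6F 0x6F, and a 56-byte string starts with 0xB8 0x38.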
Encoding examples
x | rlp_encode(x) |
---|---|
b'' | 0x80 |
b'\x00' | 0x00 |
b'\x0F' | 0x0F |
b'\x79' | 0x79 |
b'\x80' | 0x81 0x80 |
b'\xFF' | 0x81 0xFF |
b'foo' | 0x83 0x66 0x6F 0x6F |
[] | 0xC0 |
[b'\x0F'] | 0xC1 0x0F |
[b'\xEF'] | 0xC2 0x81 0xEF |
[[], [[]]] | 0xC3 0xC0 0xC1 0xC0 |
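Each row of the table can be read back from its first byte alone, because the five ranges given in the algorithm do not overlap. The sketch below classifies a leading byte; `Prefix` and `classify` are hypothetical names, not part of this module.

```rust
/// What the first byte of an encoding says about the rest (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Prefix {
    /// 0x00..=0x7F: the byte is the whole encoding.
    SingleByte,
    /// 0x80..=0xB7: byte string, payload length attached.
    ShortString(usize),
    /// 0xB8..=0xBF: byte string, number of length bytes attached.
    LongString(usize),
    /// 0xC0..=0xF7: list, payload length attached.
    ShortList(usize),
    /// 0xF8..=0xFF: list, number of length bytes attached.
    LongList(usize),
}

/// Maps a leading byte to its range, using the bounds from the algorithm.
pub fn classify(first: u8) -> Prefix {
    match first {
        0x00..=0x7F => Prefix::SingleByte,
        0x80..=0xB7 => Prefix::ShortString((first - 0x80) as usize),
        0xB8..=0xBF => Prefix::LongString((first - 0xB7) as usize),
        0xC0..=0xF7 => Prefix::ShortList((first - 0xC0) as usize),
        0xF8..=0xFF => Prefix::LongList((first - 0xF7) as usize),
    }
}
```

Applied to the table, 0x83 (the first byte of b'foo') is a short string with a 3-byte payload, and 0xC3 is a short list with a 3-byte payload.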
Serialization
However, in the real world, the inputs are neither pure bytes nor lists.
We need a way to encode numbers (like u64), custom structs, enums and other
more complex machinery that exists in the surrounding code.
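As an illustration of the first case, numbers: the usual Ethereum convention treats an unsigned integer as a big-endian byte string with leading zero bytes removed, and then encodes that byte string. The sketch below assumes this convention; it is not taken from this module's documentation, and `u64_payload` and `encode_u64` are hypothetical helpers.

```rust
/// Big-endian bytes of `n` without leading zeros (empty for zero).
pub fn u64_payload(n: u64) -> Vec<u8> {
    let be = n.to_be_bytes();
    let first = be.iter().position(|&b| b != 0).unwrap_or(be.len());
    be[first..].to_vec()
}

/// Encodes `n` as the RLP byte string of its payload.
pub fn encode_u64(n: u64) -> Vec<u8> {
    let payload = u64_payload(n);
    if payload.len() == 1 && payload[0] <= 0x7F {
        // Single small byte: it is its own encoding.
        payload
    } else {
        // The payload has at most 8 bytes, so the short string form applies.
        let mut out = vec![0x80 + payload.len() as u8];
        out.extend_from_slice(&payload);
        out
    }
}
```

Under this assumption, 0 becomes 0x80 (the empty byte string), 15 becomes 0x0F, and 1024 becomes 0x82 0x04 0x00.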
This library wraps the fastrlp crate, so everything mentioned there about the Encodable
and Decodable traits still applies. You can implement those for any object to make it
RLP-serializable. However, following this approach directly results in cluttered code:
your structs now have to use field types that match serialization, which may be very
inconvenient.
To avoid this pitfall, this RLP implementation allows an “extended” struct definition
via a macro. Let’s have a look at the Transaction definition:
use thor_devkit::rlp::{AsBytes, AsVec, Maybe, Bytes};
use thor_devkit::{rlp_encodable, U256};
use thor_devkit::transactions::{Clause, Reserved};

rlp_encodable! {
    /// Represents a single VeChain transaction.
    #[derive(Clone, Debug, Eq, PartialEq)]
    pub struct Transaction {
        /// Chain tag
        pub chain_tag: u8,
        pub block_ref: u64,
        pub expiration: u32,
        pub clauses: Vec<Clause>,
        pub gas_price_coef: u8,
        pub gas: u64,
        pub depends_on: Option<U256> => AsBytes<U256>,
        pub nonce: u64,
        pub reserved: Option<Reserved> => AsVec<Reserved>,
        pub signature: Option<Bytes> => Maybe<Bytes>,
    }
}
What’s going on here? First, some fields are encoded “as usual”: unsigned integers
are encoded just fine, and you likely won’t need any different encoding. However,
some fields work differently. depends_on is a number that may be present
or absent, and it should be encoded as a byte string. U256 is already encoded this
way, but None is not (Option is not RLP-serializable by itself). So we wrap it
in a special wrapper: AsBytes. AsBytes<T> will serialize Some(T) as T and
None as an empty byte string.
reserved is a truly special struct that has custom encoding implemented for it.
That implementation serializes Reserved into a Vec<Bytes>, and then serializes
this Vec<Bytes> to the output stream. If the field is absent (None), an empty vector
is written instead. This is achieved via the AsVec annotation.
Maybe is a third special wrapper. Fields annotated with Maybe may only be placed
last (otherwise the encoding is ambiguous). With Maybe<T>, Some(T) is serialized
as T and None as nothing (zero bytes added).
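The three wrappers differ only in what an absent value turns into on the wire. The sketch below models that behaviour with plain functions over byte slices; it does not use the real AsBytes, AsVec or Maybe types, and `encode_short_bytes`, `as_bytes`, `as_vec` and `maybe` are hypothetical names. It only handles payloads shorter than 56 bytes.

```rust
/// Encodes a byte string shorter than 56 bytes (enough for this sketch).
fn encode_short_bytes(b: &[u8]) -> Vec<u8> {
    assert!(b.len() < 0x38, "sketch handles short strings only");
    if b.len() == 1 && b[0] <= 0x7F {
        return b.to_vec();
    }
    let mut out = vec![0x80 + b.len() as u8];
    out.extend_from_slice(b);
    out
}

/// Models AsBytes: `None` is written as an empty byte string (0x80).
pub fn as_bytes(field: Option<&[u8]>) -> Vec<u8> {
    encode_short_bytes(field.unwrap_or(&[]))
}

/// Models AsVec: `None` is written as an empty list (0xC0).
pub fn as_vec(field: Option<&[&[u8]]>) -> Vec<u8> {
    let payload: Vec<u8> = field
        .unwrap_or(&[])
        .iter()
        .flat_map(|b| encode_short_bytes(b))
        .collect();
    assert!(payload.len() < 0x38, "sketch handles short lists only");
    let mut out = vec![0xC0 + payload.len() as u8];
    out.extend_from_slice(&payload);
    out
}

/// Models Maybe: `None` adds no bytes at all.
pub fn maybe(field: Option<&[u8]>) -> Vec<u8> {
    field.map(encode_short_bytes).unwrap_or_default()
}
```

Because `maybe(None)` contributes zero bytes, a decoder can only tell it apart from a present value by reaching the end of the enclosing list, which is why such a field has to come last.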
Field comments are omitted here for brevity; the macro preserves them as well.
This macro adds both decoding and encoding capabilities. See the examples folder for more examples of usage, including custom types and machinery.
Note that this syntax is not restricted to these three wrappers. You can use
any type with a proper From implementation:
use thor_devkit::rlp_encodable;

#[derive(Clone)]
struct MySeries {
    left: [u8; 2],
    right: [u8; 2],
}

impl From<MySeries> for u32 {
    fn from(value: MySeries) -> Self {
        Self::from_be_bytes(
            value.left.into_iter().chain(value.right).collect::<Vec<_>>().try_into().unwrap(),
        )
    }
}

impl From<u32> for MySeries {
    fn from(value: u32) -> Self {
        let [a, b, c, d] = value.to_be_bytes();
        Self { left: [a, b], right: [c, d] }
    }
}

rlp_encodable! {
    pub struct Foo {
        pub foo: MySeries => u32,
    }
}
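The pair of From implementations has to be lossless in both directions, otherwise a decoded struct will not equal the one that was encoded. The standalone sketch below checks that property for MySeries without the macro; it restates the conversions in a slightly simpler form, and `round_trip` is a hypothetical helper.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct MySeries {
    pub left: [u8; 2],
    pub right: [u8; 2],
}

impl From<MySeries> for u32 {
    fn from(value: MySeries) -> Self {
        let [a, b] = value.left;
        let [c, d] = value.right;
        // `left` supplies the two high bytes, `right` the two low bytes.
        u32::from_be_bytes([a, b, c, d])
    }
}

impl From<u32> for MySeries {
    fn from(value: u32) -> Self {
        let [a, b, c, d] = value.to_be_bytes();
        Self { left: [a, b], right: [c, d] }
    }
}

/// Converts to the wire type (u32) and back again.
pub fn round_trip(series: MySeries) -> MySeries {
    let wire: u32 = series.into();
    MySeries::from(wire)
}
```

With `left: [0x12, 0x34]` and `right: [0x56, 0x78]`, the wire value is 0x12345678 and the round trip returns the original struct.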
Structs§
- Bytes: A cheaply cloneable and sliceable chunk of contiguous memory.
- BytesMut: A unique reference to a contiguous slice of memory.
- Header
Enums§
- AsBytes: Serialization wrapper for Option to serialize None as empty Bytes.
- AsVec: Serialization wrapper for Option to serialize None as empty Vec.
- Maybe: Serialization wrapper for Option to serialize None as nothing (do not modify output stream). This must be the last field in the struct.
- RLPError
Traits§
- Buf: Read bytes from a buffer.
- BufMut: A trait for values that provide sequential write access to bytes.
- Decodable
- Encodable
Type Aliases§
- RLPResult: Convenience alias for a result of fallible RLP decoding.