This module contains a serde deserializer. It can do most of the things you would expect of a typical serde deserializer, such as deserializing into:
- Rust structs.
- containers like HashMap and Vec.
- an arbitrary Value.
- enums. For NBT you typically want either internally tagged or untagged enums.
This deserializer supports from_bytes for zero-copy deserialization for types like &[u8] and borrow::LongArray. There is also from_reader for deserializing from types implementing Read.
Avoiding allocations
When using from_bytes, we can avoid allocations for things like strings and vectors, instead deserializing into a reference to the input data.
The following table summarises what types you likely want to store NBT data in for owned or borrowed types:
NBT type | Owned type | Borrowed type |
---|---|---|
Byte | u8 or i8 | use owned |
Short | u16 or i16 | use owned |
Int | i32 or u32 | use owned |
Long | i64 or u64 | use owned |
Float | f32 | use owned |
Double | f64 | use owned |
String | String | Cow<'a, str> or &[u8] (see below) |
List | Vec<T> | use owned |
Byte Array | ByteArray | borrow::ByteArray |
Int Array | IntArray | borrow::IntArray |
Long Array | LongArray | borrow::LongArray |
Primitives
Borrowing for primitive types like the integers and floats is generally not possible due to alignment requirements of those types. It likely wouldn’t be faster/smaller anyway.
Strings
For strings, we cannot know ahead of time whether the data can be borrowed as &str. This is because Minecraft uses Java’s encoding of Unicode, not UTF-8. If the string contains Unicode characters outside of the Basic Multilingual Plane then we need to convert it to UTF-8, requiring us to own the string data.
Using Cow<'a, str> lets us borrow when possible, but produce an owned value when the representation is different.
Strings can also be deserialized to &[u8], which will always succeed. These bytes will be in Java’s CESU-8 format. You can use cesu8::from_java_cesu8 to decode this.
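The borrow-or-own decision described above can be sketched with the standard library alone. This is a hand-rolled illustration, not the crate's implementation: `decode_java_str` is a hypothetical name, and the slow path does not validate continuation bytes. For real data, prefer cesu8::from_java_cesu8 as mentioned above.

```rust
use std::borrow::Cow;

/// Hypothetical helper: borrow when the bytes are already valid UTF-8,
/// otherwise decode Java-style 1-3 byte sequences into an owned String.
fn decode_java_str(bytes: &[u8]) -> Option<Cow<'_, str>> {
    // Fast path: identical representation, so no allocation is needed.
    if let Ok(s) = std::str::from_utf8(bytes) {
        return Some(Cow::Borrowed(s));
    }
    // Slow path: collect UTF-16 code units (surrogates included),
    // then let the standard library pair them up and re-encode as UTF-8.
    let mut units: Vec<u16> = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let b0 = bytes[i] as u16;
        let (unit, len) = if b0 < 0x80 {
            (b0, 1)
        } else if b0 & 0xE0 == 0xC0 {
            let b1 = *bytes.get(i + 1)? as u16;
            (((b0 & 0x1F) << 6) | (b1 & 0x3F), 2)
        } else if b0 & 0xF0 == 0xE0 {
            let b1 = *bytes.get(i + 1)? as u16;
            let b2 = *bytes.get(i + 2)? as u16;
            (((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F), 3)
        } else {
            return None; // 4-byte sequences do not occur in this encoding
        };
        units.push(unit);
        i += len;
    }
    String::from_utf16(&units).ok().map(Cow::Owned)
}

fn main() {
    // Plain ASCII is the same in both encodings, so it is borrowed.
    assert!(matches!(decode_java_str(b"creeper"), Some(Cow::Borrowed("creeper"))));

    // U+1F600 lies outside the BMP: Java stores it as two 3-byte surrogates.
    let emoji = [0xED, 0xA0, 0xBD, 0xED, 0xB8, 0x80];
    assert!(matches!(decode_java_str(&emoji), Some(Cow::Owned(_))));
    assert_eq!(decode_java_str(&emoji).unwrap(), "\u{1F600}");
}
```

A field typed as Cow<'a, str> gets the same behaviour from the deserializer without any of this code; the sketch only shows why a plain &str cannot always be produced.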
Representation of NBT arrays
In order for Value to preserve all NBT information, the deserializer “maps into serde’s data model”. As a consequence of this, NBT array types must be (de)serialized using the types provided in this crate, e.g. LongArray. Sequence containers like Vec will (de)serialize to NBT Lists, and will fail if an NBT array is instead expected.
128 bit integers and UUIDs
UUIDs tend to be stored in NBT as IntArrays of length 4. When deserializing to i128 or u128, an IntArray of length 4 is accepted. It is parsed as big endian, i.e. the most significant bit (and int) comes first.
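As a rough illustration of the big-endian layout described above, the four ints can be folded into a single 128-bit value with the first int in the most significant position. `ints_to_u128` is a hypothetical helper written for this sketch, not part of the crate.

```rust
/// Hypothetical helper: combine four NBT ints into a u128,
/// treating the first int as the most significant 32 bits.
fn ints_to_u128(ints: [i32; 4]) -> u128 {
    ints.iter()
        // Cast through u32 first so negative ints do not sign-extend.
        .fold(0u128, |acc, &i| (acc << 32) | (i as u32 as u128))
}

fn main() {
    let uuid = ints_to_u128([0x0011_2233, 0x4455_6677, -1, 1]);
    assert_eq!(uuid, 0x0011_2233_4455_6677_ffff_ffff_0000_0001);
}
```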
Other quirks
Some other quirks which may not be obvious:
- When deserializing to unsigned types such as u32, it is an error if a value is negative, to avoid unexpected behaviour with wrap-around. This does not apply to deserializing lists of integrals to u8 slices or vectors.
- Any integral value from NBT can be deserialized to bool. Any non-zero value becomes true. Bear in mind that serializing the same type will change the NBT structure, which is likely unintended.
- You can deserialize a field to the unit type () or a unit struct. This ignores the value but ensures that it existed.
- You cannot deserialize into anything other than a struct or similar container, e.g. HashMap. This is due to a misalignment between the NBT format and Rust’s types. Attempting to will give an error about no root compound. This means you can never do let s: String = from_bytes(...). Serialization of a struct assumes an empty-named compound.
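The first two quirks can be mirrored in plain Rust to show the intended semantics. This is only a sketch of the behaviour described above, not the deserializer's code; `int_to_u32` and `int_to_bool` are hypothetical names.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: narrow a signed NBT int to u32,
/// rejecting negative values instead of wrapping around.
fn int_to_u32(v: i32) -> Result<u32, String> {
    u32::try_from(v).map_err(|_| format!("negative value {} does not fit in u32", v))
}

/// Hypothetical helper: any non-zero integral value becomes true.
fn int_to_bool(v: i64) -> bool {
    v != 0
}

fn main() {
    assert_eq!(int_to_u32(7), Ok(7));
    assert!(int_to_u32(-1).is_err());
    // A plain cast would have wrapped silently, which is what the error avoids.
    assert_eq!(-1i32 as u32, u32::MAX);

    assert!(int_to_bool(2));
    assert!(!int_to_bool(0));
}
```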
Example Minecraft types
This section demonstrates writing types for a few real Minecraft structures.
Extracting entities as an enum
This demonstrates the type that you would need to write in order to extract some subset of entities. This uses a tagged enum in serde, meaning that it will look for a certain field in the structure to tell it what enum variant to deserialize into. We use serde’s other attribute to not error when an unknown entity type is found.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(tag = "id")]
enum Entity {
    #[serde(rename = "minecraft:bat")]
    Bat {
        #[serde(rename = "BatFlags")]
        bat_flags: i8,
    },
    #[serde(rename = "minecraft:creeper")]
    Creeper { ignited: i8 },
    // Entities we haven't coded end up as just 'unknown'.
    #[serde(other)]
    Unknown,
}
Capture unknown entities
If you need to capture all entity types, but do not wish to manually type all of them, you can wrap the above entity type in an untagged enum.
use serde::Deserialize;
use fastnbt::Value;

#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Entity {
    Known(KnownEntity),
    Unknown(Value),
}

#[derive(Deserialize, Debug)]
#[serde(tag = "id")]
enum KnownEntity {
    #[serde(rename = "minecraft:bat")]
    Bat {
        #[serde(rename = "BatFlags")]
        bat_flags: i8,
    },
    #[serde(rename = "minecraft:creeper")]
    Creeper { ignited: i8 },
}
Avoiding allocations in a Chunk
This example shows how to avoid some allocations. The Section type below contains the block states, which store the state of part of the Minecraft world. In NBT this is bit-packed data stored as an array of longs (i64). We avoid allocating a vector for this by storing it as a borrow::LongArray instead, which stores it as &[u8] under the hood. We can’t safely store it as &[i64] due to memory alignment constraints. The fastanvil crate has a PackedBits type that can handle the unpacking of these block states.
use fastnbt::borrow::LongArray;
use serde::Deserialize;

#[derive(Deserialize)]
struct Chunk<'a> {
    #[serde(rename = "Level")]
    #[serde(borrow)]
    level: Level<'a>,
}

#[derive(Deserialize)]
struct Level<'a> {
    #[serde(rename = "Sections")]
    #[serde(borrow)]
    pub sections: Option<Vec<Section<'a>>>,
}

#[derive(Deserialize, Debug)]
#[serde(rename_all = "PascalCase")]
pub struct Section<'a> {
    #[serde(borrow)]
    pub block_states: Option<LongArray<'a>>,
}
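To illustrate why keeping the data as &[u8] sidesteps the alignment problem, the sketch below decodes big-endian longs on demand from a borrowed byte slice. It uses only the standard library and is not how borrow::LongArray is implemented; `longs` is a hypothetical helper, and the only assumption about the format is that NBT stores numbers big-endian.

```rust
use std::convert::TryInto;

/// Hypothetical helper: lazily decode big-endian i64 values from raw bytes.
/// Copying 8 bytes at a time works at any address, so no alignment is required.
fn longs(bytes: &[u8]) -> impl Iterator<Item = i64> + '_ {
    bytes
        .chunks_exact(8)
        .map(|chunk| i64::from_be_bytes(chunk.try_into().expect("chunk is 8 bytes")))
}

fn main() {
    // Two longs: 1 and -1, laid out as they would be in the input buffer.
    let mut raw = vec![0u8; 7];
    raw.push(1);
    raw.extend_from_slice(&[0xff; 8]);

    // Skipping one byte makes the slice start at an odd address; decoding still works.
    let decoded: Vec<i64> = longs(&raw).collect();
    assert_eq!(decoded, vec![1, -1]);
    let shifted: Vec<i64> = longs(&raw[1..9]).collect();
    assert_eq!(shifted, vec![0x1ff]);
}
```

Only the decoded values are copied. The input buffer itself is never duplicated into a Vec<i64>.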
Unit variant enum from status of chunk
use serde::Deserialize;

#[derive(Deserialize)]
struct Chunk {
    #[serde(rename = "Level")]
    level: Level,
}

#[derive(Deserialize)]
struct Level {
    #[serde(rename = "Status")]
    status: Status,
}

#[derive(Deserialize, PartialEq, Debug)]
#[serde(rename_all = "snake_case")]
enum Status {
    Empty,
    StructureStarts,
    StructureReferences,
    Biomes,
    Noise,
    Surface,
    Carvers,
    LiquidCarvers,
    Features,
    Light,
    Spawn,
    Heightmaps,
    Full,
}
Structs
- Deserializer for NBT data. See the de module for more information.