This module contains a serde deserializer. It can do most of the things you would expect of a typical serde deserializer, such as deserializing into:
- Rust structs.
- containers like HashMap and Vec.
- an arbitrary Value.
- enums. For NBT you typically want either internally tagged or untagged enums.
This deserializer only supports from_bytes. This is usually fine as most
structures stored in this format are reasonably small, the largest likely
being an individual Chunk, which maxes out at 1 MiB compressed. Having the
whole input in memory enables zero-copy deserialization in places.
Avoiding allocations
Due to having all the input in memory, we can avoid allocations for things like strings and vectors, instead deserializing into a reference to the input data.
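As a rough illustration of borrowing from the input, the sketch below splits an NBT string payload (a big-endian u16 length followed by that many bytes) off the front of a buffer without copying. The function name `read_nbt_string` is hypothetical and is not part of this crate's API; it only shows the idea of returning references into the input.

```rust
/// Split an NBT string payload off the front of `input` without copying.
/// Layout: a big-endian u16 byte length, then that many bytes.
/// Returns the borrowed string bytes and the remaining input.
/// (Hypothetical helper for illustration, not part of the crate.)
fn read_nbt_string(input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.len() < 2 {
        return None;
    }
    let (len_bytes, rest) = input.split_at(2);
    let len = u16::from_be_bytes([len_bytes[0], len_bytes[1]]) as usize;
    if rest.len() < len {
        return None;
    }
    // Both halves are slices of the original buffer; nothing is allocated.
    Some(rest.split_at(len))
}
```

Both returned slices share the lifetime of the input buffer, which is why the whole input has to stay in memory for as long as the borrowed values are used.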
The following table summarises what types you likely want to store NBT data in for owned or borrowed types:
NBT type | Owned type | Borrowed type |
---|---|---|
Byte | u8 or i8 | use owned |
Short | u16 or i16 | use owned |
Int | i32 or u32 | use owned |
Long | i64 or u64 | use owned |
Float | f32 | use owned |
Double | f64 | use owned |
String | String | Cow<'a, str> (see below) |
List | Vec<T> | use owned |
Byte Array | ByteArray | borrow::ByteArray |
Int Array | IntArray | borrow::IntArray |
Long Array | LongArray | borrow::LongArray |
Primitives
Borrowing for primitive types like the integers and floats is generally not possible due to alignment requirements of those types. It likely wouldn’t be faster/smaller anyway.
Strings
For strings, we cannot know ahead of time whether the data can be borrowed
as &str. This is because Minecraft uses Java's encoding of Unicode, which,
unlike Rust's strings, is not UTF-8. If the string contains Unicode characters
outside of the Basic Multilingual Plane then we need to convert it to UTF-8,
requiring us to own the string data.
Using Cow<'a, str> lets us borrow when possible, but produce an owned value
when the representation is different. Borrowing will be the common case for
Minecraft's internal strings and for any world whose language falls within
the Basic Multilingual Plane.
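The borrow-or-own decision can be sketched with the standard library alone. This is not the crate's implementation: `decode_java_str` and `cont` are hypothetical names, and the sketch assumes the input is Java's "modified UTF-8", where each UTF-16 code unit is encoded in one to three bytes. If the bytes are already valid UTF-8 they are borrowed as-is; otherwise they are decoded into an owned String.

```rust
use std::borrow::Cow;

/// Read the continuation byte at index `i`, returning its low six bits.
fn cont(bytes: &[u8], i: usize) -> Option<u16> {
    let b = *bytes.get(i)?;
    if b & 0xC0 == 0x80 {
        Some((b & 0x3F) as u16)
    } else {
        None
    }
}

/// Decode Java-style string bytes, borrowing when no conversion is needed.
/// Returns None on malformed input. (Hypothetical helper for illustration.)
fn decode_java_str<'a>(bytes: &'a [u8]) -> Option<Cow<'a, str>> {
    // Fast path: strict UTF-8 rejects encoded surrogates and the two-byte
    // NUL, so success means the bytes can be used unchanged.
    if let Ok(s) = std::str::from_utf8(bytes) {
        return Some(Cow::Borrowed(s));
    }
    // Slow path: decode to UTF-16 code units, then build an owned String.
    let mut units: Vec<u16> = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let b0 = bytes[i] as u16;
        let (unit, len) = if b0 < 0x80 {
            (b0, 1)
        } else if b0 & 0xE0 == 0xC0 {
            (((b0 & 0x1F) << 6) | cont(bytes, i + 1)?, 2)
        } else if b0 & 0xF0 == 0xE0 {
            let hi = cont(bytes, i + 1)?;
            let lo = cont(bytes, i + 2)?;
            (((b0 & 0x0F) << 12) | (hi << 6) | lo, 3)
        } else {
            return None;
        };
        units.push(unit);
        i += len;
    }
    // from_utf16 pairs up surrogates and rejects unpaired ones.
    String::from_utf16(&units).ok().map(Cow::Owned)
}
```

An ASCII name such as `minecraft:bat` takes the fast path and is borrowed, while a string containing an emoji (stored as a surrogate pair) takes the slow path and is returned as an owned value.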
In future we could support a lazy string type that always borrows the underlying data and decodes when needed. Please open an issue if this is important to you.
Representation of NBT arrays
In order for Value to preserve all NBT information, the deserializer “maps
into serde’s data model”. This means that in order to deserialize NBT array
types, you must use the types provided in this crate, e.g. LongArray.
Other quirks
Some other quirks which may not be obvious:
- When deserializing to unsigned types such as u32, it will be an error if a value is negative, to avoid unexpected behaviour with wrap-around. This does not apply to deserializing lists of integrals to u8 slices or vectors.
- Any integral value from NBT can be deserialized to bool. Any non-zero value becomes true.
- You can deserialize a field to the unit type () or a unit struct. This ignores the value but ensures that it existed.
- You cannot deserialize into anything other than a struct or similar container, e.g. HashMap. This is due to a misalignment between the NBT format and Rust’s types. Attempting to will give a NoRootCompound error. This means you can never do let s: String = from_bytes(...).
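The first two quirks can be illustrated with plain standard-library conversions. This is only a sketch of the semantics described above, not the deserializer's own code; `to_u32` and `to_bool` are hypothetical names.

```rust
use std::convert::TryFrom;

/// Checked conversion: a negative input is an error instead of wrapping.
fn to_u32(value: i32) -> Result<u32, String> {
    u32::try_from(value).map_err(|_| format!("{} does not fit in u32", value))
}

/// Any non-zero integral value counts as `true`.
fn to_bool(value: i64) -> bool {
    value != 0
}
```

For comparison, an unchecked cast such as `-1i32 as u32` silently wraps around to `u32::MAX`, which is the surprise the error is there to prevent.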
Example Minecraft types
This section demonstrates writing types for a few real Minecraft structures.
Extracting entities as an enum
This demonstrates the type that you would need to write in order to extract
some subset of entities. This uses an internally tagged enum in serde, meaning
that it will look for a certain field in the structure (here "id") to tell it
what enum variant to deserialize into. We use serde’s other attribute so that
an unknown entity type does not cause an error.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(tag = "id")]
enum Entity {
    #[serde(rename = "minecraft:bat")]
    Bat {
        #[serde(rename = "BatFlags")]
        bat_flags: i8,
    },
    #[serde(rename = "minecraft:creeper")]
    Creeper { ignited: i8 },
    // Entities we haven't coded end up as just 'unknown'.
    #[serde(other)]
    Unknown,
}
Capture unknown entities
If you need to capture all entity types, but do not wish to manually type all of them, you can wrap the above entity type in an untagged enum.
use serde::Deserialize;
use fastnbt::Value;

#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Entity {
    Known(KnownEntity),
    Unknown(Value),
}

#[derive(Deserialize, Debug)]
#[serde(tag = "id")]
enum KnownEntity {
    #[serde(rename = "minecraft:bat")]
    Bat {
        #[serde(rename = "BatFlags")]
        bat_flags: i8,
    },
    #[serde(rename = "minecraft:creeper")]
    Creeper { ignited: i8 },
}
Avoiding allocations in a Chunk
This example shows how to avoid some allocations. The Section type below
contains the block states, which store the state of part of the Minecraft
world. In NBT this is a complicated packed bits type stored as an array of
longs (i64). We avoid allocating a vector for this by storing it as a borrowed
LongArray, which refers to the input bytes (&[u8]) instead. We can’t safely
store it as &[i64] due to memory alignment constraints. The fastanvil crate
has a PackedBits type that can handle the unpacking of these block states.
use fastnbt::borrow::LongArray;
use serde::Deserialize;

#[derive(Deserialize)]
struct Chunk<'a> {
    #[serde(rename = "Level")]
    #[serde(borrow)]
    level: Level<'a>,
}

#[derive(Deserialize)]
struct Level<'a> {
    #[serde(rename = "Sections")]
    #[serde(borrow)]
    pub sections: Option<Vec<Section<'a>>>,
}

#[derive(Deserialize, Debug)]
#[serde(rename_all = "PascalCase")]
pub struct Section<'a> {
    // Borrows the raw long array from the input instead of allocating.
    #[serde(borrow)]
    pub block_states: Option<LongArray<'a>>,
}
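To make the alignment point concrete, the sketch below shows one way a borrowed view over a long array's bytes could decode values on demand rather than reinterpreting the bytes as &[i64]. `LongView` is a hypothetical type written for this illustration and is not the crate's borrow::LongArray; it assumes the longs are stored big-endian, as NBT data is.

```rust
/// Hypothetical borrowed view over the raw bytes of an NBT long array.
struct LongView<'a> {
    bytes: &'a [u8],
}

impl<'a> LongView<'a> {
    /// Wrap `bytes`; the length must be a multiple of 8.
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() % 8 == 0 {
            Some(LongView { bytes })
        } else {
            None
        }
    }

    /// Decode each big-endian i64 on demand. Each value is copied out of
    /// the byte slice, so the slice needs no particular alignment and no
    /// vector is allocated.
    fn iter(&self) -> impl Iterator<Item = i64> + 'a {
        let bytes: &'a [u8] = self.bytes;
        bytes.chunks_exact(8).map(|chunk| {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(chunk);
            i64::from_be_bytes(buf)
        })
    }
}
```

The view only holds a reference to the input, so constructing it is free; the cost of decoding is paid per element while iterating.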
Unit variant enum from status of chunk
use serde::Deserialize;

#[derive(Deserialize)]
struct Chunk {
    #[serde(rename = "Level")]
    level: Level,
}

#[derive(Deserialize)]
struct Level {
    #[serde(rename = "Status")]
    status: Status,
}

#[derive(Deserialize, PartialEq, Debug)]
#[serde(rename_all = "snake_case")]
enum Status {
    Empty,
    StructureStarts,
    StructureReferences,
    Biomes,
    Noise,
    Surface,
    Carvers,
    LiquidCarvers,
    Features,
    Light,
    Spawn,
    Heightmaps,
    Full,
}
Structs
Deserializer for NBT data. See the de module for more information.