Module fastnbt::de

This module contains a serde deserializer. It can do most of the things you would expect of a typical serde deserializer, such as deserializing into:

  • Rust structs.
  • containers like HashMap and Vec.
  • an arbitrary Value.
  • enums. For NBT you typically want either internally tagged or untagged enums.

This deserializer only supports from_bytes, so the whole input must be in memory. This is usually fine, as most structures stored in this format are reasonably small; the largest is likely an individual Chunk, which maxes out at 1 MiB compressed. Having the full input in memory also enables zero-copy deserialization in places.

Avoiding allocations

Due to having all the input in memory, we can avoid allocations for things like strings and vectors, instead deserializing into a reference to the input data.
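As a rough illustration of what deserializing into a reference to the input means, the sketch below borrows the payload of a length-prefixed byte array straight out of the input slice. It uses only std and is not fastnbt's implementation; the helper name borrow_byte_array is hypothetical, and it assumes the payload is a big-endian 4-byte signed length followed by that many bytes.

```rust
/// Hypothetical helper (not part of fastnbt): split a length-prefixed byte
/// array off the front of `input` without copying it.
///
/// Assumes a big-endian 4-byte signed length followed by that many bytes.
/// Returns the borrowed payload and the remaining input, or `None` if the
/// length is negative or the input is too short.
fn borrow_byte_array(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let prefix = input.get(..4)?;
    let len = i32::from_be_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]);
    if len < 0 {
        return None;
    }
    let end = 4usize.checked_add(len as usize)?;
    // Both returned slices point into `input`; nothing is allocated.
    let payload = input.get(4..end)?;
    Some((payload, &input[end..]))
}
```

The returned slices share the lifetime of the input, which is why borrowed types in this module carry a lifetime parameter.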

The following table summarises what types you likely want to store NBT data in for owned or borrowed types:

NBT type     Owned type    Borrowed type
Byte         u8 or i8      use owned
Short        u16 or i16    use owned
Int          i32 or u32    use owned
Long         i64 or u64    use owned
Float        f32           use owned
Double       f64           use owned
String       String        Cow<'a, str> (see below)
List         Vec<T>        use owned
Byte Array   ByteArray     borrow::ByteArray
Int Array    IntArray      borrow::IntArray
Long Array   LongArray     borrow::LongArray

Primitives

Borrowing for primitive types like the integers and floats is generally not possible due to alignment requirements of those types. It likely wouldn’t be faster/smaller anyway.
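To make this concrete, here is a minimal std-only sketch of reading a Long from an arbitrary offset in the input. The helper name read_be_i64 is hypothetical, and it assumes the big-endian byte order used by Java Edition NBT. The value is copied into an aligned i64 rather than borrowed, since the offset carries no alignment guarantee.

```rust
/// Hypothetical helper (not part of fastnbt): read a big-endian i64 starting
/// at `offset`, copying the 8 bytes out of the input.
///
/// A `&i64` pointing into `input` would require 8-byte alignment, which an
/// arbitrary offset into a byte slice cannot promise, so we copy instead.
fn read_be_i64(input: &[u8], offset: usize) -> Option<i64> {
    let end = offset.checked_add(8)?;
    let bytes = input.get(offset..end)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(i64::from_be_bytes(buf))
}
```

Copying eight bytes is cheap, which fits the remark that borrowing would be unlikely to be faster or smaller.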

Strings

For strings, we cannot know ahead of time whether the data can be borrowed as &str. This is because Minecraft uses Java’s encoding of Unicode, which, unlike Rust’s, is not UTF-8. If the string contains Unicode characters outside of the Basic Multilingual Plane then we need to convert it to UTF-8, requiring us to own the string data.

Using Cow<'a, str> lets us borrow when possible, but produce an owned value when the representation is different. Borrowing will be the common case for Minecraft’s internal strings and for any world whose language falls within the Basic Multilingual Plane.
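The borrow-or-own decision can be sketched with std alone. This is not fastnbt's code: the function name decode_java_str is hypothetical, and the sketch assumes Java's "modified UTF-8", in which each UTF-16 code unit is written as one to three bytes, so characters outside the Basic Multilingual Plane appear as an encoded surrogate pair that is not valid UTF-8.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of fastnbt): decode Java-style string bytes,
/// borrowing when they are already valid UTF-8 and allocating otherwise.
fn decode_java_str(bytes: &[u8]) -> Option<Cow<'_, str>> {
    // Fast path: the bytes are already valid UTF-8, so borrow them.
    if let Ok(s) = std::str::from_utf8(bytes) {
        return Some(Cow::Borrowed(s));
    }
    // Slow path: decode each 1 to 3 byte sequence into a UTF-16 code unit,
    // then let std combine surrogate pairs into real characters.
    let mut units: Vec<u16> = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let b0 = bytes[i];
        let (unit, len) = if b0 & 0x80 == 0 {
            (b0 as u16, 1)
        } else if b0 & 0xE0 == 0xC0 {
            let b1 = *bytes.get(i + 1)?;
            if b1 & 0xC0 != 0x80 {
                return None;
            }
            ((((b0 & 0x1F) as u16) << 6) | ((b1 & 0x3F) as u16), 2)
        } else if b0 & 0xF0 == 0xE0 {
            let b1 = *bytes.get(i + 1)?;
            let b2 = *bytes.get(i + 2)?;
            if b1 & 0xC0 != 0x80 || b2 & 0xC0 != 0x80 {
                return None;
            }
            let hi = ((b0 & 0x0F) as u16) << 12;
            let mid = ((b1 & 0x3F) as u16) << 6;
            (hi | mid | ((b2 & 0x3F) as u16), 3)
        } else {
            // Stray continuation bytes or 4-byte sequences are rejected here.
            return None;
        };
        units.push(unit);
        i += len;
    }
    String::from_utf16(&units).ok().map(Cow::Owned)
}
```

An identifier such as minecraft:stone takes the fast path and is borrowed, while a string containing an emoji is decoded into a new String.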

In future we could support a lazy string type that always borrows the underlying data and decodes when needed. Please open an issue if this is important to you.

Representation of NBT arrays

In order for Value to preserve all NBT information, the deserializer “maps into serde’s data model”. This means that in order to deserialize NBT array types, you must use the types provided in this crate, e.g. LongArray.

Other quirks

Some other quirks which may not be obvious:

  • When deserializing to unsigned types such as u32, a negative value is an error, to avoid unexpected wrap-around behaviour. This does not apply when deserializing lists of integral values to u8 slices or vectors.
  • Any integral value from NBT can be deserialized to bool. Any non-zero value becomes true.
  • You can deserialize a field to the unit type () or unit struct. This ignores the value but ensures that it existed.
  • At the top level, you cannot deserialize into anything other than a struct or similar container, e.g. HashMap. This is due to a misalignment between the NBT format and Rust’s types. Attempting to do so will give a NoRootCompound error. This means you can never do let s: String = from_bytes(...).
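The first two quirks correspond to the following conversions, sketched here with std only. The function names are hypothetical and the error type is a plain String for brevity; the deserializer itself reports its own error type.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: convert an NBT Int to u32, rejecting negative values
/// instead of letting them wrap around.
fn int_to_u32(raw: i32) -> Result<u32, String> {
    u32::try_from(raw)
        .map_err(|_| format!("{} is negative and cannot be stored in a u32", raw))
}

/// Hypothetical helper: any non-zero integral value is treated as true.
fn integral_to_bool(raw: i64) -> bool {
    raw != 0
}
```

For comparison, a plain cast such as -1i32 as u32 silently produces u32::MAX, which is the wrap-around the error is there to prevent.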

Example Minecraft types

This section demonstrates writing types for a few real Minecraft structures.

Extracting entities as an enum

This demonstrates the type that you would need to write in order to extract some subset of entities. This uses an internally tagged enum in serde, meaning that it will look for a certain field in the structure (here id) to tell it what enum variant to deserialize into. We use serde’s other attribute to avoid an error when an unknown entity type is found.

use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(tag = "id")]
enum Entity {
    #[serde(rename = "minecraft:bat")]
    Bat {
        #[serde(rename = "BatFlags")]
        bat_flags: i8,
    },

    #[serde(rename = "minecraft:creeper")]
    Creeper { ignited: i8 },

    // Entities we haven't coded end up as just 'unknown'.
    #[serde(other)]
    Unknown,
}

Capture unknown entities

If you need to capture all entity types, but do not wish to manually type all of them, you can wrap the above entity type in an untagged enum.

use serde::Deserialize;
use fastnbt::Value;

#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Entity {
    Known(KnownEntity),
    Unknown(Value),
}
#[derive(Deserialize, Debug)]
#[serde(tag = "id")]
enum KnownEntity {
    #[serde(rename = "minecraft:bat")]
    Bat {
        #[serde(rename = "BatFlags")]
        bat_flags: i8,
    },
    #[serde(rename = "minecraft:creeper")]
    Creeper { ignited: i8 },
}

Avoiding allocations in a Chunk

This example shows how to avoid some allocations. The Section type below contains the block states, which store the state of part of the Minecraft world. In NBT this is a complicated packed bits type stored as an array of longs (i64). We avoid allocating a vector for this by borrowing the raw bytes from the input (effectively a &[u8]) through borrow::LongArray. We can’t safely store it as &[i64] due to memory alignment constraints. The fastanvil crate has a PackedBits type that can handle the unpacking of these block states.

use serde::Deserialize;
use fastnbt::borrow::LongArray;

#[derive(Deserialize)]
struct Chunk<'a> {
    #[serde(rename = "Level")]
    #[serde(borrow)]
    level: Level<'a>,
}

#[derive(Deserialize)]
struct Level<'a> {
    #[serde(rename = "Sections")]
    #[serde(borrow)]
    pub sections: Option<Vec<Section<'a>>>,
}

#[derive(Deserialize, Debug)]
#[serde(rename_all = "PascalCase")]
pub struct Section<'a> {
    #[serde(borrow)]
    pub block_states: Option<LongArray<'a>>,
}
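To give a feel for what unpacking the block states involves, here is a std-only sketch that works directly on the borrowed bytes. It is not fastanvil's PackedBits: the function name unpack_indices is hypothetical, and it assumes the layout used by newer worlds (Minecraft 1.16 onwards, as far as we know), where each big-endian long holds as many whole indices as fit, lowest bits first, and no index spans two longs. Older worlds pack indices differently.

```rust
/// Hypothetical helper (not fastanvil's PackedBits): unpack up to `count`
/// palette indices from the raw bytes of a long array.
///
/// Assumes each 8-byte group is a big-endian long holding 64 / bits_per_index
/// indices, lowest bits first, with no index spanning two longs.
fn unpack_indices(raw: &[u8], bits_per_index: usize, count: usize) -> Vec<u16> {
    assert!((1..=16).contains(&bits_per_index));
    let per_long = 64 / bits_per_index;
    let mask = (1u64 << bits_per_index) - 1;
    let mut out = Vec::with_capacity(count);
    for chunk in raw.chunks_exact(8) {
        // Copy the 8 bytes out; the slice is not guaranteed to be aligned.
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        let mut long = u64::from_be_bytes(buf);
        for _ in 0..per_long {
            if out.len() == count {
                return out;
            }
            out.push((long & mask) as u16);
            long >>= bits_per_index;
        }
    }
    out
}
```

The only allocation is the output vector of indices; the packed data itself stays borrowed from the input.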

Unit variant enum from status of chunk

use serde::Deserialize;

#[derive(Deserialize)]
struct Chunk {
    #[serde(rename = "Level")]
    level: Level,
}

#[derive(Deserialize)]
struct Level {
    #[serde(rename = "Status")]
    status: Status,
}

#[derive(Deserialize, PartialEq, Debug)]
#[serde(rename_all = "snake_case")]
enum Status {
    Empty,
    StructureStarts,
    StructureReferences,
    Biomes,
    Noise,
    Surface,
    Carvers,
    LiquidCarvers,
    Features,
    Light,
    Spawn,
    Heightmaps,
    Full,
}

Structs

Deserializer for NBT data. See the de module for more information.