Crate savefile

This is the documentation for savefile.

Introduction

Savefile is a Rust library for conveniently, quickly and correctly serializing and deserializing arbitrary Rust structs and enums to an efficient, compact, version-controlled binary format.

The design use case is any application that needs to save large amounts of data to disk, and support loading files from previous versions of that application (but not from later versions!).

Example

Here is a small example where data about a player in a hypothetical computer game is saved to disk using Savefile.

extern crate savefile;
use savefile::prelude::*;

#[macro_use]
extern crate savefile_derive;


#[derive(Savefile)]
struct Player {
    name : String,
    strength : u32,
    inventory : Vec<String>,
}

fn save_player(player:&Player) {
    save_file("save.bin", 0, player).unwrap();
}

fn load_player() -> Player {
    load_file("save.bin", 0).unwrap()
}

fn main() {
    let player = Player { name: "Steve".to_string(), strength: 42,
        inventory: vec!(
            "wallet".to_string(),
            "car keys".to_string(),
            "glasses".to_string())};

    save_player(&player);

    let reloaded_player = load_player();

    assert_eq!(reloaded_player.name,"Steve".to_string());
}

Handling old versions

Let's expand the above example, by creating a 2nd version of the Player struct. Let's say you decide that your game mechanics don't really need to track the strength of the player, but you do wish to have a set of skills per player as well as the inventory.

Mark the struct like so:

extern crate savefile;
use savefile::prelude::*;

#[macro_use]
extern crate savefile_derive;

const GLOBAL_VERSION:u32 = 1;
#[derive(Savefile)]
struct Player {
    name : String,
    #[versions="0..0"] //Only version 0 had this field
    strength : Removed<u32>,
    inventory : Vec<String>,
    #[versions="1.."] //Only versions 1 and later have this field
    skills : Vec<String>,
}

fn save_player(file:&'static str, player:&Player) {
    // Save current version of file.
    save_file(file, GLOBAL_VERSION, player).unwrap();
}

fn load_player(file:&'static str) -> Player {
    // The GLOBAL_VERSION means we have that version of our data structures,
    // but we can still load any older version.
    load_file(file, GLOBAL_VERSION).unwrap()
}

fn main() {
    let mut player = load_player("save.bin"); //Load from previous save
    assert_eq!("Steve",&player.name); //The name from the previous version saved will remain
    assert_eq!(0,player.skills.len()); //Skills didn't exist when this was saved
    player.skills.push("Whistling".to_string());
    save_player("newsave.bin", &player); //The version saved here will contain the vec of skills
}

Behind the scenes

For Savefile to be able to load and save a type T, that type must implement the traits [savefile::WithSchema], [savefile::Serialize] and [savefile::Deserialize]. The custom derive macro Savefile derives all of these.

You can also implement these traits manually. Manual implementation can be good for:

1: Complex types for which the Savefile custom derive function does not work. For example, trait objects or objects containing pointers.

2: Objects for which not all fields should be serialized, or which need complex initialization (like running arbitrary code during deserialization).

Note that the three trait implementations for a particular type must be in sync. That is, the Serialize and Deserialize traits must follow the schema defined by the WithSchema trait for the type.

WithSchema

The [savefile::WithSchema] trait represents a type which knows which data layout it will have when saved.

Serialize

The [savefile::Serialize] trait represents a type which knows how to write instances of itself to a Serializer.

Deserialize

The [savefile::Deserialize] trait represents a type which knows how to read instances of itself from a Deserializer.

Rules for managing versions

The basic rule is that the Deserialize trait implementation must be able to deserialize data from any previous version.

The WithSchema trait implementation must be able to return the schema for any previous version.

The Serialize trait implementation only needs to support the latest version.
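
The three rules above can be pictured with a hand-written reader. The following is only a sketch using the standard library, not Savefile's actual traits: PlayerSketch, read_u32, read_string, write_string and read_player are hypothetical names, and the byte layout (little-endian integers, strings as a u64 length followed by UTF-8) is an assumption made for illustration.

```rust
use std::io::{self, Read};

// Hypothetical stand-in for the Player struct (inventory omitted for brevity).
#[derive(Debug, PartialEq)]
struct PlayerSketch {
    name: String,
    skills: Vec<String>, // exists from version 1 onwards
}

fn read_u32(r: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

// Assumed layout for this sketch: u64 little-endian length, then UTF-8 bytes.
fn read_string(r: &mut impl Read) -> io::Result<String> {
    let mut len_buf = [0u8; 8];
    r.read_exact(&mut len_buf)?;
    let mut bytes = vec![0u8; u64::from_le_bytes(len_buf) as usize];
    r.read_exact(&mut bytes)?;
    String::from_utf8(bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

// Writer counterpart of read_string, used to build test data.
fn write_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u64).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

// Reads any file version up to `memory_version`; newer files are rejected.
fn read_player(
    r: &mut impl Read,
    file_version: u32,
    memory_version: u32,
) -> io::Result<PlayerSketch> {
    if file_version > memory_version {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "file is newer than this program",
        ));
    }
    let name = read_string(r)?;
    if file_version == 0 {
        let _strength = read_u32(r)?; // removed field: read it, then discard it
    }
    let skills = if file_version >= 1 {
        let count = read_u32(r)?;
        let mut v = Vec::new();
        for _ in 0..count {
            v.push(read_string(r)?);
        }
        v
    } else {
        Vec::new() // field did not exist yet, so fall back to the default
    };
    Ok(PlayerSketch { name, skills })
}
```

The reader branches on the version found in the file, while a matching writer would only ever need to emit the newest layout.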

Versions and derive

The derive macro used by Savefile supports multiple versions of structs. To make this work, you have to add attributes whenever fields are removed, added or have their types changed.

When adding or removing fields, use the #[versions] attribute.

The syntax is one of the following:

#[versions = "N.."]  //A field added in version N
#[versions = "..N"]  //A field removed in version N+1. That is, it existed up to and including version N.
#[versions = "N..M"] //A field that was added in version N and removed in M+1. That is, a field which existed in versions N .. up to and including M.
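
To make the inclusive semantics of these ranges concrete, here is a small sketch of how such a string could be interpreted. It is not the derive macro's real parser; parse_versions and field_present are hypothetical helpers written for illustration.

```rust
// Returns the inclusive (first, last) versions in which a field exists.
// A `last` of `None` means the field still exists in the newest version.
fn parse_versions(spec: &str) -> Option<(u32, Option<u32>)> {
    let (lo, hi) = spec.split_once("..")?;
    let (lo, hi) = (lo.trim(), hi.trim());
    // An omitted lower bound means "since version 0".
    let first = if lo.is_empty() { 0 } else { lo.parse().ok()? };
    // An omitted upper bound means "never removed".
    let last = if hi.is_empty() { None } else { Some(hi.parse().ok()?) };
    Some((first, last))
}

// True if a field carrying `spec` is stored in files of the given version.
fn field_present(spec: &str, version: u32) -> bool {
    match parse_versions(spec) {
        Some((first, last)) => version >= first && last.map_or(true, |l| version <= l),
        None => false,
    }
}
```

Note that, unlike Rust's own `a..b` ranges, the upper bound here is inclusive: "0..0" covers exactly version 0.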

Removed fields must keep their deserialization type. This is most easily accomplished by wrapping their previous type in the Removed<T> type. Removed<T> uses zero space in RAM, but deserializes equivalently to T (with the result of the deserialization thrown away).
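
The idea of a zero-sized placeholder that still consumes the old field's bytes can be sketched with PhantomData. RemovedSketch below is a hypothetical illustration, not the definition of Removed<T> in the crate, and its skip method only handles a u32 stored as four bytes.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical stand-in for Removed<T>: remembers the type, stores no data.
struct RemovedSketch<T> {
    _marker: PhantomData<T>,
}

impl RemovedSketch<u32> {
    // Consumes the four bytes the old u32 field occupied and discards them.
    fn skip(input: &mut &[u8]) -> Option<Self> {
        if input.len() < 4 {
            return None;
        }
        *input = &input[4..];
        Some(RemovedSketch { _marker: PhantomData })
    }
}

// The placeholder itself occupies no memory.
fn removed_size() -> usize {
    size_of::<RemovedSketch<u32>>()
}
```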

Savefile tries to validate that the Removed<T> type is used correctly. This validation is based on string matching, so it may trigger false positives for other types named Removed. Please avoid using a type with such a name. If this becomes a problem, please file an issue on GitHub.

Using the #[versions] tag is critically important. If this is messed up, data corruption is likely.

When a field is added, its type must implement the Default trait (unless the default_val or default_fn attributes are used).

There are also default_val, default_fn and versions_as attributes. More about these below:

The versions attribute

Rules for using the #[versions] attribute:

  • You must keep track of what the current version of your data is. Let's call this version N.
  • You may only save data using version N (supply this number when calling save)
  • When data is loaded, you must supply version N as the memory-version number to load. Load will still adapt the deserialization operation to the version of the serialized data.
  • The version number N is "global". All components of the saved data must have the same version.
  • Whenever changes to the data are to be made, the global version number N must be increased.
  • You may add a new field to your structs, iff you also give it a #[versions = "N.."] attribute. N must be the new version of your data.
  • You may remove a field from your structs. If previously it had no #[versions] attribute, you must add a #[versions = "..N-1"] attribute. If it already had an attribute #[versions = "M.."], you must close its version interval using the current version of your data: #[versions = "M..N-1"]. For example, if you remove a field in version 3, you should add a #[versions="..2"] attribute. Whenever a field is removed, its type must be changed to Removed<T>, where T is its previous type. You may never delete a field from your structs entirely. Doing so breaks backward compatibility with the versions that contained it. This will be detected at load.
  • You may not change the type of a field in your structs, except when using the versions_as attribute.

The default_val attribute

The default_val attribute is used to provide a custom default value for primitive types, when fields are added.

Example:


#[derive(Savefile)]
struct SomeType {
    old_field: u32,
    #[default_val="42"]
    #[versions="1.."]
    new_field: u32
}

In the above example, the field new_field will have the value 42 when deserializing from version 0 of the protocol. If the default_val attribute is not used, new_field will have u32::default() instead, which is 0.

The default_val attribute only works for simple types.

The default_fn attribute

The default_fn attribute allows constructing more complex values as defaults.


fn make_hello_pair() -> (String,String) {
    ("Hello".to_string(),"World".to_string())
}
#[derive(Savefile)]
struct SomeType {
    old_field: u32,
    #[default_fn="make_hello_pair"]
    #[versions="1.."]
    new_field: (String,String)
}

The versions_as attribute

The versions_as attribute can be used to support changing the type of a field.

Let's say the first version of our protocol uses the following struct:


#[derive(Savefile)]
struct Employee {
    name : String,
    phone_number : u64
}

After a while, we realize that a u64 is a poor choice of data type for a phone number, since it can't represent a number with a leading 0, and also can't represent special characters which sometimes appear in phone numbers, like '+' or '-'.

So, we change the type of phone_number to String:


fn convert(phone_number:u64) -> String {
    phone_number.to_string()
}
#[derive(Savefile)]
struct Employee {
    name : String,
    #[versions_as="0..0:convert:u64"]
    #[versions="1.."]
    phone_number : String
}

This will cause version 0 of the protocol to be deserialized expecting a u64 for the phone number, which will then be converted using the provided function convert into a String.

Note that conversions which are supported by the From trait are done automatically, and the function need not be specified in these cases.

Let's say we have the following struct:


#[derive(Savefile)]
struct Racecar {
    max_speed_kmh : u8,
}

We realize that we need to increase the range of the max_speed_kmh variable, and change it like this:


#[derive(Savefile)]
struct Racecar {
    #[versions_as="0..0:u8"]
    #[versions="1.."]
    max_speed_kmh : u16,
}

Note that in this case we don't need to tell Savefile how the deserialized u8 is to be converted to a u16.
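
The difference between the two cases comes down to whether the standard From trait covers the conversion. The sketch below uses a hypothetical generic helper, upgrade, to stand in for whatever the derive macro does internally; convert is the function from the Employee example above.

```rust
// Hypothetical helper: any conversion covered by From needs no custom function.
fn upgrade<Old, New: From<Old>>(old: Old) -> New {
    New::from(old)
}

// String does not implement From<u64>, so the phone number case
// needs an explicit conversion function like this one.
fn convert(phone_number: u64) -> String {
    phone_number.to_string()
}
```

Here u16 implements From<u8>, so a u8 read from a version 0 file widens losslessly, whereas the u64 to String case has no such implementation and needs convert.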

Speeding things up

Now, let's say we want to add a list of all positions that our player has visited, so that we can provide an instant-replay function in our game. The list can become really long, so we want to make sure that the overhead when serializing this is as low as possible.

Savefile has an unsafe trait [savefile::ReprC] that you can implement for a type T. This instructs Savefile to optimize serialization of Vec<T> into a very fast, raw memory copy.

This is dangerous. You, as implementor of the ReprC trait, take full responsibility that all the following rules are upheld:

  • The type T is Copy
  • The type T is a struct or an enum without fields. Using it on enums with fields will probably lead to silent data corruption.
  • The type is represented in memory in an ordered, packed representation. Savefile is not clever enough to inspect the actual memory layout and adapt to this, so the memory representation has to be all the types of the struct fields in a consecutive sequence without any gaps. Note that the #[repr(C)] attribute does not do this - it will include padding if needed for alignment reasons. You should not use #[repr(packed)], since that may lead to unaligned struct fields. Instead, you should use #[repr(C)] combined with manual padding, if necessary.
  • If the type is an enum, it must be #[repr(u8)] .

For example, don't do:

#[repr(C)]
struct Bad {
    f1 : u8,
    f2 : u32,
}

This is a problem because the compiler is likely to insert 3 bytes of padding after f1, to ensure that f2 is aligned to 4 bytes.

Instead, do this:

#[repr(C)]
struct Good {
    f1 : u8,
    pad1 :u8,
    pad2 :u8,
    pad3 :u8,
    f2 : u32,
}

And simply don't use the pad1, pad2 and pad3 fields. Note, at time of writing, Savefile requires that the struct be free of all padding. Even padding at the end is not allowed. This means that the following does not work:

#[repr(C)]
struct Bad2 {
    f1 : u32,
    f2 : u8,
}

This restriction may be lifted at a later time.
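
A quick way to see which of the structs above contain padding is to compare std::mem::size_of with the sum of the field sizes. The helper is_padding_free below is a hypothetical check written for this sketch, not part of Savefile, and the sizes assume a typical platform where u32 is 4-byte aligned.

```rust
use std::mem::size_of;

#[repr(C)]
struct Bad {
    f1: u8,
    f2: u32,
}

#[repr(C)]
struct Good {
    f1: u8,
    pad1: u8,
    pad2: u8,
    pad3: u8,
    f2: u32,
}

#[repr(C)]
struct Bad2 {
    f1: u32,
    f2: u8,
}

// True when a struct is exactly as large as the sum of its fields,
// which for a #[repr(C)] struct means no padding was inserted.
fn is_padding_free(struct_size: usize, field_sizes: &[usize]) -> bool {
    struct_size == field_sizes.iter().sum::<usize>()
}
```

For example, is_padding_free(size_of::<Good>(), &[1, 1, 1, 1, 4]) holds, while the same check fails for Bad (padding after f1) and for Bad2 (padding at the end).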

Note that having a struct with bad alignment will be detected, at runtime, for debug-builds. It may not be detected in release builds. Serializing or deserializing each [savefile::ReprC] struct at least once somewhere in your test suite is recommended.

extern crate savefile;
use savefile::prelude::*;

#[macro_use]
extern crate savefile_derive;

#[derive(ReprC, Clone, Copy, Savefile)]
#[repr(C)]
struct Position {
    x : u32,
    y : u32,
}

const GLOBAL_VERSION:u32 = 2;
#[derive(Savefile)]
struct Player {
    name : String,
    #[versions="0..0"] //Only version 0 had this field
    strength : Removed<u32>,
    inventory : Vec<String>,
    #[versions="1.."] //Only versions 1 and later have this field
    skills : Vec<String>,
    #[versions="2.."] //Only versions 2 and later have this field
    history : Vec<Position>
}

fn save_player(file:&'static str, player:&Player) {
    save_file(file, GLOBAL_VERSION, player).unwrap();
}

fn load_player(file:&'static str) -> Player {
    load_file(file, GLOBAL_VERSION).unwrap()
}

fn main() {
    let mut player = load_player("newsave.bin"); //Load from previous save
    player.history.push(Position{x:1,y:1});
    player.history.push(Position{x:2,y:1});
    player.history.push(Position{x:2,y:2});
    save_player("newersave.bin", &player);
}
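
What a "raw memory copy" of a Vec of such a type amounts to can be sketched with the standard library alone. This is an illustration of the idea rather than Savefile's implementation: positions_as_bytes and positions_from_bytes are hypothetical helpers, and the bytes are in native byte order, so the result would only match a little-endian file format on a little-endian machine.

```rust
use std::mem::size_of;

#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct Position {
    x: u32,
    y: u32,
}

// View a slice of positions as one contiguous run of bytes, without copying.
fn positions_as_bytes(items: &[Position]) -> &[u8] {
    // SAFETY: Position is #[repr(C)], Copy, and has no padding, so every
    // byte in the slice is initialized and may be read as a u8.
    unsafe {
        std::slice::from_raw_parts(
            items.as_ptr() as *const u8,
            items.len() * size_of::<Position>(),
        )
    }
}

// Safe inverse: rebuild the positions from native-endian bytes.
fn positions_from_bytes(bytes: &[u8]) -> Vec<Position> {
    bytes
        .chunks_exact(size_of::<Position>())
        .map(|c| Position {
            x: u32::from_ne_bytes([c[0], c[1], c[2], c[3]]),
            y: u32::from_ne_bytes([c[4], c[5], c[6], c[7]]),
        })
        .collect()
}
```

The safety comment is exactly the promise an implementor of ReprC makes; if Position contained padding, the unsafe block would read uninitialized bytes.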

Modules

prelude

Structs

Deserializer

Object from which bytes to be deserialized are read. This is basically just a wrapped std::io::Read object, the version number of the file being read, and the current version number of the data structures in memory.

Field

A field is serialized according to its value. The name is just for diagnostics.

Removed

SchemaEnum

An enum is serialized as its u8 variant discriminator followed by all the fields for that variant. The name of each variant, as well as its order in the enum (the discriminator), is significant.
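
As an illustration of this layout, the sketch below writes a small enum by hand: one discriminator byte, then the variant's fields in order. Shape and write_shape are hypothetical names, and little-endian u32 fields are assumed.

```rust
// Hypothetical enum; the order of the variants determines the discriminator.
enum Shape {
    Circle { radius: u32 },
    Rect { w: u32, h: u32 },
    Empty,
}

// Writes the u8 discriminator followed by the fields of the active variant.
fn write_shape(shape: &Shape, out: &mut Vec<u8>) {
    match shape {
        Shape::Circle { radius } => {
            out.push(0);
            out.extend_from_slice(&radius.to_le_bytes());
        }
        Shape::Rect { w, h } => {
            out.push(1);
            out.extend_from_slice(&w.to_le_bytes());
            out.extend_from_slice(&h.to_le_bytes());
        }
        Shape::Empty => out.push(2),
    }
}
```

Reordering the variants would change the discriminator bytes, which is why the order is significant for compatibility with existing files.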

SchemaStruct

A struct is serialized by writing its fields one by one, without any padding. The dbg_name is just for diagnostics.

Serializer

Object to which serialized data is to be written. This is basically just a wrapped std::io::Write object and a file protocol version number.

Variant

An enum variant is serialized as its fields, one by one, without any padding.

Enums

SavefileError

This object represents an error in deserializing or serializing an item.

Schema

The schema represents the save file format of your data.

SchemaPrimitive

A primitive is serialized as the little-endian representation of its type, except for string, which is serialized as a usize length followed by the string in UTF-8.
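
A minimal sketch of this encoding, using only the standard library, might look as follows. write_u32 and write_str are hypothetical helpers, and writing the length as a fixed 8-byte value is an assumption of this sketch (the width of usize depends on the platform).

```rust
// Integers are written in little-endian byte order.
fn write_u32(out: &mut Vec<u8>, value: u32) {
    out.extend_from_slice(&value.to_le_bytes());
}

// Strings are written as a length prefix followed by the UTF-8 bytes.
// Assumption: the length is stored as 8 little-endian bytes.
fn write_str(out: &mut Vec<u8>, value: &str) {
    out.extend_from_slice(&(value.len() as u64).to_le_bytes());
    out.extend_from_slice(value.as_bytes());
}
```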

Traits

Deserialize

This trait must be implemented for all data structures you wish to be able to deserialize.

ReprC

This is a marker trait for types which have an in-memory layout that is packed and therefore identical to the layout that savefile will use on disk. This means that types for which this trait is implemented can be serialized very quickly by just writing their raw bits to disk.

Serialize

This trait must be implemented for all data structures you wish to be able to serialize. To actually serialize data: create a Serializer, then call serialize on your data to save, giving the Serializer as an argument.

WithSchema

This trait must be implemented by all data structures you wish to be able to save. It must encode the schema for the data structure when saved using the given version number. When files are saved, the schema is encoded into the file. When loading, the schema is inspected to make sure that the load will safely succeed. This is only for increased safety: the file format does not use the schema for any other purpose. The design is schema-less at its core, and the schema is just an added layer of safety (which can be disabled).
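
The kind of check a stored schema makes possible can be sketched with a much-simplified schema type. SchemaSketch and check_schema are hypothetical and far smaller than the crate's Schema enum; following the descriptions of Field and SchemaStruct, names are used only to build the error message, not for the comparison itself.

```rust
// Hypothetical, heavily simplified schema representation.
#[derive(Debug, Clone, PartialEq)]
enum SchemaSketch {
    U32,
    Str,
    Struct {
        dbg_name: String,
        fields: Vec<(String, SchemaSketch)>,
    },
}

// Compares the schema found in a file with the schema the program expects.
// Field and struct names are diagnostic only; layout is what must match.
fn check_schema(stored: &SchemaSketch, expected: &SchemaSketch) -> Result<(), String> {
    match (stored, expected) {
        (SchemaSketch::U32, SchemaSketch::U32) | (SchemaSketch::Str, SchemaSketch::Str) => Ok(()),
        (
            SchemaSketch::Struct { fields: a, .. },
            SchemaSketch::Struct { dbg_name, fields: b },
        ) => {
            if a.len() != b.len() {
                return Err(format!("{}: field count differs", dbg_name));
            }
            for ((_, stored_field), (name, expected_field)) in a.iter().zip(b.iter()) {
                check_schema(stored_field, expected_field)
                    .map_err(|e| format!("{}.{}: {}", dbg_name, name, e))?;
            }
            Ok(())
        }
        _ => Err("type mismatch".to_string()),
    }
}
```

A mismatch surfaces as a readable message such as "Player.strength: type mismatch" instead of silently misreading the bytes that follow.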

Functions

load

Deserialize an instance of type T from the given reader. The in-memory definition of T must correspond to version. The deserializer will use the actual protocol version in the file to do the deserialization.

load_file

Like [savefile::load], except it deserializes from the given file in the filesystem. This is a pure convenience function.

load_file_noschema

Like [savefile::load_noschema], except it deserializes from the given file in the filesystem. This is a pure convenience function.

load_noschema

Like [savefile::load], but used to open files saved without schema, by one of the _noschema versions of the save functions.

save

Write the given data to the writer. The current version of data must be version.

save_file

Like [savefile::save], except it opens a file on the filesystem and writes the data to it. This is a pure convenience function.

save_file_noschema

Like [savefile::save_noschema], except it opens a file on the filesystem and writes the data to it. This is a pure convenience function.

save_noschema

Write the given data to the writer. The current version of data must be version. Do this write without writing any schema to disk. As long as all the serializers and deserializers are correctly written, the schema is not necessary. Omitting the schema saves some space in the saved file, but means that any mistake in implementation of the Serialize or Deserialize traits will cause hard-to-troubleshoot data corruption instead of a nice error message.