Crate savefile

This is the documentation for savefile.

Introduction

Savefile is a Rust library to conveniently, quickly and correctly serialize and deserialize arbitrary Rust structs and enums into an efficient and compact binary, version-controlled format.

The design use case is any application that needs to save large amounts of data to disk, and support loading files from previous versions of that application (but not from later versions!).

Example

Here is a small example where data about a player in a hypothetical computer game is saved to disk using Savefile.

extern crate savefile;
use savefile::prelude::*;

#[macro_use]
extern crate savefile_derive;


#[derive(Savefile)]
struct Player {
    name : String,
    strength : u32,
    inventory : Vec<String>,
}

fn save_player(player:&Player) {
    save_file("save.bin", 0, player).unwrap();
}

fn load_player() -> Player {
    load_file("save.bin", 0).unwrap()
}

fn main() {
    let player = Player { name: "Steve".to_string(), strength: 42,
        inventory: vec![
            "wallet".to_string(),
            "car keys".to_string(),
            "glasses".to_string()]};

    save_player(&player);

    let reloaded_player = load_player();

    assert_eq!(reloaded_player.name,"Steve".to_string());
}

Handling old versions

Let's expand the above example, by creating a 2nd version of the Player struct. Let's say you decide that your game mechanics don't really need to track the strength of the player, but you do wish to have a set of skills per player as well as the inventory.

Mark the struct like so:

extern crate savefile;
use savefile::prelude::*;

#[macro_use]
extern crate savefile_derive;

const GLOBAL_VERSION:u32 = 1;
#[derive(Savefile)]
struct Player {
    name : String,
    #[savefile_versions="0..0"] //Only version 0 had this field
    strength : Removed<u32>,
    inventory : Vec<String>,
    #[savefile_versions="1.."] //Only versions 1 and later have this field
    skills : Vec<String>,
}

fn save_player(file:&'static str, player:&Player) {
    // Save current version of file.
    save_file(file, GLOBAL_VERSION, player).unwrap();
}

fn load_player(file:&'static str) -> Player {
    // The GLOBAL_VERSION means we have that version of our data structures,
    // but we can still load any older version.
    load_file(file, GLOBAL_VERSION).unwrap()
}

fn main() {
    let mut player = load_player("save.bin"); //Load from previous save
    assert_eq!("Steve",&player.name); //The name from the previous version saved will remain
    assert_eq!(0,player.skills.len()); //Skills didn't exist when this was saved
    player.skills.push("Whistling".to_string());
    save_player("newsave.bin", &player); //The version saved here will contain the vec of skills
}

Behind the scenes

For Savefile to be able to load and save a type T, that type must implement the traits crate::WithSchema, crate::Serialize and crate::Deserialize. The custom derive macro Savefile derives all of these.

You can also implement these traits manually. Manual implementation can be good for:

1: Complex types for which the Savefile custom derive function does not work. For example, trait objects or objects containing pointers.

2: Objects for which not all fields should be serialized, or which need complex initialization (like running arbitrary code during deserialization).

Note that the three trait implementations for a particular type must be in sync. That is, the Serialize and Deserialize traits must follow the schema defined by the WithSchema trait for the type.

WithSchema

The crate::WithSchema trait represents a type which knows which data layout it will have when saved.

Serialize

The crate::Serialize trait represents a type which knows how to write instances of itself to a Serializer.

Deserialize

The crate::Deserialize trait represents a type which knows how to read instances of itself from a Deserializer.

Rules for managing versions

The basic rule is that the Deserialize trait implementation must be able to deserialize data from any previous version.

The WithSchema trait implementation must be able to return the schema for any previous version.

The Serialize trait implementation only needs to support the latest version.
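
The three rules above can be illustrated with a small hand-written sketch. It uses only the standard library and an invented byte layout (a 4-byte version header and u32 length prefixes), so it is not Savefile's actual file format or generated code. The names PlayerSketch, serialize_latest and deserialize_any are hypothetical. The point is only the asymmetry: the writer knows one layout, while the reader branches on the version found in the data.

```rust
use std::io::{self, Read};

/// Hypothetical in-memory struct at version 1 (strength removed, skills added).
#[derive(Debug, PartialEq)]
struct PlayerSketch {
    name: String,
    skills: Vec<String>,
}

const CURRENT_VERSION: u32 = 1;

fn read_u32(input: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    input.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

fn read_string(input: &mut impl Read) -> io::Result<String> {
    let len = read_u32(input)? as usize;
    let mut buf = vec![0u8; len];
    input.read_exact(&mut buf)?;
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn write_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u32).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

/// Serialization only needs to know the latest layout.
fn serialize_latest(p: &PlayerSketch) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&CURRENT_VERSION.to_le_bytes());
    write_string(&mut out, &p.name);
    out.extend_from_slice(&(p.skills.len() as u32).to_le_bytes());
    for skill in &p.skills {
        write_string(&mut out, skill);
    }
    out
}

/// Deserialization must understand every version up to the current one.
fn deserialize_any(mut input: &[u8]) -> io::Result<PlayerSketch> {
    let file_version = read_u32(&mut input)?;
    if file_version > CURRENT_VERSION {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file is from a newer version"));
    }
    let name = read_string(&mut input)?;
    if file_version == 0 {
        // Field removed in version 1: read it to stay in sync, then discard it.
        let _strength = read_u32(&mut input)?;
    }
    let mut skills = Vec::new();
    if file_version >= 1 {
        let count = read_u32(&mut input)?;
        for _ in 0..count {
            skills.push(read_string(&mut input)?);
        }
    }
    Ok(PlayerSketch { name, skills })
}
```

In this sketch, loading version 0 data yields an empty skills list, which corresponds to the Default-based behaviour the derive macro is described as providing for added fields.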

Versions and derive

The derive macro used by Savefile supports multiple versions of structs. To make this work, you have to add attributes whenever fields are removed, added or have their types changed.

When adding or removing fields, use the #[savefile_versions] attribute.

The syntax is one of the following:

#[savefile_versions = "N.."]  //A field added in version N
#[savefile_versions = "..N"]  //A field removed in version N+1. That is, it existed up to and including version N.
#[savefile_versions = "N..M"] //A field that was added in version N and removed in M+1. That is, a field which existed in versions N .. up to and including M.
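
To make the inclusive semantics of these ranges concrete, here is a minimal standard-library sketch that interprets such a string and decides whether a field is present in a given file version. The helpers parse_version_range and field_present are hypothetical and are not the parser used by savefile-derive.

```rust
/// Parse "N..", "..N" or "N..M" into inclusive (start, end) bounds.
/// Returns None for malformed input or an empty interval.
fn parse_version_range(spec: &str) -> Option<(u32, u32)> {
    let (from, to) = spec.split_once("..")?;
    let start: u32 = if from.trim().is_empty() {
        0
    } else {
        from.trim().parse::<u32>().ok()?
    };
    let end: u32 = if to.trim().is_empty() {
        u32::MAX
    } else {
        to.trim().parse::<u32>().ok()?
    };
    if start <= end {
        Some((start, end))
    } else {
        None
    }
}

/// A field is present in a file if the file version lies within the range,
/// with both ends included.
fn field_present(spec: &str, file_version: u32) -> bool {
    match parse_version_range(spec) {
        Some((start, end)) => start <= file_version && file_version <= end,
        None => false,
    }
}
```

Note that, unlike Rust's own `a..b` range syntax, the upper bound here is inclusive, matching the description above: "0..0" covers exactly version 0.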

Removed fields must keep their deserialization type. This is most easily accomplished by substituting their previous type using the Removed<T> type. Removed<T> uses zero space in RAM, but deserializes equivalently to T (with the result of the deserialization thrown away).
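
One way a type can remember T without storing a T is a PhantomData marker. The sketch below is a hypothetical stand-in (RemovedSketch) that only demonstrates the zero-size property; it makes no claim about how Savefile's own Removed<T> is implemented.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical stand-in for a removed field whose old type was T.
struct RemovedSketch<T> {
    _marker: PhantomData<T>,
}

impl<T> RemovedSketch<T> {
    /// Accept the value that was read from an old file and throw it away.
    fn from_discarded(_old_value: T) -> Self {
        RemovedSketch { _marker: PhantomData }
    }
}

/// Size in bytes of the placeholder for a removed field of type T.
fn removed_size<T>() -> usize {
    size_of::<RemovedSketch<T>>()
}
```

Because the marker is zero-sized, keeping such a placeholder in a struct does not grow the struct, no matter how large the removed field's type was.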

Savefile tries to validate that the Removed<T> type is used correctly. This validation is based on string matching, so it may trigger false positives for other types named Removed. Please avoid using a type with such a name. If this becomes a problem, please file an issue on GitHub.

Using the #[savefile_versions] tag is critically important. If this is messed up, data corruption is likely.

When a field is added, its type must implement the Default trait (unless the default_val or default_fn attributes are used).

There are also the savefile_default_val, savefile_default_fn and savefile_versions_as attributes. More about these below.

The versions attribute

Rules for using the #[savefile_versions] attribute:

  • You must keep track of what the current version of your data is. Let's call this version N.
  • You may only save data using version N (supply this number when calling save)
  • When data is loaded, you must supply version N as the memory-version number to load. Load will still adapt the deserialization operation to the version of the serialized data.
  • The version number N is "global" (called GLOBAL_VERSION in the previous source example). All components of the saved data must have the same version.
  • Whenever changes to the data are to be made, the global version number N must be increased.
  • You may add a new field to your structs, iff you also give it a #[savefile_versions = "N.."] attribute. N must be the new version of your data.
  • You may remove a field from your structs. If previously it had no #[savefile_versions] attribute, you must add a #[savefile_versions = "..N-1"] attribute. If it already had an attribute #[savefile_versions = "M.."], you must close its version interval using the current version of your data: #[savefile_versions = "M..N-1"]. Whenever a field is removed, its type must be changed to Removed<T>, where T is its previous type. You may never completely remove items from your structs. Doing so removes backward-compatibility with that version. This will be detected at load. For example, if you remove a field in version 3, you should add a #[savefile_versions="..2"] attribute.
  • You may not change the type of a field in your structs, except when using the savefile_versions_as attribute.

The default_val attribute

The default_val attribute is used to provide a custom default value for primitive types, when fields are added.

Example:


#[derive(Savefile)]
struct SomeType {
    old_field: u32,
    #[savefile_default_val="42"]
    #[savefile_versions="1.."]
    new_field: u32
}

In the above example, the field new_field will have the value 42 when deserializing from version 0 of the protocol. If the default_val attribute is not used, new_field will have u32::default() instead, which is 0.

The default_val attribute only works for simple types.

The default_fn attribute

The default_fn attribute allows constructing more complex values as defaults.


fn make_hello_pair() -> (String,String) {
    ("Hello".to_string(),"World".to_string())
}
#[derive(Savefile)]
struct SomeType {
    old_field: u32,
    #[savefile_default_fn="make_hello_pair"]
    #[savefile_versions="1.."]
    new_field: (String,String)
}

The ignore attribute

The ignore attribute can be used to exclude certain fields from serialization. They still need to be constructed during deserialization (of course), so you need to use one of the default-attributes to make sure the field can be constructed. If none of the default-attributes (described above) are used, savefile will attempt to use the Default trait.

Here is an example, where a cached value is not to be serialized. In this example, the value will be 0.0 after deserialization, regardless of the value when serializing.


#[derive(Savefile)]
struct IgnoreExample {
    a: f64,
    b: f64,
    #[savefile_ignore]
    cached_product: f64
}

The savefile_versions_as attribute

The savefile_versions_as attribute can be used to support changing the type of a field.

Let's say the first version of our protocol uses the following struct:


#[derive(Savefile)]
struct Employee {
    name : String,
    phone_number : u64
}

After a while, we realize that a u64 is a really bad choice of datatype for a phone number, since it can't represent a number with a leading 0, and also can't represent special characters which sometimes appear in phone numbers, like '+' or '-'.

So, we change the type of phone_number to String:


fn convert(phone_number:u64) -> String {
    phone_number.to_string()
}
#[derive(Savefile)]
struct Employee {
    name : String,
    #[savefile_versions_as="0..0:convert:u64"]
    #[savefile_versions="1.."]
    phone_number : String
}

This will cause version 0 of the protocol to be deserialized expecting a u64 for the phone number, which will then be converted using the provided function convert into a String.

Note that conversions which are supported by the From trait are done automatically, and the function need not be specified in these cases.

Let's say we have the following struct:


#[derive(Savefile)]
struct Racecar {
    max_speed_kmh : u8,
}

We realize that we need to increase the range of the max_speed_kmh variable, and change it like this:


#[derive(Savefile)]
struct Racecar {
    #[savefile_versions_as="0..0:u8"]
    #[savefile_versions="1.."]
    max_speed_kmh : u16,
}

Note that in this case we don't need to tell Savefile how the deserialized u8 is to be converted to a u16.
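
Conceptually, the Racecar change above amounts to reading the old type when the file is old and widening it with From. A minimal standard-library sketch of that idea follows; load_max_speed is a hypothetical helper operating on raw bytes, not code generated by the derive macro.

```rust
use std::convert::TryInto;

/// Hypothetical loader for the max_speed_kmh field.
/// Version 0 stored a single u8; later versions store a little-endian u16.
fn load_max_speed(file_version: u32, bytes: &[u8]) -> Option<u16> {
    if file_version == 0 {
        let old: u8 = *bytes.first()?;
        // u16 implements From<u8>, so the widening is lossless and needs
        // no user-supplied conversion function.
        Some(u16::from(old))
    } else {
        let raw: [u8; 2] = bytes.get(..2)?.try_into().ok()?;
        Some(u16::from_le_bytes(raw))
    }
}
```

A conversion such as u64 to String has no From implementation that does what the Employee example wants, which is why that case needs an explicit function like convert.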

Speeding things up

Now, let's say we want to add a list of all positions that our player has visited, so that we can provide an instant-replay function to our game. The list can become really long, so we want to make sure that the overhead when serializing this is as low as possible.

Savefile has an unsafe trait crate::ReprC that you can implement for a type T. This instructs Savefile to optimize serialization of Vec<T> into being a very fast, raw memory copy.

This is dangerous. You, as implementor of the ReprC trait take full responsibility that all the following rules are upheld:

  • The type T is Copy
  • The host platform is little endian. The savefile disk format uses little endian. Automatic validation of this should probably be added to savefile.
  • The type T is a struct or an enum without fields. Using it on enums with fields will probably lead to silent data corruption.
  • The type is represented in memory in an ordered, packed representation. Savefile is not clever enough to inspect the actual memory layout and adapt to this, so the memory representation has to be all the types of the struct fields in a consecutive sequence without any gaps. Note that the #[repr(C)] attribute does not do this - it will include padding if needed for alignment reasons. You should not use #[repr(packed)], since that may lead to unaligned struct fields. Instead, you should use #[repr(C)] combined with manual padding, if necessary.
  • If the type is an enum, it must be #[repr(u8)] .
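
To see why these rules matter, here is a standard-library sketch of what a "raw memory copy" of a slice means, compared with writing each field in little endian. The functions positions_as_bytes and positions_field_by_field are hypothetical and are not Savefile's implementation. The two results agree only because this Position is Copy, #[repr(C)], free of padding, and the host is little endian.

```rust
use std::mem::size_of;

#[derive(Clone, Copy)]
#[repr(C)]
struct Position {
    x: u32,
    y: u32,
}

/// Reinterpret the slice as bytes without copying field by field.
fn positions_as_bytes(items: &[Position]) -> &[u8] {
    // Sound only because Position is Copy, repr(C) and has no padding bytes,
    // so every byte in the slice is initialized.
    unsafe {
        std::slice::from_raw_parts(
            items.as_ptr() as *const u8,
            items.len() * size_of::<Position>(),
        )
    }
}

/// The portable reference: each field written explicitly as little endian.
fn positions_field_by_field(items: &[Position]) -> Vec<u8> {
    let mut out = Vec::with_capacity(items.len() * size_of::<Position>());
    for p in items {
        out.extend_from_slice(&p.x.to_le_bytes());
        out.extend_from_slice(&p.y.to_le_bytes());
    }
    out
}
```

On a big-endian host the raw view would contain the bytes of each u32 in the opposite order, which is the reason for the endianness rule in the list above.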

For example, don't do:

#[repr(C)]
struct Bad {
	f1 : u8,
	f2 : u32,
}

This is because the compiler is likely to insert 3 bytes of padding after f1, to ensure that f2 is aligned to 4 bytes.

Instead, do this:

#[repr(C)]
struct Good {
	f1 : u8,
	pad1 :u8,
	pad2 :u8,
	pad3 :u8,
	f2 : u32,
}

And simply don't use the pad1, pad2 and pad3 fields. Note, at time of writing, Savefile requires that the struct be free of all padding. Even padding at the end is not allowed. This means that the following does not work:

#[repr(C)]
struct Bad2 {
	f1 : u32,
	f2 : u8,
}

This restriction may be lifted at a later time.
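
The padding in Bad, Good and Bad2 can be checked with std::mem::size_of by comparing the struct size with the sum of its field sizes. The sketch below is one way to write such a check yourself; padding_bytes and the three wrapper functions are hypothetical helpers, and the numbers in the comments assume a typical target where u32 is 4-byte aligned.

```rust
use std::mem::size_of;

#[repr(C)]
struct Bad {
    f1: u8,
    f2: u32,
}

#[repr(C)]
struct Good {
    f1: u8,
    pad1: u8,
    pad2: u8,
    pad3: u8,
    f2: u32,
}

#[repr(C)]
struct Bad2 {
    f1: u32,
    f2: u8,
}

/// Padding bytes = total size minus the sum of the field sizes.
fn padding_bytes(total: usize, field_sizes: &[usize]) -> usize {
    total - field_sizes.iter().sum::<usize>()
}

fn bad_padding() -> usize {
    // 3 bytes are inserted between f1 and f2 on typical targets.
    padding_bytes(size_of::<Bad>(), &[size_of::<u8>(), size_of::<u32>()])
}

fn good_padding() -> usize {
    // Manual pad fields fill the gap, so nothing is inserted.
    padding_bytes(size_of::<Good>(), &[1, 1, 1, 1, size_of::<u32>()])
}

fn bad2_padding() -> usize {
    // 3 bytes of trailing padding round the size up to a multiple of 4.
    padding_bytes(size_of::<Bad2>(), &[size_of::<u32>(), size_of::<u8>()])
}
```

A check of this kind in a unit test catches layout mistakes in both debug and release builds.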

Note that having a struct with bad alignment will be detected, at runtime, for debug-builds. It may not be detected in release builds. Serializing or deserializing each crate::ReprC struct at least once somewhere in your test suite is recommended.

extern crate savefile;
use savefile::prelude::*;

#[macro_use]
extern crate savefile_derive;

#[derive(ReprC, Clone, Copy, Savefile)]
#[repr(C)]
struct Position {
    x : u32,
    y : u32,
}

const GLOBAL_VERSION:u32 = 2;
#[derive(Savefile)]
struct Player {
    name : String,
    #[savefile_versions="0..0"] //Only version 0 had this field
    strength : Removed<u32>,
    inventory : Vec<String>,
    #[savefile_versions="1.."] //Only versions 1 and later have this field
    skills : Vec<String>,
    #[savefile_versions="2.."] //Only versions 2 and later have this field
    history : Vec<Position>
}

fn save_player(file:&'static str, player:&Player) {
    save_file(file, GLOBAL_VERSION, player).unwrap();
}

fn load_player(file:&'static str) -> Player {
    load_file(file, GLOBAL_VERSION).unwrap()
}

fn main() {
    let mut player = load_player("newsave.bin"); //Load from previous save
    player.history.push(Position{x:1,y:1});
    player.history.push(Position{x:2,y:1});
    player.history.push(Position{x:2,y:2});
    save_player("newersave.bin", &player);
}

Introspection

The Savefile crate also provides an introspection feature, meant for diagnostics. This is implemented through the trait Introspect. Any type implementing this can be introspected.

The savefile-derive crate supports automatically generating an implementation for most types.

The introspection is purely 'read only'. There is no provision for using the framework to mutate data.

Here is an example of using the trait directly:

extern crate savefile;
#[macro_use]
extern crate savefile_derive;
use savefile::Introspect;
use savefile::IntrospectItem;
#[derive(Savefile)]
struct Weight {
    value: u32,
    unit: String
}
#[derive(Savefile)]
struct Person {
    name : String,
    age: u16,
    weight: Weight,
}
fn main() {
    let a_person = Person {
        name: "Leo".into(),
        age: 8,
        weight: Weight { value: 26, unit: "kg".into() }
    };
    assert_eq!(a_person.introspect_len(), 3); //There are three fields
    assert_eq!(a_person.introspect_value(), "Person"); //Value of structs is the struct type, per default
    assert_eq!(a_person.introspect_child(0).unwrap().key(), "name"); //Each child has a name and a value. The value is itself a &dyn Introspect, and can be introspected recursively
    assert_eq!(a_person.introspect_child(0).unwrap().val().introspect_value(), "Leo"); //In this case, the child (name) is a simple string with value "Leo".
    assert_eq!(a_person.introspect_child(1).unwrap().key(), "age");
    assert_eq!(a_person.introspect_child(1).unwrap().val().introspect_value(), "8");
    assert_eq!(a_person.introspect_child(2).unwrap().key(), "weight");
    let weight = a_person.introspect_child(2).unwrap();
    assert_eq!(weight.val().introspect_child(0).unwrap().key(), "value"); //Here the child 'weight' has an introspectable weight obj as value
    assert_eq!(weight.val().introspect_child(0).unwrap().val().introspect_value(), "26");
    assert_eq!(weight.val().introspect_child(1).unwrap().key(), "unit");
    assert_eq!(weight.val().introspect_child(1).unwrap().val().introspect_value(), "kg");
}

Introspect Details

By using #[derive(SavefileIntrospectOnly)] it is possible to have only the Introspect-trait implemented, and not the serialization traits. This can be useful for types which aren't possible to serialize, but you still wish to have introspection for.

By using the #[savefile_introspect_key] attribute on a field, it is possible to make the generated crate::Introspect::introspect_value return the string representation of the field. This can be useful, to have the primary key (name) of an object more prominently visible in the introspection output.

Example:


#[derive(Savefile)]
pub struct StructWithName {
    #[savefile_introspect_key]
    name: String,
    value: String
}

Higher level introspection functions

There is a helper called crate::Introspector which makes it possible to get a structured representation of parts of an introspectable object. The Introspector has a 'path' which looks into the introspection tree and shows values for this tree. The advantage of using this compared to just using format!("{:#?}",mystuff) is that for very large data structures, unconditionally dumping all data may be unwieldy. The author has a state struct which becomes hundreds of megabytes when formatted using the Debug-trait in this way.

An example:


extern crate savefile;
#[macro_use]
extern crate savefile_derive;
use savefile::Introspect;
use savefile::IntrospectItem;
use savefile::prelude::*;
#[derive(Savefile)]
struct Weight {
    value: u32,
    unit: String
}
#[derive(Savefile)]
struct Person {
    name : String,
    age: u16,
    weight: Weight,
}
fn main() {
    let a_person = Person {
        name: "Leo".into(),
        age: 8,
        weight: Weight { value: 26, unit: "kg".into() }
    };

    let mut introspector = Introspector::new();

    let result = introspector.do_introspect(&a_person,
        IntrospectorNavCommand::SelectNth{select_depth:0, select_index: 2}).unwrap();

    println!("{}",result);
    /*
    Output is:

Introspectionresult:
 name = Leo
 age = 8
*weight = Weight
   value = 26
   unit = kg

    */
    // Note, that there is no point in using the Introspection framework just to get
    // a debug output like above, the point is that for larger data structures, the
    // introspection data can be programmatically used and shown in a live updating GUI,
    // or possibly command line interface or similar. The [crate::IntrospectionResult] does
    // implement Display, but this is just for convenience.

}

The crate::Introspector object can be used to navigate inside an object being introspected. A GUI-program could allow an operator to use arrow keys to navigate the introspected object.

Every time crate::Introspector::do_introspect is called, a crate::IntrospectorNavCommand is given which can traverse the tree downward or upward. In the example above, SelectNth is used to select the child with index 2 (the weight field) at depth 0 in the tree.

Modules

prelude

The prelude contains all definitions thought to be needed by typical users of the library

Structs

Canary1

Useful zero-sized marker. It serializes to a magic value, and verifies this value on deserialization. Does not consume any memory in the data structure. Useful to troubleshoot broken Serialize/Deserialize implementations.
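
The canary idea can be sketched with the standard library alone: write a known constant, and fail loudly if something else is read back at that position. CanarySketch and CANARY_MAGIC below are hypothetical; the real Canary1 uses its own constant and Savefile's own Serializer and Deserializer types.

```rust
use std::io::{self, Read, Write};

/// Hypothetical magic value chosen for this sketch only.
const CANARY_MAGIC: u32 = 0xC0FF_EE01;

/// Zero-sized marker: it carries no data of its own.
struct CanarySketch;

impl CanarySketch {
    fn write_to(&self, out: &mut impl Write) -> io::Result<()> {
        out.write_all(&CANARY_MAGIC.to_le_bytes())
    }

    fn read_from(input: &mut impl Read) -> io::Result<CanarySketch> {
        let mut buf = [0u8; 4];
        input.read_exact(&mut buf)?;
        if u32::from_le_bytes(buf) == CANARY_MAGIC {
            Ok(CanarySketch)
        } else {
            // The stream is out of sync: an earlier field read too much or too little.
            Err(io::Error::new(io::ErrorKind::InvalidData, "canary mismatch"))
        }
    }
}
```

Placing such a marker between two fields narrows down which hand-written implementation consumed the wrong number of bytes.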

CryptoReader

A cryptographic stream wrapper. Wraps a plain dyn Read, and itself implements Read, decrypting and verifying all data read.

CryptoWriter

A cryptographic stream wrapper. Wraps a plain dyn Write, and itself implements Write, encrypting all data written.

Deserializer

Object from which bytes to be deserialized are read. This is basically just a wrapped std::io::Read object, the version number of the file being read, and the current version number of the data structures in memory.

Field

A field is serialized according to its value. The name is just for diagnostics.

IntrospectItemMutex

Type of single child of introspector for Mutex

IntrospectItemRwLock

Type of single child of introspector for RwLock

IntrospectItemSimple

Standard child for Introspect trait. Simply owned key string and reference to dyn Introspect

IntrospectedElement

A node in the introspection tree

IntrospectedElementKey

Identifies an introspected element somewhere in the introspection tree of an object.

IntrospectionFrame

All fields at a specific depth in the introspection tree

IntrospectionResult

An introspection tree. Note that each node in the tree can only have one expanded field, and thus at most one child (a bit of a boring 'tree' :-) ).

Introspector

A helper which allows navigating an introspected object. It remembers a path down into the guts of the object.

Removed

Helper struct which represents a field which has been removed

SchemaArray

An array is serialized by serializing its items one by one, without any padding. The dbg_name is just for diagnostics.

SchemaEnum

An enum is serialized as its u8 variant discriminator followed by all the fields for that variant. The name of each variant, as well as its order in the enum (the discriminator), is significant.
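
As an illustration of "discriminator followed by fields", here is a hand-written standard-library sketch for a hypothetical Shape enum. It follows the layout described above (one u8 for the variant index, then the variant's fields in order, little endian), but it is not the code the derive macro generates.

```rust
use std::convert::TryInto;

/// Hypothetical enum used only for this illustration.
#[derive(Debug, PartialEq)]
enum Shape {
    Circle { radius: u32 },           // discriminator 0
    Rect { width: u32, height: u32 }, // discriminator 1
}

fn serialize_shape(shape: &Shape, out: &mut Vec<u8>) {
    match shape {
        Shape::Circle { radius } => {
            out.push(0u8);
            out.extend_from_slice(&radius.to_le_bytes());
        }
        Shape::Rect { width, height } => {
            out.push(1u8);
            out.extend_from_slice(&width.to_le_bytes());
            out.extend_from_slice(&height.to_le_bytes());
        }
    }
}

fn deserialize_shape(bytes: &[u8]) -> Option<Shape> {
    let (&tag, rest) = bytes.split_first()?;
    // Read the i-th u32 field that follows the discriminator.
    let field = |i: usize| -> Option<u32> {
        let raw: [u8; 4] = rest.get(i * 4..i * 4 + 4)?.try_into().ok()?;
        Some(u32::from_le_bytes(raw))
    };
    match tag {
        0 => Some(Shape::Circle { radius: field(0)? }),
        1 => Some(Shape::Rect { width: field(0)?, height: field(1)? }),
        _ => None,
    }
}
```

Reordering the variants would change which number means which variant, which is why the order is significant for compatibility.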

SchemaStruct

A struct is serialized by serializing its fields one by one, without any padding. The dbg_name is just for diagnostics.

Serializer

Object to which serialized data is to be written. This is basically just a wrapped std::io::Write object and a file protocol version number.

Variant

An enum variant is serialized as its fields, one by one, without any padding.

Enums

IntrospectionError

Ways in which introspection may fail

IntrospectorNavCommand

A command to navigate within an introspected object

SavefileError

This object represents an error in deserializing or serializing an item.

Schema

The schema represents the save file format of your data structure. It is an AST (Abstract Syntax Tree) consisting of various types of nodes in the savefile format. Custom Serialize-implementations cannot add new types to this tree, but must reuse these existing ones. See the various enum variants for more information.

SchemaPrimitive

A primitive is serialized as the little endian representation of its type, except for string, which is serialized as a usize length followed by the string in utf8.
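
A minimal sketch of this encoding, using only the standard library. The helpers encode_u32 and encode_str are hypothetical, and the sketch assumes the length prefix occupies 8 little-endian bytes (usize on a 64-bit host); the exact width Savefile writes is not stated here.

```rust
/// Little-endian encoding of a primitive.
fn encode_u32(value: u32) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

/// Length prefix followed by the UTF-8 bytes of the string.
/// Assumption: the length is written as 8 little-endian bytes.
fn encode_str(text: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + text.len());
    out.extend_from_slice(&(text.len() as u64).to_le_bytes());
    out.extend_from_slice(text.as_bytes());
    out
}
```

Note that the length counts bytes, not characters, so a string with non-ASCII characters has a prefix larger than its character count.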

Constants

MAX_CHILDREN

As a sort of guard against infinite loops, the default 'len'-implementation only ever iterates this many times. This is so that broken 'introspect_child'-implementations won't cause introspect_len to iterate forever.
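
The guard can be pictured as a default trait method that probes children by index and gives up at a fixed cap. The trait Children, the constant MAX_CHILDREN_SKETCH and its value are hypothetical and chosen for illustration; they are not the real Introspect trait or the real value of MAX_CHILDREN.

```rust
/// Hypothetical cap for this sketch.
const MAX_CHILDREN_SKETCH: usize = 1_000;

trait Children {
    /// Return the child at `index`, or None when past the end.
    fn child(&self, index: usize) -> Option<String>;

    /// Default length: count children until the first None, but never
    /// probe more than MAX_CHILDREN_SKETCH indices.
    fn child_count(&self) -> usize {
        (0..MAX_CHILDREN_SKETCH)
            .find(|&i| self.child(i).is_none())
            .unwrap_or(MAX_CHILDREN_SKETCH)
    }
}

/// A well-behaved implementation with exactly three children.
struct Three;
impl Children for Three {
    fn child(&self, index: usize) -> Option<String> {
        if index < 3 {
            Some(format!("child{}", index))
        } else {
            None
        }
    }
}

/// A broken implementation that never reports the end.
struct Broken;
impl Children for Broken {
    fn child(&self, _index: usize) -> Option<String> {
        Some("again".to_string())
    }
}
```

With the cap in place, the broken implementation yields a large but finite count instead of hanging the caller.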

Traits

Deserialize

This trait must be implemented for all data structures you wish to be able to deserialize.

Introspect

Gives the ability to look into an object, inspecting any children (fields).

IntrospectItem

A child of an object implementing Introspect. Is a key-value pair. The only reason this is not simply (String, &dyn Introspect) is that Mutex wouldn't be introspectable in that case. Mutex needs something like (String, MutexGuard). By having this a trait, different types can have whatever reference holder needed (MutexGuard, RefMut etc).
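
The reasoning above can be sketched with simplified, hypothetical traits (Inspect and InspectItem, not the crate's real Introspect and IntrospectItem). A plain item just holds a reference, while the mutex item must own the MutexGuard so that the lock stays held for as long as the value is being looked at.

```rust
use std::sync::{Mutex, MutexGuard};

/// Simplified stand-in for an introspectable value.
trait Inspect {
    fn value(&self) -> String;
}

impl Inspect for u32 {
    fn value(&self) -> String {
        self.to_string()
    }
}

/// Simplified stand-in for a key-value child.
trait InspectItem {
    fn key(&self) -> &str;
    fn val(&self) -> &dyn Inspect;
}

/// The simple case: an owned key and a borrowed value.
struct SimpleItem<'a> {
    key: String,
    val: &'a dyn Inspect,
}

impl<'a> InspectItem for SimpleItem<'a> {
    fn key(&self) -> &str {
        &self.key
    }
    fn val(&self) -> &dyn Inspect {
        self.val
    }
}

/// The mutex case: the item owns the guard, keeping the lock held.
struct MutexItem<'a, T: Inspect> {
    key: String,
    guard: MutexGuard<'a, T>,
}

impl<'a, T: Inspect> InspectItem for MutexItem<'a, T> {
    fn key(&self) -> &str {
        &self.key
    }
    fn val(&self) -> &dyn Inspect {
        &*self.guard
    }
}
```

A tuple of (String, &dyn Inspect) has nowhere to store the guard, so the reference into the mutex could not outlive the function that took the lock. Making the item a trait lets each implementation carry whatever holder it needs.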

ReprC

This is a marker trait for types which have an in-memory layout that is packed and therefore identical to the layout that savefile will use on disk. This means that types for which this trait is implemented can be serialized very quickly by just writing their raw bits to disk.

Serialize

This trait must be implemented for all data structures you wish to be able to serialize. To actually serialize data: create a Serializer, then call serialize on your data to save, giving the Serializer as an argument.

WithSchema

This trait must be implemented by all data structures you wish to be able to save. It must encode the schema for the data structure when saved using the given version number. When files are saved, the schema is encoded into the file. When loading, the schema is inspected to make sure that the load will safely succeed. This is only for increased safety. The file format does not in fact use the schema for any other purpose: the design is schema-less at the core, and the schema is just an added layer of safety (which can be disabled).

Functions

introspect_item

Create a default IntrospectItem with the given key and Introspect.

load

Deserialize an instance of type T from the given reader. The current type of T in memory must be equal to version. The deserializer will use the actual protocol version in the file to do the deserialization.

load_encrypted_file

Like crate::load_file, except it expects the file to be an encrypted file previously stored using crate::save_encrypted_file.

load_file

Like crate::load, except it deserializes from the given file in the filesystem. This is a pure convenience function.

load_file_noschema

Like crate::load_noschema, except it deserializes from the given file in the filesystem. This is a pure convenience function.

load_from_mem

Deserialize an instance of type T from the given u8 slice. The current type of T in memory must be equal to version. The deserializer will use the actual protocol version in the file to do the deserialization.

load_noschema

Like crate::load, but used to open files saved without schema, by one of the _noschema versions of the save functions.

save

Write the given data to the writer. The current version of data must be version.

save_compressed

Write the given data to the writer. Compresses data using the 'snappy' compression format. The current version of data must be version. The resultant data can be loaded using the regular load-function (it autodetects whether compression was active or not).

save_encrypted_file

Like crate::save_file, except encrypts the data with AES256, using the SHA256 hash of the password as key.

save_file

Like crate::save, except it opens a file on the filesystem and writes the data to it. This is a pure convenience function.

save_file_noschema

Like crate::save_noschema, except it opens a file on the filesystem and writes the data to it. This is a pure convenience function.

save_noschema

Write the given data to the writer. The current version of data must be version. Do this write without writing any schema to disk. As long as all the serializers and deserializers are correctly written, the schema is not necessary. Omitting the schema saves some space in the saved file, but means that any mistake in implementation of the Serialize or Deserialize traits will cause hard-to-troubleshoot data corruption instead of a nice error message.

save_to_mem

Serialize the given data and return it as a Vec<u8>. The current version of data must be version.