Heap data estimator.

The datasize crate estimates the amount of heap memory used by a value. It does so through the DataSize trait, which is already implemented for many std types and primitives and can be implemented manually or derived for custom types.

The aim is to get a reasonable approximation of memory usage, especially with variably sized types like Vecs. While it is acceptable to be a few bytes off in some cases, any user should be able to easily tell whether their memory is growing linearly or logarithmically by glancing at the reported numbers.

The crate does not take alignment, memory layout, or allocator-specific behavior and optimizations into account. The estimate depends entirely on the data inside the type, hence the name of the crate.

General usage

For any type that implements DataSize, the data_size convenience function can be used to estimate the size of its heap allocation:

use datasize::data_size;

let data: Vec<u64> = vec![1, 2, 3];
assert_eq!(data_size(&data), 24);

Types implementing the trait also provide two additional constants, IS_DYNAMIC and STATIC_HEAP_SIZE.

IS_DYNAMIC indicates whether a value's size can change over time:

use datasize::DataSize;

// A `Vec` of any kind may have elements added or removed, so it changes size.
assert!(Vec::<u64>::IS_DYNAMIC);

// The elements of type `u64` in it are not dynamic. This allows the implementation to
// simply estimate the size as number_of_elements * size_of::<u64>.
assert!(!u64::IS_DYNAMIC);

Additionally, STATIC_HEAP_SIZE indicates the amount of heap memory a type will always use. A good example is Box<u64>: it always uses 8 bytes of heap memory and never changes in size:

use datasize::DataSize;

assert_eq!(Box::<u64>::STATIC_HEAP_SIZE, 8);
assert!(!Box::<u64>::IS_DYNAMIC);

Implementing `DataSize` for custom types

The DataSize trait can be implemented for custom types manually:

use datasize::{DataSize, data_size};

struct MyType {
    items: Vec<i64>,
    flag: bool,
    counter: Box<u64>,
}

impl DataSize for MyType {
    // `MyType` contains a `Vec`, so `IS_DYNAMIC` is set to true.
    const IS_DYNAMIC: bool = true;

    // The only always present heap item is the `counter` value, which is 8 bytes.
    const STATIC_HEAP_SIZE: usize = 8;

    #[inline]
    fn estimate_heap_size(&self) -> usize {
        // We can be lazy here and delegate to all the existing implementations:
        data_size(&self.items) + data_size(&self.flag) + data_size(&self.counter)
    }
}

let my_data = MyType {
    items: vec![1, 2, 3],
    flag: true,
    counter: Box::new(42),
};

// Three i64 and one u64 on the heap sum up to 32 bytes:
assert_eq!(data_size(&my_data), 32);

Since implementing the trait by hand for struct types is cumbersome and repetitive, the crate provides a DataSize derive macro for convenience:

use datasize::{DataSize, data_size};

// Equivalent to the manual implementation above:
#[derive(DataSize)]
struct MyType {
    items: Vec<i64>,
    flag: bool,
    counter: Box<u64>,
}

let my_data = MyType {
    items: vec![1, 2, 3],
    flag: true,
    counter: Box::new(42),
};
assert_eq!(data_size(&my_data), 32);

See the DataSize macro documentation in the datasize_derive crate for details.

Performance considerations

Determining the full size of data can be quite expensive, especially if multiple nested levels of dynamic types are used. The crate uses IS_DYNAMIC and STATIC_HEAP_SIZE to optimize when it can, so in many cases not every element of a vector needs to be checked individually.

However, if the contained types are dynamic, every element must (and will) be checked, so keep this in mind when performance is an issue.
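
The fast and slow paths described here can be sketched as follows. This is only an illustration and not the crate's actual source: `HeapSize` is a hypothetical stand-in trait that mirrors the IS_DYNAMIC and STATIC_HEAP_SIZE constants, and counting `len()` rather than `capacity()` is an assumption made for the sketch.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the `DataSize` trait, reduced to the parts
// described in this document. It is NOT the crate's real definition.
trait HeapSize {
    const IS_DYNAMIC: bool;
    const STATIC_HEAP_SIZE: usize;
    fn estimate_heap_size(&self) -> usize;
}

impl HeapSize for u64 {
    const IS_DYNAMIC: bool = false;
    const STATIC_HEAP_SIZE: usize = 0;

    fn estimate_heap_size(&self) -> usize {
        0
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    const IS_DYNAMIC: bool = true;
    const STATIC_HEAP_SIZE: usize = 0;

    fn estimate_heap_size(&self) -> usize {
        // Memory occupied by the elements themselves (assumption: `len`, not `capacity`).
        let inline = self.len() * size_of::<T>();
        if T::IS_DYNAMIC {
            // Slow path: each element may own a different amount of heap memory,
            // so every element has to be visited.
            inline + self.iter().map(|item| item.estimate_heap_size()).sum::<usize>()
        } else {
            // Fast path: every element owns the same, statically known amount.
            inline + self.len() * T::STATIC_HEAP_SIZE
        }
    }
}

fn main() {
    // `u64` is not dynamic, so no element is inspected individually.
    let flat: Vec<u64> = vec![1, 2, 3];
    assert_eq!(flat.estimate_heap_size(), 24);

    // `Vec<u64>` is dynamic, so both inner vectors are walked.
    let nested: Vec<Vec<u64>> = vec![vec![1, 2], vec![3]];
    assert_eq!(nested.estimate_heap_size(), 2 * size_of::<Vec<u64>>() + 24);
}
```

In this sketch the cost of the flat case is independent of the number of elements, while the nested case costs one call per inner vector, which is the trade-off to keep in mind for deeply nested dynamic types.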

Handling references, `Arc`s and similar types

Any reference will be counted as having a data size of 0, as it does not own the value. Reference-like types such as Arc are a special case and are discussed below.

`Arc` and `Rc`

Arcs are currently not supported. A planned development is to allow users to mark one instance of an Arc as "primary" and have its heap memory usage counted, but this is not implemented yet.

Any Arc will be estimated to have a heap size of 0, to avoid cycles resulting in infinite loops.

The Rc type is handled in the same manner.
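
To see why counting shared pointers as 0 avoids infinite loops, consider the minimal sketch below. It does not use the crate itself: `HeapSize` and `Node` are hypothetical names introduced only for this illustration, and the implementation for `Rc` simply mirrors the rule stated above.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in trait for this illustration, not the crate's `DataSize`.
trait HeapSize {
    fn estimate_heap_size(&self) -> usize;
}

// Mirrors the documented rule: a shared pointer contributes 0 bytes.
impl<T> HeapSize for Rc<T> {
    fn estimate_heap_size(&self) -> usize {
        0
    }
}

impl HeapSize for Vec<u8> {
    fn estimate_heap_size(&self) -> usize {
        self.len()
    }
}

// Hypothetical node type that can form a reference cycle through `next`.
struct Node {
    payload: Vec<u8>,
    next: RefCell<Option<Rc<Node>>>,
}

impl HeapSize for Node {
    fn estimate_heap_size(&self) -> usize {
        // Explicitly use the `Rc` implementation; the pointee is never followed.
        let next = self
            .next
            .borrow()
            .as_ref()
            .map_or(0, |rc| <Rc<Node> as HeapSize>::estimate_heap_size(rc));
        self.payload.estimate_heap_size() + next
    }
}

fn main() {
    let a = Rc::new(Node { payload: vec![0; 16], next: RefCell::new(None) });
    let b = Rc::new(Node { payload: vec![0; 4], next: RefCell::new(Some(Rc::clone(&a))) });

    // Close the cycle: a -> b -> a.
    *a.next.borrow_mut() = Some(Rc::clone(&b));

    // The estimate terminates and only counts the payload each node owns directly.
    assert_eq!(Node::estimate_heap_size(&a), 16);
    assert_eq!(Node::estimate_heap_size(&b), 4);

    // Break the cycle again so both nodes are freed.
    a.next.borrow_mut().take();
}
```

Had the `Rc` implementation followed the pointer into the node it refers to, the call on `a` would recurse through `b` back into `a` without end. The price of the zero rule is that memory reachable only through shared pointers is not counted at all.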

Additional types

Some additional types from external crates are available behind feature flags.

fake_clock-types: Support for the fake_instant::FakeClock type.
futures-types: Some types from the futures crate.
smallvec-types: Support for the smallvec::SmallVec type.
tokio-types: Some types from the tokio crate.

Known issues

The derive macro currently does not support generic structs with inline type bounds, e.g.

struct Foo<T: Copy> { ... }

This can be worked around by using an equivalent where clause:

struct Foo<T>
where T: Copy
{ ... }

datasize 0.1.0