Crate smallnum[][src]

Expand description

Compile-time size optimization for numeric primitives. Macros return smallest numeric type capable of fitting a static bounds. For unsigned integers, macro input is a maximum. For signed integers, macro input may be a maximum or a minimum.

  • Can save memory at zero runtime cost.
  • Embedded-friendly: !#[no_std].
  • Safe: #![forbid(unsafe_code)].

What is this for?

  • Saving memory: aiding the compiler in memory layout optimization (aka “struct packing”).

    • See zero-cost examples below.
  • Improving ergonomics: creating APIs that abstract away the underlying integer type.

    • E.g. method params/returns use usize, but stored as u16 internally. Upcast is free. Downcast checks for loss of precision.

Doesn’t #[repr(packed)] already save memory?

Not safely. The difference is subtle but important:

  • #[repr(packed)] removes all padding between struct fields. This incurs a performance penalty for misaligned accesses at best, and causes undefined behavior at worst. It’s something you generally want to avoid.

  • smallnum aids packing while maintaining the target’s native alignment, without removing padding. It can actually improve [data cache] performance while being fully safe.

For extreme size optimization, you’re free to combine smallnum with #[repr(packed)].

Example: Collection Index

When the size of a collection is known at compile-time, the variable used to index it can be size-optimized.

  • Target: Value for collection/container index operator
  • Yield: x * 1 where:
    • x < size_of<usize>()
use smallnum::{small_unsigned, SmallUnsigned};
use core::mem::size_of_val;

const MAX_SIZE: usize = 500;
let mut my_array: [u8; MAX_SIZE] = [0x00; MAX_SIZE];

let idx: usize = 5;
let small_idx: small_unsigned!(MAX_SIZE) = 5;

// Equivalent values
my_array[idx] = 0xff;
assert_eq!(my_array[idx], my_array[small_idx.usize()]);

// Memory savings (6 bytes on a 64-bit system)
#[cfg(target_pointer_width = "64")]
assert_eq!(size_of_val(&idx) - size_of_val(&small_idx), 6);

Notice that having the trait SmallUnsigned in scope allows small_idx.usize() to be called. This function returns a usize for convenient indexing, regardless of which type the macro selected (u16 in the above example, hence the 6 byte savings over a 64-bit host’s u64).

Example: Tree Node Metadata

When the maximum capacity of a tree is known at compile time, metadata stored in every node can be size-optimized.

  • Target: Internal metatdata
  • Yield: x * n where:
    • x <= size_of<usize>()
    • n == node_cnt
use smallnum::small_unsigned;
use core::mem::size_of;

const MAX_CAPACITY: usize = 50_000;

// Regular node in a binary tree
pub struct BinTree<T> {
    value: T,
    left_child: Option<Box<BinTree<T>>>,
    right_child: Option<Box<BinTree<T>>>,
    subtree_size: usize,
}

// Node with size-optimized metadata
pub struct SmallBinTree<T> {
    value: T,
    left_child: Option<Box<SmallBinTree<T>>>,
    right_child: Option<Box<SmallBinTree<T>>>,
    subtree_size: small_unsigned!(MAX_CAPACITY),
}

// Per-node memory savings (8 bytes on a 64-bit system)
#[cfg(target_pointer_width = "64")]
assert_eq!(size_of::<BinTree<i16>>() - size_of::<SmallBinTree<i16>>(), 8);

Example: Index-based Graphs

When implementing an {index,arena}-based graph whose maximum capacity is known at compile-time, indexes stored in every structure (edge or node) can be size-optimized.

  • Target: Internal “pointer” representation
  • Yield: (x + y) * n where:
    • x <= size_of<usize>()
    • y <= size_of<Option<usize>>()
    • n == edge_cnt
use smallnum::small_unsigned;
use core::mem::size_of;

const MAX_CAPACITY: usize = 50_000;

// Based on "Modeling graphs in Rust using vector indices" by Niko Matsakis (April 2015)
// http://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/

// Unoptimized indexes
pub type NodeIdx = usize;
pub type EdgeIdx = usize;

pub struct EdgeData {
    target: NodeIdx,
    next_outgoing_edge: Option<EdgeIdx>
}

// Optimized indexes
pub type SmallNodeIdx = small_unsigned!(MAX_CAPACITY);
pub type SmallEdgeIdx = small_unsigned!(MAX_CAPACITY);

pub struct SmallEdgeData {
    target: SmallNodeIdx,
    next_outgoing_edge: Option<SmallEdgeIdx>
}

// Per-edge memory savings (18 bytes on a 64-bit system)
#[cfg(target_pointer_width = "64")]
assert_eq!(size_of::<EdgeData>() - size_of::<SmallEdgeData>(), 18);

Advanced Examples

See examples/ directory, cargo run --example <file_name>.

Macro <-> Type Selection Set

License and Contributing

Licensed under the MIT license. Contributions are welcome!

Macros

Return smallest signed type capable of representing input value (positive, i.e. maximum, or negative, i.e. minimum).

Return a label corresponding to the smallest type capable of representing input value (positive, i.e. maximum, or negative, i.e. minimum).

Return smallest unsigned type capable of representing input value (positive, i.e. maximum).

Return a label corresponding to the smallest type capable of representing input value (positive, i.e. maximum).

Enums

Labels for signed integer primitives.

Labels for unsigned integer primitives.

Traits

Convenience trait for signed normalization (e.g. to/from isize).

Convenience trait for unsigned normalization (e.g. to/from usize).