ChaCha8Rand

Struct ChaCha8Rand 

Source
pub struct ChaCha8Rand { /* private fields */ }
Expand description

A deterministic stream of pseudorandom bytes from a 32-byte seed.

See the crate documentation for a higher-level introduction and quick-start examples. Here you’ll only find excessive extra details about reproducibility and some notes about (de-)serialization and SIMD backends.

This type implements traits from the rand crate (RngCore and SeedableRng), but you need to opt-in with a feature flag to use those impls.

§ Reproducibility

The ChaCha8Rand specification describes how a seed is expanded into an unbounded stream of pseudorandom bytes. This stream should be uniquely determined: byte order is fixed to little endian, the differences between various ChaCha20 variants (32- or 64-bit counter, nonce size) don’t matter in this context, and the test vector included in the spec should remove any remaining doubts.

Until the 1.0 release of this crate, I reserve the right to make API breaking changes and fix bugs even if they change the output. But the intent is to match the spec precisely and not change anything about the output for a given seed in future releases. If the spec gets an incompatible 2.0 release and I want to implement it, that will be a semver-major release. Note that the spec technically hasn’t been tagged as 1.0, but breaking changes seem very unlikely since the same people already shipped an implementation in the Go standard library.

Besides treating the generator as a byte stream with ChaCha8Rand::read_bytes, you can also use other methods such as ChaCha8Rand::read_u64. What happens when you interleave calls to these methods, i.e., mix and match different read granularities? There’s no clear “best” answer. Different implementation strategies lead to different behavior and it’s reasonable to not specify it or reserve the right to change it later. However, for this crate I wanted to commit to a simple and useful mental model. What I ended up with is:

  • The generator is just the spec-mandated stream of bytes. Repeatedly calling read_bytes gives you these bytes in order without skipping, reordering, or duplicating anything.
  • The number of calls to read_bytes and the size of each read doesn’t affect behavior. The number of bytes consumed is never rounded up internally because that would skip some bytes. Zero-sized reads are no-ops.
  • Methods like read_u32, read_u64, read_seed, and any others that might be added in the future, behave exactly like reading the appropriate number of bytes from the stream and converting those to the result type. When byte order matters, this always uses little endian.

This is different from what Go’s implementation does when you interleave calls to its Uint64 and Read methods. The documentation explicitly says the results are unspecified and may return bytes “out of order”. The implementation in Go 1.23 does in fact behave differently from this crate in many cases. (It also doesn’t provide a direct way to read a 32-bit integer.)

§Serialization and Deserialization

Besides storing the initial seed, you can also snapshot state of the generator at any point in time with ChaCha8Rand::clone_state and ChaCha8Rand::try_restore_state. See ChaCha8State for more details. The important thing with respect to reproducibility is that the serialized state records an exact position in the output byte stream. Thus, if you save the state at any point and later restore it, you’ll get the same output as if you had kept working with the original generator, regardless of how how you read from it before and after.

§SIMD Backends

This crate has all the same SIMD code paths as Go 1.23 and then some more:

All backends except Webassembly support runtime feature detection with the std crate feature. More backends may be added in the future as Rust stabilizes the corresponding core::arch intrinsics. Of course, there’s also a portable scalar backend for all platforms without SIMD backends.

Implementations§

Source§

impl ChaCha8Rand

Source

pub fn new(seed: &[u8; 32]) -> Self

Create a new generator from the given seed.

This will eagerly generates data to fill the generator’s internal buffer. Therefore, it may be a bit wasteful to call if you won’t actually need any output from the generator. Don’t over-complicate your program to avoid that, but keep it in mind if in case it’s easy to avoid.

§Examples

Reproducing the sample output from the ChaCha8Rand specification:

let mut sample = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
assert_eq!(sample.read_u64(), 0xb773b6063d4616a5);
assert_eq!(sample.read_u64(), 0x1160af22a66abc3c);
assert_eq!(sample.read_u64(), 0x8c2599d9418d287c);
// ... and so on
Source

pub fn set_seed(self: &mut ChaCha8Rand, seed: &[u8; 32])

Reset this generator, as if overwriting it with ChaCha8Rand::new(seed).

This helper method exists is because it’s more convenient sometimes and might avoid copying a relatively large type from one location to another.

§Examples

Restoring the original seed after a simulation to reproduce it:

fn run_simulation(rng: &mut ChaCha8Rand) -> String {
    // TODO: figure out the details
    format!("the meaning of life is {}", rng.read_u64())
}

let mut rng = ChaCha8Rand::new(&initial_seed);
let result = run_simulation(&mut rng);
println!("Simulation says {result} - let's try to replicate that");
rng.set_seed(&initial_seed);
let result_again = run_simulation(&mut rng);
assert_eq!(result, result_again);
Source

pub fn read_u32(&mut self) -> u32

Consume four bytes of uniformly random data and return them as u32.

This is always equivalent to ChaCha8Rand::read_bytes plus u32::from_le_bytes, but 99% of the time it’s more efficient. If you simply need 32 or fewer uniformly random bits, this method enables this conveniently and without involving the rand_* crates.

On the other hand, if you want integers in a range like 0..n or m..=n, you should not use this method and combine it with the remainder operator %. The rand crate has convenient and efficient APIs for doing that correctly, without introducing bias. It also supports more data types, non-uniform distributions, and higher-level operations such as shuffling lists. You can use it with ChaCha8Rand by activating the crate feature so that ChaCha8Rand implements the rand traits. See the examples for more details.

§Examples

To generate integers in some range 0..n or 0..=n, or to generate other types such as floating point numbers, combine ChaCha8Rand with the rand crate (or another implementation of the same algorithms).

// This example is not tested automatically because it doesn't
// compile when the `rand_core_0_9` feature is disabled.
use chacha8rand::ChaCha8Rand; // with rand_core_0_9 feature
use rand::prelude::*; // rand version 0.9

let mut rng = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
if rng.gen_ratio(2, 3) {
    println!("Nice weather we're having :)");
} else {
    println!("Awful weather, isn't it?");
}
let chan = rng.gen_range(1..100);
let celsius = rng.gen_range(-5.0..=35.0);
println!("Channel {chan} News said it'll be {celsius:.1} degrees tomorrow.");

Taking the remainder modulo n to get an integer in 0..n should be avoided because it introduces bias when n is not a power of two. This is easier to see when you try all the possibilities with a smaller number of bits than 32, e.g., with five bits and three options:

let mut remainder_histogram = [0, 0, 0];
for five_bit_number in 0..(1 << 5) {
    remainder_histogram[five_bit_number % 3] += 1;
}
assert_eq!(remainder_histogram, [11, 11, 10]);

In this example, the results 0 and 1 each have probability 11 / 32 = 34.375% and the result 2 has probability 10 / 32 = 31.25% instead of the desired 33.333..% for each.

It may appear that the bias becomes very small when you use 32 bits instead of just five, but it can still cause problems at larger scales. Consider a scenario where you choose among n = (u32::MAX / 3) * 2 (ca. 2.86 billion) items via read_u32() % n. For any given item, the odds of being chosen are already very small, with or without the bias. However, if you choose a few hundred items and look at them as a whole, you’ll notice that roughly half of them are from the first third of the range 0..n, and the other half are spread out across the rest of the range.

More fully featured libraries like rand implement sampling algorithms that avoid this problem. They’re also usually more efficient than computing the remainder, which is a relatively expensive operation even on modern CPUs.

At other times, 32 bits is exactly what you need. For example, tabulation hashing with 32 bit output needs a table of random 32 bit integers. To hash a 64 bit value this way, we could split it into 16 pieces of 4 bits each and use a table of 16 x 2^4 random u32s:

struct U64Hasher {
    table: [[u32; 16]; 16],
}

impl U64Hasher {
    fn new(rng: &mut ChaCha8Rand) -> Self {
        let mut table = [[0; 16]; 16];
        for row in &mut table {
            for cell in row {
                *cell = rng.read_u32();
            }
        }
        Self { table }
    }

    fn hash(&self, mut value: u64) -> u32 {
        let mut hash = 0;
        for row in &self.table {
            hash ^= row[(value & 0xF) as usize];
            value >>= 4;
        }
        debug_assert_eq!(value, 0);
        hash
    }
}

let mut rng = ChaCha8Rand::new(&seed);
let hasher = U64Hasher::new(&mut rng);
assert_ne!(hasher.hash(1 << 32), hasher.hash(1 << 36));
Source

pub fn read_u64(&mut self) -> u64

Consume eight bytes of uniformly random data and return them as u64.

This is always equivalent to ChaCha8Rand::read_bytes plus u64::from_le_bytes, but 99% of the time it’s more efficient. If you simply need 64 or fewer uniformly random bits, this method enables this conveniently and without involving the rand_* crates.

As discussed in the the 32-bit variant, you can and should use ChaCha8Rand with the rand crates for bounded integers in a range such as 0..n or m..=n, to generate floating-point numbers and sample non-uniform distributions, to shuffle lists, and so on.

§Examples

With 64 bits, we can generate a 8x8 bitmap and render it as ASCII art. Clearly that’s much better than a smaller (and non-square) bitmap with only 32 “pixels”.

let mut rng = ChaCha8Rand::new(&seed);
let mut bitmap = rng.read_u64();
for _row in 0..8 {
    for _column in 0..8 {
        let pixel = ['X', '.'][(bitmap & 1) as usize];
        print!("{pixel}");
        bitmap >>= 1;
    }
    println!();
}

A more computer science-minded example would be strongly universal hashing of 32-bit integers into 32 or fewer bits. The strongly universal multiply-shift scheme by Dietzfelbinger needs two random, independent 64-bit parameters a and b:

struct MulAddShift {
    a: u64,
    b: u64,
}

impl MulAddShift {
    fn hash(&self, x: u32) -> u32 {
        (u64::from(x).wrapping_mul(self.a).wrapping_add(self.b) >> 32) as u32
    }
}

let mut rng = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
let h = MulAddShift {
    a: rng.read_u64(),
    b: rng.read_u64()
};
// Truncating the outputs to two bits also gives a strongly universal family.
// Strong universality implies uniformity - all hash values are equally likely.
assert_eq!(h.hash(0) % 4, 2);
assert_eq!(h.hash(1) % 4, 0);
assert_eq!(h.hash(2) % 4, 3);
assert_eq!(h.hash(3) % 4, 1);

It’s just a happy coincidence that we got every two-bit output exactly once in this small example, with this exact seed and these exact inputs. To be honest, I was a bit surprised that it worked out so perfectly.

Source

pub fn read_bytes(&mut self, dest: &mut [u8])

Consume uniformly random bytes and write them into dest.

This method is, in some sense, the most foundational way of using the generator. Other read_* methods behave as-if they read however many bytes they require, but they’re more convenient and often more efficient than reading a small number of bytes manually.

On the other hand, when you need a large chunk of randomness (hundreds of bytes or more), reading into a large buffer is very efficient because it boils down to one or several memcpys from the generator’s internal buffer. With a large enough buffer, this can produce several gigabytes per second with 128-bit SIMD, and the AVX2 backend goes roughly twice as fast. (For read_u32 and read_u64, the difference is much more modest.)

§Example

You can use this to derive a new 32-byte seed for another ChaCha8Rand instance, but the ChaCha8Rand::read_seed helper makes this more convenient, so see examples there.

Other use cases require more or fewer bytes. For example, random (v4) UUIDs are great for assigning arbitrary names to objects or events while avoiding collisions with other people (or their computers) doing the same thing. In most cases, you should generate a UUID directly from OS-provided entropy, which the uuid crate supports with Uuid::new_v4(). But in some cases it’s more convenient to get a high-entropy seed from the OS at startup and feed it into a high-quality userspace RNG to create lots of UUIDs:

use uuid::Uuid;
// For this use case, you *always* need a high-entropy seed!
// A low-entropy seed (current time, chosen by humans, etc.), reusing a seed,
// or cloning the generator leads to many colliding "UUIDs".
let mut seed = [0; 32];
getrandom::fill(&mut seed).unwrap();
let mut rng = ChaCha8Rand::new(&seed);
let mut uuid_bytes = [0u8; 16];
rng.read_bytes(&mut uuid_bytes);
let uuid = Uuid::from_bytes(uuid_bytes);
Source

pub fn read_seed(&mut self) -> [u8; 32]

Consume 32 uniformly random bytes, suitable for seeding another RNG instance.

This is a simple wrapper around read_bytes, but returning an array by value is convenient when you want to use it as a seed. The ChaCha8Rand algorithm already replaces its seed with some of its earlier output after every iteration (992 bytes of output + 32 bytes of new seed). In this sense, it’s not even that strange to use the other output in the same way. There’s usually no point in manually re-seeding the same generator instance, but it’s often useful to derive several independent generators from a “root” seed.

Of course, the seed could also be used with any other PRNG algorithm that accepts 32-byte seeds. However, most generators that accept [u8; 32] as seed also use stream or block ciphers to generate large batches of data. There are some statistical generators that happen to have exactly 32 bytes of state, but these usually want 32- or 64-bit integers instead of raw bytes. In that case you might just call read_u32 or read_u64 a few times.

§Examples

In general, having multiple generator instances is useful when you want some domain separation. A Monte Carlo simulation that should be reproducible from a seed might also rely on some Las Vegas algorithms for auxillary tasks that don’t affect the simulation’s outcome (e.g., randomized quicksort or building a perfect hash table). If everything shares a single generator, any change to how much (or when) randomness is consumed will affect reproducibility of the simulation. On the other hand, debugging may be easier if the rest of the program also depends on the seed. In such cases, you could stretch a “root” seed into multiple seeds (without involving additional algorithms such as key derivation functions):

let root_seed = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456";
let mut root_rng = ChaCha8Rand::new(root_seed);
let mut sim_rng = ChaCha8Rand::new(&root_rng.read_seed());
let mut qsort_rng = ChaCha8Rand::new(&root_rng.read_seed());
drop(root_rng); // no longer needed
let steps = sim_rng.read_u32();
for _ in 0..steps {
    // ... generate some data using `sim_rng` ...
    quicksort_with_random_pivot(
        &mut intermediate_results,
        &mut qsort_rng,
    );
    // ... use the sorted data ...
}

Another example comes from roguelike games that use a single seed to drive procedural generation as well as the blow-by-blow game play. If one seeds leads to particularly interesting outcomes, players may want to share and reuse it. This falls flat if any change in how much randomness is consumed avalanches into entirely different outcomes down the line. For example, if parts of the map are generated on the fly, they might look entirely different depending on how many turns the player took to reach them. Instead, you can set up one generator for every aspect of the game’s randomness, and derive seeds for all of them from the seed that the player deals with. Just make sure you don’t accidentally use the same seed for each of the generators.

struct GameStateGodObject {
    map_rng: ChaCha8Rand,
    encounter_rng: ChaCha8Rand,
    ai_rng: ChaCha8Rand,
    event_rng: ChaCha8Rand,
    // ...
}

impl GameStateGodObject {
    fn new(seed: [u8; 32]) -> Self {
        let mut root = ChaCha8Rand::new(&seed);
        let mut derive_rng = || ChaCha8Rand::new(&root.read_seed());
        Self {
            map_rng: derive_rng(),
            encounter_rng: derive_rng(),
            ai_rng: derive_rng(),
            event_rng: derive_rng(),
            // ...
        }
    }
}
Source

pub fn clone_state(&self) -> ChaCha8State

Take a snapshot of the generator’s current state.

See ChaCha8State for more details and an example.

Source

pub fn try_restore_state( &mut self, state: &ChaCha8State, ) -> Result<(), RestoreStateError>

Restore the generator’s state from a snapshot taken before.

See the documentation of ChaCha8State for more details and an example.

§Errors

This function never fails if state came from ChaCha8Rand::clone_state and was not modified. Otherwise (e.g., if you deserialize it from a file that someone fiddled with), it may fail because the bytes_consumed field is out of range. This field refers to a single iteration of ChaCha8Rand, which always produces 992 bytes of output. Thus, valid values are in the range 0..=992.

Trait Implementations§

Source§

impl Clone for ChaCha8Rand

Source§

fn clone(&self) -> ChaCha8Rand

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for ChaCha8Rand

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.