pub struct ChaCha8Rand { /* private fields */ }Expand description
A deterministic stream of pseudorandom bytes from a 32-byte seed.
See the crate documentation for a higher-level introduction and quick-start examples. Here you’ll only find excessive extra details about reproducibility and some notes about (de-)serialization and SIMD backends.
This type implements traits from the rand crate (RngCore and SeedableRng), but you need to
opt-in with a feature flag to use those impls.
§ Reproducibility
The ChaCha8Rand specification describes how a seed is expanded into an unbounded stream of pseudorandom bytes. This stream should be uniquely determined: byte order is fixed to little endian, the differences between various ChaCha20 variants (32- or 64-bit counter, nonce size) don’t matter in this context, and the test vector included in the spec should remove any remaining doubts.
Until the 1.0 release of this crate, I reserve the right to make API breaking changes and fix bugs even if they change the output. But the intent is to match the spec precisely and not change anything about the output for a given seed in future releases. If the spec gets an incompatible 2.0 release and I want to implement it, that will be a semver-major release. Note that the spec technically hasn’t been tagged as 1.0, but breaking changes seem very unlikely since the same people already shipped an implementation in the Go standard library.
Besides treating the generator as a byte stream with ChaCha8Rand::read_bytes, you can also
use other methods such as ChaCha8Rand::read_u64. What happens when you interleave calls to
these methods, i.e., mix and match different read granularities? There’s no clear “best” answer.
Different implementation strategies lead to different behavior and it’s reasonable to not
specify it or reserve the right to change it later. However, for this crate I wanted to commit
to a simple and useful mental model. What I ended up with is:
- The generator is just the spec-mandated stream of bytes. Repeatedly calling
read_bytesgives you these bytes in order without skipping, reordering, or duplicating anything. - The number of calls to
read_bytesand the size of each read doesn’t affect behavior. The number of bytes consumed is never rounded up internally because that would skip some bytes. Zero-sized reads are no-ops. - Methods like
read_u32,read_u64,read_seed, and any others that might be added in the future, behave exactly like reading the appropriate number of bytes from the stream and converting those to the result type. When byte order matters, this always uses little endian.
This is different from what Go’s implementation does when you interleave calls to its Uint64
and Read methods. The documentation explicitly says the results are unspecified and may return
bytes “out of order”. The implementation in Go 1.23 does in fact behave differently from this
crate in many cases. (It also doesn’t provide a direct way to read a 32-bit integer.)
§Serialization and Deserialization
Besides storing the initial seed, you can also snapshot state of the generator at any point in
time with ChaCha8Rand::clone_state and ChaCha8Rand::try_restore_state. See
ChaCha8State for more details. The important thing with respect to reproducibility is that
the serialized state records an exact position in the output byte stream. Thus, if you save the
state at any point and later restore it, you’ll get the same output as if you had kept working
with the original generator, regardless of how how you read from it before and after.
§SIMD Backends
This crate has all the same SIMD code paths as Go 1.23 and then some more:
- SSE2 and AVX2 on x86_64 and 32-bit x86
- NEON on AArch64 (little-endian only for now)
- simd128 on Webassembly
All backends except Webassembly support runtime feature detection with the std crate feature.
More backends may be added in the future as Rust stabilizes the corresponding core::arch
intrinsics. Of course, there’s also a portable scalar backend for all platforms without SIMD
backends.
Implementations§
Source§impl ChaCha8Rand
impl ChaCha8Rand
Sourcepub fn new(seed: &[u8; 32]) -> Self
pub fn new(seed: &[u8; 32]) -> Self
Create a new generator from the given seed.
This will eagerly generates data to fill the generator’s internal buffer. Therefore, it may be a bit wasteful to call if you won’t actually need any output from the generator. Don’t over-complicate your program to avoid that, but keep it in mind if in case it’s easy to avoid.
§Examples
Reproducing the sample output from the ChaCha8Rand specification:
let mut sample = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
assert_eq!(sample.read_u64(), 0xb773b6063d4616a5);
assert_eq!(sample.read_u64(), 0x1160af22a66abc3c);
assert_eq!(sample.read_u64(), 0x8c2599d9418d287c);
// ... and so onSourcepub fn set_seed(self: &mut ChaCha8Rand, seed: &[u8; 32])
pub fn set_seed(self: &mut ChaCha8Rand, seed: &[u8; 32])
Reset this generator, as if overwriting it with ChaCha8Rand::new(seed).
This helper method exists is because it’s more convenient sometimes and might avoid copying a relatively large type from one location to another.
§Examples
Restoring the original seed after a simulation to reproduce it:
fn run_simulation(rng: &mut ChaCha8Rand) -> String {
// TODO: figure out the details
format!("the meaning of life is {}", rng.read_u64())
}
let mut rng = ChaCha8Rand::new(&initial_seed);
let result = run_simulation(&mut rng);
println!("Simulation says {result} - let's try to replicate that");
rng.set_seed(&initial_seed);
let result_again = run_simulation(&mut rng);
assert_eq!(result, result_again);Sourcepub fn read_u32(&mut self) -> u32
pub fn read_u32(&mut self) -> u32
Consume four bytes of uniformly random data and return them as u32.
This is always equivalent to ChaCha8Rand::read_bytes plus u32::from_le_bytes, but 99%
of the time it’s more efficient. If you simply need 32 or fewer uniformly random bits, this
method enables this conveniently and without involving the rand_* crates.
On the other hand, if you want integers in a range like 0..n or m..=n, you should not
use this method and combine it with the remainder operator %. The rand crate has
convenient and efficient APIs for doing that correctly, without introducing bias. It also
supports more data types, non-uniform distributions, and higher-level operations such as
shuffling lists. You can use it with ChaCha8Rand by activating the crate
feature so that ChaCha8Rand implements the rand traits. See the examples
for more details.
§Examples
To generate integers in some range 0..n or 0..=n, or to generate other types such as
floating point numbers, combine ChaCha8Rand with the rand crate (or another
implementation of the same algorithms).
// This example is not tested automatically because it doesn't
// compile when the `rand_core_0_9` feature is disabled.
use chacha8rand::ChaCha8Rand; // with rand_core_0_9 feature
use rand::prelude::*; // rand version 0.9
let mut rng = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
if rng.gen_ratio(2, 3) {
println!("Nice weather we're having :)");
} else {
println!("Awful weather, isn't it?");
}
let chan = rng.gen_range(1..100);
let celsius = rng.gen_range(-5.0..=35.0);
println!("Channel {chan} News said it'll be {celsius:.1} degrees tomorrow.");Taking the remainder modulo n to get an integer in 0..n should be avoided because it
introduces bias when n is not a power of two. This is easier to see when you try all the
possibilities with a smaller number of bits than 32, e.g., with five bits and three options:
let mut remainder_histogram = [0, 0, 0];
for five_bit_number in 0..(1 << 5) {
remainder_histogram[five_bit_number % 3] += 1;
}
assert_eq!(remainder_histogram, [11, 11, 10]);In this example, the results 0 and 1 each have probability 11 / 32 = 34.375% and the result 2 has probability 10 / 32 = 31.25% instead of the desired 33.333..% for each.
It may appear that the bias becomes very small when you use 32 bits instead of just five,
but it can still cause problems at larger scales. Consider a scenario where you choose among
n = (u32::MAX / 3) * 2 (ca. 2.86 billion) items via read_u32() % n. For any given item,
the odds of being chosen are already very small, with or without the bias. However, if you
choose a few hundred items and look at them as a whole, you’ll notice that roughly half of
them are from the first third of the range 0..n, and the other half are spread out
across the rest of the range.
More fully featured libraries like rand implement sampling algorithms that avoid this
problem. They’re also usually more efficient than computing the remainder, which is a
relatively expensive operation even on modern CPUs.
At other times, 32 bits is exactly what you need. For example, tabulation
hashing with 32 bit output needs a table of random 32 bit integers. To hash a 64
bit value this way, we could split it into 16 pieces of 4 bits each and use a table of 16 x
2^4 random u32s:
struct U64Hasher {
table: [[u32; 16]; 16],
}
impl U64Hasher {
fn new(rng: &mut ChaCha8Rand) -> Self {
let mut table = [[0; 16]; 16];
for row in &mut table {
for cell in row {
*cell = rng.read_u32();
}
}
Self { table }
}
fn hash(&self, mut value: u64) -> u32 {
let mut hash = 0;
for row in &self.table {
hash ^= row[(value & 0xF) as usize];
value >>= 4;
}
debug_assert_eq!(value, 0);
hash
}
}
let mut rng = ChaCha8Rand::new(&seed);
let hasher = U64Hasher::new(&mut rng);
assert_ne!(hasher.hash(1 << 32), hasher.hash(1 << 36));Sourcepub fn read_u64(&mut self) -> u64
pub fn read_u64(&mut self) -> u64
Consume eight bytes of uniformly random data and return them as u64.
This is always equivalent to ChaCha8Rand::read_bytes plus u64::from_le_bytes, but 99%
of the time it’s more efficient. If you simply need 64 or fewer uniformly random bits, this
method enables this conveniently and without involving the rand_* crates.
As discussed in the the 32-bit variant, you can and should use
ChaCha8Rand with the rand crates for bounded integers in a range such as 0..n or
m..=n, to generate floating-point numbers and sample non-uniform distributions, to shuffle
lists, and so on.
§Examples
With 64 bits, we can generate a 8x8 bitmap and render it as ASCII art. Clearly that’s much better than a smaller (and non-square) bitmap with only 32 “pixels”.
let mut rng = ChaCha8Rand::new(&seed);
let mut bitmap = rng.read_u64();
for _row in 0..8 {
for _column in 0..8 {
let pixel = ['X', '.'][(bitmap & 1) as usize];
print!("{pixel}");
bitmap >>= 1;
}
println!();
}A more computer science-minded example would be strongly universal hashing of
32-bit integers into 32 or fewer bits. The strongly universal multiply-shift scheme by
Dietzfelbinger needs two random, independent 64-bit parameters a and b:
struct MulAddShift {
a: u64,
b: u64,
}
impl MulAddShift {
fn hash(&self, x: u32) -> u32 {
(u64::from(x).wrapping_mul(self.a).wrapping_add(self.b) >> 32) as u32
}
}
let mut rng = ChaCha8Rand::new(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456");
let h = MulAddShift {
a: rng.read_u64(),
b: rng.read_u64()
};
// Truncating the outputs to two bits also gives a strongly universal family.
// Strong universality implies uniformity - all hash values are equally likely.
assert_eq!(h.hash(0) % 4, 2);
assert_eq!(h.hash(1) % 4, 0);
assert_eq!(h.hash(2) % 4, 3);
assert_eq!(h.hash(3) % 4, 1);It’s just a happy coincidence that we got every two-bit output exactly once in this small example, with this exact seed and these exact inputs. To be honest, I was a bit surprised that it worked out so perfectly.
Sourcepub fn read_bytes(&mut self, dest: &mut [u8])
pub fn read_bytes(&mut self, dest: &mut [u8])
Consume uniformly random bytes and write them into dest.
This method is, in some sense, the most foundational way of using the generator. Other
read_* methods behave as-if they read however many bytes they require, but they’re more
convenient and often more efficient than reading a small number of bytes manually.
On the other hand, when you need a large chunk of randomness (hundreds of bytes or more),
reading into a large buffer is very efficient because it boils down to one or several
memcpys from the generator’s internal buffer. With a large enough buffer, this can produce
several gigabytes per second with 128-bit SIMD, and the AVX2 backend goes roughly twice as
fast. (For read_u32 and read_u64, the difference is much more modest.)
§Example
You can use this to derive a new 32-byte seed for another ChaCha8Rand instance, but the
ChaCha8Rand::read_seed helper makes this more convenient, so see examples there.
Other use cases require more or fewer bytes. For example, random (v4) UUIDs are great for
assigning arbitrary names to objects or events while avoiding collisions with other people
(or their computers) doing the same thing. In most cases, you should generate a UUID
directly from OS-provided entropy, which the uuid crate supports with
Uuid::new_v4(). But in some cases it’s more convenient to get a high-entropy seed from the
OS at startup and feed it into a high-quality userspace RNG to create lots of UUIDs:
use uuid::Uuid;
// For this use case, you *always* need a high-entropy seed!
// A low-entropy seed (current time, chosen by humans, etc.), reusing a seed,
// or cloning the generator leads to many colliding "UUIDs".
let mut seed = [0; 32];
getrandom::fill(&mut seed).unwrap();
let mut rng = ChaCha8Rand::new(&seed);
let mut uuid_bytes = [0u8; 16];
rng.read_bytes(&mut uuid_bytes);
let uuid = Uuid::from_bytes(uuid_bytes);Sourcepub fn read_seed(&mut self) -> [u8; 32]
pub fn read_seed(&mut self) -> [u8; 32]
Consume 32 uniformly random bytes, suitable for seeding another RNG instance.
This is a simple wrapper around read_bytes, but returning an array by value is convenient
when you want to use it as a seed. The ChaCha8Rand algorithm already replaces its seed with
some of its earlier output after every iteration (992 bytes of output + 32 bytes of new
seed). In this sense, it’s not even that strange to use the other output in the same way.
There’s usually no point in manually re-seeding the same generator instance, but it’s
often useful to derive several independent generators from a “root” seed.
Of course, the seed could also be used with any other PRNG algorithm that accepts 32-byte
seeds. However, most generators that accept [u8; 32] as seed also use stream or block
ciphers to generate large batches of data. There are some statistical generators that happen
to have exactly 32 bytes of state, but these usually want 32- or 64-bit integers instead of
raw bytes. In that case you might just call read_u32 or read_u64 a few times.
§Examples
In general, having multiple generator instances is useful when you want some domain separation. A Monte Carlo simulation that should be reproducible from a seed might also rely on some Las Vegas algorithms for auxillary tasks that don’t affect the simulation’s outcome (e.g., randomized quicksort or building a perfect hash table). If everything shares a single generator, any change to how much (or when) randomness is consumed will affect reproducibility of the simulation. On the other hand, debugging may be easier if the rest of the program also depends on the seed. In such cases, you could stretch a “root” seed into multiple seeds (without involving additional algorithms such as key derivation functions):
let root_seed = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ123456";
let mut root_rng = ChaCha8Rand::new(root_seed);
let mut sim_rng = ChaCha8Rand::new(&root_rng.read_seed());
let mut qsort_rng = ChaCha8Rand::new(&root_rng.read_seed());
drop(root_rng); // no longer needed
let steps = sim_rng.read_u32();
for _ in 0..steps {
// ... generate some data using `sim_rng` ...
quicksort_with_random_pivot(
&mut intermediate_results,
&mut qsort_rng,
);
// ... use the sorted data ...
}Another example comes from roguelike games that use a single seed to drive procedural generation as well as the blow-by-blow game play. If one seeds leads to particularly interesting outcomes, players may want to share and reuse it. This falls flat if any change in how much randomness is consumed avalanches into entirely different outcomes down the line. For example, if parts of the map are generated on the fly, they might look entirely different depending on how many turns the player took to reach them. Instead, you can set up one generator for every aspect of the game’s randomness, and derive seeds for all of them from the seed that the player deals with. Just make sure you don’t accidentally use the same seed for each of the generators.
struct GameStateGodObject {
map_rng: ChaCha8Rand,
encounter_rng: ChaCha8Rand,
ai_rng: ChaCha8Rand,
event_rng: ChaCha8Rand,
// ...
}
impl GameStateGodObject {
fn new(seed: [u8; 32]) -> Self {
let mut root = ChaCha8Rand::new(&seed);
let mut derive_rng = || ChaCha8Rand::new(&root.read_seed());
Self {
map_rng: derive_rng(),
encounter_rng: derive_rng(),
ai_rng: derive_rng(),
event_rng: derive_rng(),
// ...
}
}
}Sourcepub fn clone_state(&self) -> ChaCha8State
pub fn clone_state(&self) -> ChaCha8State
Take a snapshot of the generator’s current state.
See ChaCha8State for more details and an example.
Sourcepub fn try_restore_state(
&mut self,
state: &ChaCha8State,
) -> Result<(), RestoreStateError>
pub fn try_restore_state( &mut self, state: &ChaCha8State, ) -> Result<(), RestoreStateError>
Restore the generator’s state from a snapshot taken before.
See the documentation of ChaCha8State for more details and an example.
§Errors
This function never fails if state came from ChaCha8Rand::clone_state and was not
modified. Otherwise (e.g., if you deserialize it from a file that someone fiddled with), it
may fail because the bytes_consumed field is out of range. This field refers to a single
iteration of ChaCha8Rand, which always produces 992 bytes of output. Thus, valid values are
in the range 0..=992.
Trait Implementations§
Source§impl Clone for ChaCha8Rand
impl Clone for ChaCha8Rand
Source§fn clone(&self) -> ChaCha8Rand
fn clone(&self) -> ChaCha8Rand
1.0.0 · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read more