# DScale
[crates.io](https://crates.io/crates/dscale) · [License](LICENSE) · [docs.rs](https://docs.rs/dscale)

A fast, deterministic simulation framework for testing and benchmarking distributed systems. It simulates network latency, bandwidth constraints, and process execution in an event-driven environment with support for both single-threaded and parallel execution modes.
## Usage
### 1. Define Messages
Messages must implement the `Message` trait, which allows defining a `virtual_size` for bandwidth simulation.
```rust
use dscale::*;

struct MyMessage {
    data: u32,
}

impl Message for MyMessage {
    fn virtual_size(&self) -> usize {
        // Size in bytes used for bandwidth simulation.
        // Can be much bigger than the real in-memory size to simulate heavy payloads.
        1000
    }
}

// Alternatively, if bandwidth simulation is not needed, use the empty impl
// below instead of the one above (a type cannot have both):
// impl Message for MyMessage {}
```
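To give a feel for how `virtual_size` interacts with a bounded bandwidth, the sketch below estimates how many jiffies a message of a given size occupies a link. It assumes a plain ceiling division of the size by a bytes-per-jiffy limit. `transfer_jiffies` is a hypothetical helper, not part of DScale, and the framework's actual bandwidth model may differ (for example, by queueing messages).

```rust
/// Hypothetical helper (not a DScale API): number of jiffies needed to push
/// `virtual_size` bytes through a link limited to `bytes_per_jiffy`.
fn transfer_jiffies(virtual_size: usize, bytes_per_jiffy: usize) -> usize {
    assert!(bytes_per_jiffy > 0, "bandwidth must be positive");
    // Ceiling division: a partially used jiffy still counts as a whole one.
    (virtual_size + bytes_per_jiffy - 1) / bytes_per_jiffy
}
```

Under this assumption, a message with `virtual_size` 1000 on a 1000 bytes-per-jiffy link takes one jiffy, while a 1001-byte message takes two.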
### 2. Implement Process Logic
Implement `Process` to define how your process reacts to initialization, messages, and timers.
```rust
use dscale::*;

#[derive(Default)]
struct MyProcess;

impl Process for MyProcess {
    fn on_start(&mut self, _seed: Seed) {
        schedule_timer_after(Jiffies(100));
    }

    fn on_message(&mut self, from: Pid, message: MessagePtr) {
        if let Some(msg) = message.try_as_type::<MyMessage>() {
            dscale_debug!("Received message from {}: {}", from, msg.data);
        }
    }

    fn on_timer(&mut self, _id: TimerId) {
        broadcast(MyMessage { data: 42 });
    }
}
```
### 3. Run the Simulation
Use `SimulationBuilder` to configure the topology, network constraints, and start the simulation.
```rust
use dscale::*;

// `MyClient` and `MyServer` are `Process` implementations, defined like `MyProcess` in step 2.
fn main() {
    let mut runner = SimulationBuilder::default()
        .add_pool::<MyClient>("Client", 1)
        .add_pool::<MyServer>("Server", 3)
        .default_latency(Distr::Uniform { low: Jiffies(1), high: Jiffies(5) })
        .between_pool_latency(
            "Client",
            "Server",
            Distr::Normal {
                mean: Jiffies(10),
                std_dev: Jiffies(2),
                low: Jiffies(5),
                high: Jiffies(20),
            },
        )
        .vnic_bandwidth(BandwidthConfig::Bounded { inbound: 1000, outbound: 1000 })
        .time_budget(Jiffies(1_000_000))
        .name("My simulation optional name")
        .seq_sched()
        .build();

    runner.run_full_budget();
}
```
#### Parallel Execution
For large simulations, enable parallel execution to distribute process steps across multiple threads:
```rust
let mut runner = SimulationBuilder::default()
    .add_pool::<MyProcess>("Nodes", 1000)
    .within_pool_latency("Nodes", Distr::Uniform { low: Jiffies(1), high: Jiffies(10) })
    .time_budget(Jiffies(1_000_000))
    .par_sched(ThreadNumber::Specific(8)) // use 8 worker threads
    .build();

runner.run_full_budget();
```
The parallel scheduler is most effective when:
1. The simulation contains many processes (at least 200-300).
2. Time spent inside `on_message`/`on_timer` handlers dominates the total simulation time.
3. The handlers do mostly independent work, with little synchronization.
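
A rough way to reason about these conditions is Amdahl's law: if only a fraction of the run time is spent in handlers that can execute concurrently, the achievable speedup is bounded no matter how many threads are used. The sketch below is a generic estimate, not a DScale API; `amdahl_speedup` is a hypothetical helper, and the fraction you feed it is something you would have to measure yourself.

```rust
/// Amdahl's law: upper bound on the speedup when a fraction
/// `parallel_fraction` of the run time can be spread over `threads` workers.
fn amdahl_speedup(parallel_fraction: f64, threads: u32) -> f64 {
    assert!((0.0..=1.0).contains(&parallel_fraction));
    assert!(threads > 0);
    let serial_fraction = 1.0 - parallel_fraction;
    1.0 / (serial_fraction + parallel_fraction / threads as f64)
}
```

For instance, if handlers account for half of the run time, two threads can give at most a 1.33x speedup under this model, which is why handler-dominated workloads benefit the most.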
#### Fault Injection
DScale supports injecting network faults into simulations. Faults are scheduled as events — you specify when a fault starts and when it ends.
> [!CAUTION]
> Fault injection is only supported in single-threaded (`sequential`) mode. Using faults with `parallel` mode will panic at build time.
```rust
let mut runner = SimulationBuilder::default()
    .add_pool::<MyProcess>("Nodes", 5)
    .within_pool_latency("Nodes", Distr::Uniform { low: Jiffies(1), high: Jiffies(5) })
    .time_budget(Jiffies(1_000_000))
    // Break the link between pid 0 and pid 1 from time 100 to 500
    .break_link(Jiffies(100), Jiffies(500), 0, 1)
    // Isolate pid 2 (all links broken) from time 200 to 800
    .isolate(Jiffies(200), Jiffies(800), 2)
    .seq_sched()
    .build();

runner.run_full_budget();
```
| Method | Description |
|---|---|
| `break_link(start, end, pid1, pid2)` | Breaks the link between two pids for the given time interval |
| `isolate(start, end, pid)` | Isolates a pid (breaks all its links) for the given time interval |
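
To illustrate the semantics of time-bounded faults, here is a small standalone model of the two operations. It is a sketch under stated assumptions, not DScale's internals: `LinkFault`, `link_is_broken`, and this free-standing `isolate` are hypothetical names, links are assumed symmetric, and a fault is assumed active on the half-open interval `[start, end)` (the document does not say whether the end point is inclusive).

```rust
/// Hypothetical model of one scheduled link fault (not DScale's internal type).
#[derive(Debug, Clone, Copy)]
struct LinkFault {
    start: u64,
    end: u64,
    a: usize,
    b: usize,
}

/// Returns true if the link between `x` and `y` is broken at time `now`.
/// Assumption: faults are active on [start, end) and links are symmetric.
fn link_is_broken(faults: &[LinkFault], now: u64, x: usize, y: usize) -> bool {
    faults.iter().any(|f| {
        let same_link = (f.a == x && f.b == y) || (f.a == y && f.b == x);
        same_link && f.start <= now && now < f.end
    })
}

/// Models isolation as breaking every link between `pid` and the other
/// processes among pids `0..n`.
fn isolate(start: u64, end: u64, pid: usize, n: usize) -> Vec<LinkFault> {
    (0..n)
        .filter(|&other| other != pid)
        .map(|other| LinkFault { start, end, a: pid, b: other })
        .collect()
}
```

In this model, isolating one pid out of five simply expands into four link faults that share the same time window.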
## Public API
### Simulation Control
**`SimulationBuilder`**: configures the simulation environment.

| Method | Description |
|---|---|
| `default` | Creates a simulation with no processes and default parameters |
| `seed` | Sets the random seed for deterministic execution |
| `time_budget` | Sets the maximum simulation duration |
| `add_pool` | Creates a named pool of processes (all processes also join `GLOBAL_POOL`) |
| `default_latency(distribution)` | Configures the default latency distribution, used wherever no other distribution is configured explicitly |
| `within_pool_latency(pool, distribution)` | Configures latency between processes within a pool |
| `between_pool_latency(pool_a, pool_b, distribution)` | Configures latency between two pools (symmetric). Every pool pair must have latency configured before calling `build` |
| `vnic_bandwidth` | Configures per-process bandwidth limits for the virtual NIC. `Bounded { inbound, outbound }` limits bandwidth in bytes per jiffy; `Unbounded` applies no limit (default) |
| `seq_sched` | Selects single-threaded execution (default). Mutually exclusive with `par_sched`; calling both panics |
| `par_sched(threads)` | Selects parallel execution with the given number of worker threads. Mutually exclusive with `seq_sched`; calling both panics |
| `break_link(start, end, pid1, pid2)` | Breaks the link between two pids for the given time interval. Requires `FAULT=on`. Sequential mode only |
| `isolate(start, end, pid)` | Isolates a pid (breaks all its links) for the given time interval. Requires `FAULT=on`. Sequential mode only |
| `name` | Sets the name of the simulation instance |
| `build` | Finalizes configuration and returns a simulation runner |

**`SimulationRunner`**

| Method | Description |
|---|---|
| `run_full_budget` | Runs the simulation until the time budget is exhausted |
| `run_steps` | Runs the simulation until it has performed the requested number of steps or the global budget is exhausted |
| `run_sub_budget` | Runs the simulation until either the sub-budget (counted from the current time) or the global budget is exhausted |
### Network Topology
**`Constants`**

| Constant | Description |
|---|---|
| `GLOBAL_POOL` | Implicit pool containing all processes. `broadcast` uses this pool |

**`Distributions`**

| Distribution | Description |
|---|---|
| `Uniform {low, high}` | Uniform distribution over `[low, high]` |
| `Bernoulli {p, value}` | With probability `p` the latency is `value`, otherwise 0 |
| `Normal {mean, std_dev, low, high}` | Truncated normal distribution clamped to `[low, high]` |
| `Pareto {scale, shape}` | Pareto distribution |
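
The following standalone sketch shows what sampling from the two simplest distributions could look like, and why a fixed seed makes latencies reproducible. Everything here is an assumption made for illustration: `SplitMix64`, `sample_uniform`, and `sample_bernoulli` are hypothetical names, and DScale's own generator and sampling code are not described in this document. `Normal` and `Pareto` are left out to keep the example dependency-free.

```rust
/// Tiny deterministic PRNG (SplitMix64), used only to avoid external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform float in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Uniform latency over the inclusive range [low, high].
/// Modulo bias is ignored for simplicity.
fn sample_uniform(rng: &mut SplitMix64, low: u64, high: u64) -> u64 {
    assert!(low <= high);
    low + rng.next_u64() % (high - low + 1)
}

/// Bernoulli latency: `value` with probability `p`, otherwise 0.
fn sample_bernoulli(rng: &mut SplitMix64, p: f64, value: u64) -> u64 {
    if rng.next_f64() < p { value } else { 0 }
}
```

Two generators created from the same seed yield the same sequence of latencies, which is the property a deterministic simulation relies on.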
### Process Interaction (Context-Aware)
These functions are available globally but must be called within the context of a running process step.

| Function | Description |
|---|---|
| `broadcast` | Shortcut for `broadcast_within_pool(GLOBAL_POOL)` |
| `broadcast_within_pool` | Sends a message to all processes within a named pool |
| `send_to` | Sends a message to a specific process by pid |
| `send_random` | Shortcut for `send_random_from_pool(GLOBAL_POOL)` |
| `send_random_from_pool` | Sends a message to a random process within a named pool |
| `schedule_timer_after` | Schedules a timer for the current process, returns a `TimerId` |
| `pid` | Returns the pid of the currently executing process (pids start at 0) |
| `now` | Returns the current simulation time |
| `list_pool` | Returns a vector of the pids of all processes in a pool |
| `choose_from_pool` | Picks a random process pid from a named pool |
| `unique_id` | Generates a globally unique monotonic ID |
### Key-Value Store (`dscale::services::kv`)
Thread-safe store for passing shared state, metrics, or configuration between processes or back to the host.
> [!WARNING]
> (1) Do not call these functions from a custom `Default` implementation of your process; use the `on_start` handler instead. (2) Frequent `modify` calls on the same key(s) can cause heavy contention and degrade performance under the parallel scheduler.

| Function | Description |
|---|---|
| `set(key, value)` | Stores a value under the given key |
| `get(key) -> T` | Retrieves a clone of the value (panics if missing or of the wrong type) |
| `modify(key, f)` | Mutates the value in place (panics if missing or of the wrong type) |
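
To make the `set`/`get`/`modify` semantics concrete, here is a minimal standalone sketch of a typed, thread-safe store built on the standard library. It is an illustration only: `KvStore` is a hypothetical type, its methods take `&self` rather than being free functions, and nothing here is claimed to match how `dscale::services::kv` is implemented.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for a typed key-value store (not DScale's code).
#[derive(Default)]
struct KvStore {
    inner: Mutex<HashMap<String, Box<dyn Any + Send>>>,
}

impl KvStore {
    /// Stores `value` under `key`, replacing any previous value.
    fn set<T: Any + Send>(&self, key: &str, value: T) {
        self.inner.lock().unwrap().insert(key.to_string(), Box::new(value));
    }

    /// Returns a clone of the value; panics if the key is missing or the
    /// stored value has a different type.
    fn get<T: Any + Clone>(&self, key: &str) -> T {
        let map = self.inner.lock().unwrap();
        map.get(key)
            .expect("missing key")
            .downcast_ref::<T>()
            .expect("wrong type")
            .clone()
    }

    /// Mutates the value in place while holding the lock.
    fn modify<T: Any>(&self, key: &str, f: impl FnOnce(&mut T)) {
        let mut map = self.inner.lock().unwrap();
        let value = map
            .get_mut(key)
            .expect("missing key")
            .downcast_mut::<T>()
            .expect("wrong type");
        f(value);
    }
}
```

Because this sketch guards every key with a single lock, many concurrent `modify` calls serialize on it, which mirrors the kind of contention the warning above describes.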
### Macros
All logging macros prefix output with the current simulation time and process pid (`[Now: ... | P...]`). Output is controlled by the `RUST_LOG` environment variable.

| Macro | Description |
|---|---|
| `dscale_trace!` | Logs at **trace** level |
| `dscale_debug!` | Logs at **debug** level |
| `dscale_info!` | Logs at **info** level |
| `dscale_warn!` | Logs at **warn** level |
| `dscale_error!` | Logs at **error** level |
### Helpers (`dscale::helpers`)
| Helper | Description |
|---|---|
| `Combiner` | Collects values until a threshold is reached, then yields them all at once. Useful for quorum-based logic |
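
The idea behind a threshold-based collector can be sketched in a few lines. The type below is a hypothetical illustration in the spirit of `Combiner`, not its actual API: the name `QuorumCollector`, the `push` method, and the choice to start collecting from scratch after yielding are all assumptions.

```rust
/// Hypothetical quorum collector; the real `Combiner` API may differ.
struct QuorumCollector<T> {
    threshold: usize,
    values: Vec<T>,
}

impl<T> QuorumCollector<T> {
    fn new(threshold: usize) -> Self {
        assert!(threshold > 0, "threshold must be positive");
        Self { threshold, values: Vec::with_capacity(threshold) }
    }

    /// Adds a value. Once `threshold` values have been collected, returns
    /// them all and resets the collector; otherwise returns `None`.
    fn push(&mut self, value: T) -> Option<Vec<T>> {
        self.values.push(value);
        if self.values.len() >= self.threshold {
            Some(std::mem::take(&mut self.values))
        } else {
            None
        }
    }
}
```

A process waiting for acknowledgements from a majority of five peers could, in this model, create the collector with a threshold of 3 and act as soon as `push` returns `Some`.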
### Message Downcasting (`MessagePtr`)
| Method | Description |
|---|---|
| `try_as_type::<T>()` | Attempts to downcast to `T`, returns `Option<&T>` |
| `as_type::<T>()` | Downcasts to `T`, panics if the type does not match |
| `is::<T>()` | Returns `true` if the message is of type `T` |
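
These three methods follow the usual type-erasure pattern from `std::any::Any`. The sketch below reproduces that pattern with a hypothetical `ErasedMessage` wrapper; it stands in for `MessagePtr` purely for illustration and makes no claim about how `MessagePtr` is built. `Ping` and `Pong` are made-up message types.

```rust
use std::any::Any;

struct Ping {
    seq: u32,
}

struct Pong;

/// Hypothetical type-erased message handle (illustrative stand-in for `MessagePtr`).
struct ErasedMessage(Box<dyn Any>);

impl ErasedMessage {
    fn new<T: Any>(msg: T) -> Self {
        Self(Box::new(msg))
    }

    /// Returns true if the stored message has type `T`.
    fn is<T: Any>(&self) -> bool {
        self.0.is::<T>()
    }

    /// Returns a reference to the message if it has type `T`.
    fn try_as_type<T: Any>(&self) -> Option<&T> {
        self.0.downcast_ref::<T>()
    }

    /// Like `try_as_type`, but panics on a type mismatch.
    fn as_type<T: Any>(&self) -> &T {
        self.try_as_type::<T>().expect("message type mismatch")
    }
}
```

A handler that accepts several message types would typically chain `try_as_type` checks, as the `on_message` example earlier does, and reserve `as_type` for cases where a mismatch is a bug.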
## Logging Configuration (`RUST_LOG`)
DScale output is controlled via the `RUST_LOG` environment variable.
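- **`RUST_LOG=[some_level]`**: enables output from every `dscale_[level]!` macro whose level is at or below `some_level` in verbosity (for example, `RUST_LOG=info` enables `dscale_error!`, `dscale_warn!`, and `dscale_info!`).
- **`RUST_LOG=full::path::to::your::file::or::crate=[level],another::path=[level]`**: restricts output to the listed files or crates, each with its own level.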
> [!WARNING]
> Levels more verbose than `info` (`RUST_LOG=[level > info]`, that is `debug` and `trace`) only work without the `--release` flag.
## Examples
Usage examples are available in the [examples directory](https://codeberg.org/kshprenger/dscale/src/branch/main/examples).
## Paper
The paper describing the algorithms behind DScale is available [here](https://codeberg.org/kshprenger/dscale/src/branch/main/paper.pdf).
## Thanks to
- https://gitlab.com/whirl-framework
- https://github.com/jepsen-io/maelstrom
- https://github.com/systems-group/anysystem
- https://www.nsnam.org
- https://omnetpp.org
- https://peersim.sourceforge.net