malstrom 0.1.0

Malstrom is a distributed, stateful stream processing framework written in Rust.
Documentation

What is Malstrom?

Malstrom is a distributed, stateful stream processing framework written in Rust. In usage it is similar to Apache Flink or bytewax, although implemented fundamentally differently. Malstrom's goal is to offer best-in-class usability, reliability and performance, enabling everyone to build fast parallel systems with unparalleled up-time.


Distributed: Malstrom can run on many machines in parallel, sharing the processing workload and enabling zero-downtime scaling to fit any demand. Kubernetes is supported as a first-class deployment environment, others can be added through a public trait interface.

Stateful: Processing jobs can hold arbitrary state, which is snapshotted regularly to persistent storage like disk or S3. In case of failure or restarts, the job resumes from the last snapshot. Malstroms utilizes the ABS Algorithm, ensuring every message affects the state exactly once.

Usability: Malstrom provides a straight-forward dataflow API, which can be extended when needed. A simple threading model means no async, no complex lifetimes, no Send or Sync needed. Data only needs to be serialisable when explicitly send to other processes.

Reliability: Using the world's safest programming language makes building highly-reliable stream processors a breeze. In any case zero-downtime scaling and zero-downtime upgrades (TBD) allow for awesome uptime.

Code Example

//! Stream processing can be easy!
use malstrom::operators::*;
use malstrom::runtime::MultiThreadRuntime;
use malstrom::sinks::{StatelessSink, StdOutSink};
use malstrom::snapshot::NoPersistence;
use malstrom::sources::{SingleIteratorSource, StatelessSource};
use malstrom::worker::StreamProvider;

fn main() {
    MultiThreadRuntime::builder()
        .persistence(NoPersistence)
        .parrallelism(1)
        .build(build_dataflow)
        .execute()
        .unwrap();
}

fn build_dataflow(provider: &mut dyn StreamProvider) {
    provider
        .new_stream()
        .source(
            "words",
            StatelessSource::new(SingleIteratorSource::new([
                "Look",
                "ma'",
                "I'm",
                "streaming",
            ])),
        )
        .map("upper", |x| x.to_uppercase())
        .sink("stdout", StatelessSink::new(StdOutSink));
}

This outputs

{ key: NoKey, value: "LOOK", timestamp: 0 }
{ key: NoKey, value: "MA'", timestamp: 1 }
{ key: NoKey, value: "I'M", timestamp: 2 }
{ key: NoKey, value: "STREAMING", timestamp: 3 }

How production ready is Malstrom

Not very, all the documented functionality works but as of version 0.1.0 Malstrom is more a proof of concept than a production ready streaming framework.

Why is it called Malstrom?

"Malstrom" is the German name for the Moskstraumen one of the strongest and fastest tidal currents in the world.