Expand description
§Datum
🪼 - Just “Stream” anything :]
Datum is a Rust stream processing library scaffolded around Ractor actors and a push-based stream abstraction. The compatibility target is Akka/Pekko Streams Typed API shape and behavior, with Rust-native ownership, async, and benchmarking constraints.
§Install
Add Datum to your project from crates.io:
cargo add datum-coreOr in Cargo.toml:
[dependencies]
datum-core = "0.4"Note: The package is published as
datum-core(the namedatumis taken on crates.io), but the import path staysuse datum::…— the crate’s library name isdatum.
To track an unreleased commit instead:
[dependencies]
datum-core = { git = "https://github.com/Aethergrids/Datum", tag = "v0.4.0" }§Upstream References
- Ractor crate:
0.15.13 - Optional Ractor cluster crate:
0.15.13, enabled with--features cluster - Akka source submodule:
third_party/akka, tracking upstreammainand pinned in this repository at commit58f1f6db2e505e87f5dc115ee9476833872e7ae0 - Latest stable Akka tag observed during setup:
v2.10.18
§Feature highlights (v0.4.0)
- Graph cycles —
MergePreferred/Broadcastfeedback loops supported; unbuffered cycles surfaceEventLimitExceededinstead of hanging. - StreamRefs —
StreamRefs::source_ref()/sink_ref(), Ractor-backed one-shot streaming handles for actor/process boundary crossing. - IO adapters —
StreamConverters::as_input_stream/as_output_streambridging Datum streams tostd::io::Read/Write. - Performance — the typed graph executor now covers all major junction shapes and cyclic feedback; nearly every benchmarked hot path meets or beats warmed Akka/Pekko (see benchmark tables linked below for honest per-row numbers, including below-parity rows).
#![forbid(unsafe_code)]on thedatumcrate itself; new safe depscrossbeam-queueandarc-swaphandle lock-free queues and hub snapshots without adding unsafe to the crate.
§Development
cargo test
cargo check --benches
cargo bench --bench push_baseline
cargo bench --bench source_flow
cargo bench --bench materialization
cargo bench --bench graph
cargo bench --bench actor_ask
benches/actor_ask_compare/run.shuse datum::{Sink, Source};
let sum = Source::from_iter(0_u64..1_000)
.map(|item| item + 1)
.filter(|item| item % 2 == 0)
.run_with(Sink::fold(0_u64, |acc, item| acc + item))
.unwrap()
.wait()
.unwrap();§Benchmarks
Datum is benchmarked head-to-head against warmed Akka/Pekko Streams across seven areas — Source/Flow,
materialization, graph/junctions, actor ask, dynamic streams, streaming IO, and substreams. As of
v0.4.0, Datum is at or above Akka on the large majority of benchmarked paths. Honest per-path numbers
(including remaining below-parity rows) are in the result tables under roadmap/benchmarks/. The
harness adds a Datum CPU us/op column on purpose: some wins come partly from busy-spinning while
Akka parks — a real cost the wall-clock number hides.
Qualitative summary of the current state:
- Fused linear path: 13–44x Akka (typed plan; all common sink shapes).
- Junctions: 1–340x, with most shapes well above parity; typed kernels now cover concat, merge, broadcast/zip, balance/merge, partition, prioritized merge, merge-preferred, merge-sorted, merge-sequence, and merge-latest.
- Graph cycles (typed feedback kernel): ~29x Akka on the
MergePreferred/Broadcastfeedback shape. - BidiFlow: 3.7–5x Akka.
- Dynamic hubs: MergeHub p16 13x, p4 2.1x (p1 below parity — remaining lever documented in
roadmap/benchmarks/dynamic-streams.md); BroadcastHub 1.3–2x; PartitionHub 1.4–2.7x. - Bounded queue: ~parity (0.96x wall) with ~112x less allocation than Akka.
- map_async: ≥2x at p4/p32 (Tokio dispatch).
- flat_map_merge: 1.4–1.8x.
- Streaming IO adapters: ~7.4x on the round-trip scenario.
- StreamRefs: 0.24–0.26x (Ractor-bound; levers documented).
- Actor ask: ~1.3x at p1; allocation dominated by upstream Ractor boxing (~848 B/element, upstream-blocked).
- Materialization: at parity.
Run a comparison (requires a JDK + sbt); each runner writes a rendered table under target/<area>-comparison/:
benches/source_flow_compare/run.sh
benches/materialization_compare/run.sh
benches/graph_compare/run.sh
benches/actor_ask_compare/run.sh
benches/dynamic_streams_compare/run.sh
benches/streaming_io_compare/run.sh
benches/substreams_compare/run.shCurrent result tables, the per-operator coverage matrix, and apples-to-apples caveats:
roadmap/benchmarks/source-flow.mdroadmap/benchmarks/materialization.mdroadmap/benchmarks/graph.mdroadmap/benchmarks/actor-ask.mdroadmap/benchmarks/dynamic-streams.mdroadmap/benchmarks/streaming-io.mdroadmap/benchmarks/substreams.mdroadmap/M1-v0.1.0-foundation.md— coverage matrix, optimization status & apples-to-apples caveatsroadmap/M4-v0.4.0-completeness-hardening.md— M4 work packages and exit criteria
See roadmap/ for the full milestone roadmap. User documentation is the published VitePress site (Cloudflare Pages); docs/ holds its source. roadmap/ holds internal planning and benchmark records.
Re-exports§
pub use actor::ActorFlow;pub use actor::ActorPubSub;pub use actor::ActorSink;pub use actor::ActorSinkBackpressureMessage;pub use actor::ActorSinkMessage;pub use actor::ActorSource;pub use actor::ActorSourceMessage;pub use actor::ActorStatus;pub use actor::ReplyPort;pub use actor::ReplySendError;pub use actor::SinkRef;pub use actor::SourceRef;pub use actor::StreamRefSettings;pub use actor::StreamRefs;pub use actor::WatchEvent;pub use context::FlowWithContext;pub use context::SourceWithContext;pub use dynamic::BroadcastHub;pub use dynamic::BroadcastHubConsumerSource;pub use dynamic::KillSwitches;pub use dynamic::MergeHub;pub use dynamic::MergeHubDrainingControl;pub use dynamic::PartitionConsumerInfo;pub use dynamic::PartitionHub;pub use dynamic::PartitionHubConsumerSource;pub use dynamic::UniqueKillSwitch;pub use graph::AnyInlet;pub use graph::AnyOutlet;pub use graph::AsyncBoundary;pub use graph::AsyncBoundaryExecutionConfig;pub use graph::Balance;pub use graph::BidiShape;pub use graph::Broadcast;pub use graph::Buffer;pub use graph::Concat;pub use graph::FanInShape;pub use graph::FanOutShape;pub use graph::FanOutShape2;pub use graph::FlowShape;pub use graph::FlowShape as GraphFlowShape;pub use graph::FusedExecutionConfig;pub use graph::FusedExecutionReport;pub use graph::FusedTerminalReport;pub use graph::Graph;pub use graph::GraphBlueprint;pub use graph::GraphBuilder;pub use graph::GraphDsl;pub use graph::GraphStage;pub use graph::GraphStageLogic;pub use graph::Identity;pub use graph::ImportedGraph;pub use graph::InHandler;pub use graph::Inlet;pub use graph::Interleave;pub use graph::MapStage;pub use graph::Merge;pub use graph::MergeLatest;pub use graph::MergePreferred;pub use graph::MergePreferredShape;pub use graph::MergePrioritized;pub use graph::MergeSequence;pub use graph::MergeSorted;pub use graph::OrElse;pub use graph::OutHandler;pub use graph::Outlet;pub use graph::PartialGraph;pub use graph::Partition;pub use graph::PortId;pub use graph::PortKind;pub use graph::SinkShape;pub use graph::SourceShape;pub use graph::StageSpec;pub use graph::TakeWhile;pub use graph::Unzip;pub use graph::UnzipWith;pub use graph::Zip;pub use graph::ZipShape;pub use io::Compression;pub use io::FileIO;pub use io::Framing;pub use io::FramingByteOrder;pub use io::InputStreamHandle;pub use io::IoResult;pub use io::OutputStreamHandle;pub use io::StreamConverters;pub use io::TcpBinding;pub use io::TcpConnection;pub use io::TcpIncomingConnection;pub use io::TokioByteSink;pub use io::TokioByteSource;pub use io::TokioFileIO;pub use io::TokioTcp;pub use queue::BoundedSourceQueue;pub use queue::QueueOfferResult;pub use queue::SinkQueue;pub use queue::SourceQueue;pub use stream::AggregateTimer;pub use stream::BidiFlow;pub use stream::Cancellable;pub use stream::DelayOverflowStrategy;pub use stream::Demand;pub use stream::Flow;pub use stream::Keep;pub use stream::Materializer;pub use stream::MaybeHandle;pub use stream::NotUsed;pub use stream::OverflowStrategy;pub use stream::PushOutlet;pub use stream::RestartFlow;pub use stream::RestartSettings;pub use stream::RestartSink;pub use stream::RestartSource;pub use stream::RetryFlow;pub use stream::RunnableGraph;pub use stream::Runtime;pub use stream::Sink;pub use stream::SinkCombineStrategy;pub use stream::Source;pub use stream::SourceCombineStrategy;pub use stream::StreamCompletion;pub use stream::StreamError;pub use stream::StreamResult;pub use stream::Supervision;pub use stream::SupervisionDecider;pub use stream::SupervisionDirective;pub use stream::ThrottleMode;
Modules§
- actor
- Actor interop APIs backed by Ractor for local execution.
- context
- Context-aware stream wrappers.
- dynamic
- Dynamic stream controls and attachment points modeled after Akka Streams.
- graph
- Runtime-checked graph, shape, port, and junction primitives.
- io
- Streaming I/O — file sources and sinks, TCP, byte framing, and compression.
- queue
- Bounded and unbounded queues that bridge external producers into a Datum stream.
- stream
- Core push-stream vocabulary and the first reusable Source/Flow API slice.
- testkit
- Stream testing probes built on Datum’s normal runtime.
Structs§
- Actor
Ref - An ActorRef is a strongly-typed wrapper over an ActorCell to provide some syntactic wrapping on the requirement to pass the actor’s message type everywhere.
- Attributes