arcon 0.2.0

A runtime for writing streaming applications

Arcon - State-first Streaming Applications in Rust.

What is Arcon

Arcon is a library for building real-time analytics applications in Rust. The Arcon runtime is based on the Dataflow model, similar to systems such as Apache Flink and Timely Dataflow.

Key features:

  • Out-of-order Processing
  • Event-time
  • Watermarks
  • Epoch Snapshotting for Exactly-once Processing
  • Hybrid Row (Protobuf) / Columnar (Arrow) System
  • Modular State Backend Abstraction

The Arcon philosophy is state first. Most other streaming systems are output-centric and offer no way to work with internal state under time semantics. Arcon's upcoming TSS query language will allow extracting and operating on state snapshots consistently, based on application-time constraints, and interfacing with other systems for batch and warehouse analytics.
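To make the out-of-order, event-time, and watermark features listed above more concrete, here is a minimal standard-library sketch of the general idea. It is not Arcon's implementation or API: `EventTimeBuffer`, `insert`, and `advance_watermark` are hypothetical names chosen for illustration, and it assumes a watermark `w` means "no further events with timestamp <= w are expected".

```rust
use std::collections::BTreeMap;

/// Hypothetical buffer (not an Arcon type) that holds events until a
/// watermark guarantees that nothing older can still arrive.
struct EventTimeBuffer {
    // Events keyed by event-time timestamp; BTreeMap keeps keys sorted.
    pending: BTreeMap<u64, Vec<&'static str>>,
}

impl EventTimeBuffer {
    fn new() -> Self {
        Self { pending: BTreeMap::new() }
    }

    /// Accept an event in whatever order it arrives.
    fn insert(&mut self, timestamp: u64, payload: &'static str) {
        self.pending.entry(timestamp).or_default().push(payload);
    }

    /// Release every buffered event with timestamp <= `watermark`,
    /// ordered by event time. Later events stay buffered.
    fn advance_watermark(&mut self, watermark: u64) -> Vec<(u64, &'static str)> {
        // split_off returns the entries with key >= watermark + 1 and
        // leaves the entries with key <= watermark in `self.pending`.
        let later = self.pending.split_off(&(watermark + 1));
        let ready = std::mem::replace(&mut self.pending, later);
        ready
            .into_iter()
            .flat_map(|(ts, payloads)| payloads.into_iter().map(move |p| (ts, p)))
            .collect()
    }
}
```

In this sketch, inserting events with timestamps 3, 1, 5, 2 and then advancing the watermark to 3 releases the events for 1, 2, and 3 in event-time order, while the event at 5 stays buffered until a later watermark covers it.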

Disclaimer

Arcon is still in development and should be considered experimental!

The APIs may break and you should not be running Arcon with important data!

Example

use arcon::prelude::*;

let mut app = Application::default()
    .iterator(0..100, |conf| {
        conf.set_arcon_time(ArconTime::Process);
    })
    .filter(|x| *x > 50)
    .to_console()
    .build();

app.start();
app.await_termination();
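As a rough guide to which elements should reach the console sink, the same source range and predicate can be run through a plain standard-library iterator. This is only a stand-in, not Arcon code: `filtered_source` is a hypothetical helper, and it assumes Arcon's `filter` keeps the elements for which the closure returns true, as `Iterator::filter` does. It says nothing about the order or timing in which the runtime prints them.

```rust
/// Hypothetical helper: the example's source range and filter predicate,
/// evaluated eagerly with std iterators instead of the Arcon runtime.
fn filtered_source() -> Vec<i32> {
    (0..100).filter(|x| *x > 50).collect()
}
```

Under that assumption the pipeline passes the 49 values from 51 through 99 to the sink.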

Feature Flags

  • rocksdb
      • Enables RocksDB to be used as a state backend.

  • metrics
      • Records internal runtime metrics and allows users to register custom metrics from an Operator.
      • If no exporter (e.g., prometheus_exporter) is enabled, the runtime logs the metrics.

  • hardware_counters
      • Enables hardware counters such as cache misses and branch misses.
      • Linux only, because it uses perf_event_open() under the hood.
      • The built binary needs the CAP_SYS_ADMIN capability, e.g.: setcap cap_sys_admin+ep target/debug/collection (the command takes the built file as its argument).
      • If the feature flag is enabled and the capability has not been granted, the result is an "Operation not permitted" error.

  • prometheus_exporter
      • With this flag enabled, the metrics are available through the Prometheus scrape endpoint, assuming there is a running Prometheus instance.
      • Add a target to the Prometheus config:

            - job_name: 'metrics-exporter-prometheus-http'
              scrape_interval: 1s
              static_configs:
                - targets: ['localhost:9000']

  • allocator_metrics
      • With this feature on, the runtime records allocator metrics (e.g., total_bytes, bytes_remaining, alloc_counter).

  • state_metrics
      • With this feature on, the runtime records various state metrics (e.g., bytes in/out, last checkpoint size).