Orlando
A virtual actor framework in Rust, inspired by Microsoft Orleans.
What is a virtual actor?
Traditional actors (Erlang, Akka) require you to manually create, manage, and destroy actor instances. Virtual actors flip this: every actor conceptually always exists. You never create or destroy one — you just talk to it by identity, and the runtime handles the rest.
- Automatic lifecycle -- A grain (virtual actor) is activated the first time someone sends it a message. After sitting idle, the runtime deactivates it to free resources. If someone talks to it again later, it reactivates transparently.
- Single-threaded by design -- Each grain processes exactly one message at a time. No mutexes, no data races, no locks. Your handler is a plain `async fn` that owns its state exclusively.
- Location transparency -- Callers address grains by type + key (e.g. `Counter/"room-42"`), not by address. In a cluster, the runtime routes messages to the correct silo automatically.
This model was pioneered by Microsoft Orleans for building distributed systems like Halo's backend services. Orlando brings the same programming model to Rust.
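The model described above can be sketched with nothing but the standard library. This toy is not Orlando's implementation: `ToyRuntime`, `CounterMsg`, and the thread-per-activation design are assumptions made for illustration (Orlando's handlers are `async fn`, and idle deactivation is left out here). It shows two ideas: a grain is activated lazily the first time its key is addressed, and a mailbox loop handles one message at a time, so the state needs no lock.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Messages understood by a hypothetical counter grain.
/// Each message carries a channel on which the reply is sent back.
enum CounterMsg {
    Add(i64, Sender<i64>),
    Get(Sender<i64>),
}

/// Start one activation: a thread that owns `total` and drains its mailbox.
fn spawn_activation() -> Sender<CounterMsg> {
    let (tx, rx) = channel::<CounterMsg>();
    thread::spawn(move || {
        let mut total: i64 = 0; // exclusively owned state, no mutex needed
        for msg in rx {
            // Exactly one message is handled per loop iteration (one "turn").
            match msg {
                CounterMsg::Add(n, reply) => {
                    total += n;
                    let _ = reply.send(total);
                }
                CounterMsg::Get(reply) => {
                    let _ = reply.send(total);
                }
            }
        }
    });
    tx
}

/// Toy runtime: maps a grain key to the mailbox of its live activation.
#[derive(Default)]
struct ToyRuntime {
    mailboxes: HashMap<String, Sender<CounterMsg>>,
}

impl ToyRuntime {
    /// Return the mailbox for `key`, activating the grain on first use.
    fn mailbox(&mut self, key: &str) -> Sender<CounterMsg> {
        self.mailboxes
            .entry(key.to_string())
            .or_insert_with(spawn_activation)
            .clone()
    }

    /// Send `Add(n)` to the grain identified by `key` and wait for the reply.
    fn add(&mut self, key: &str, n: i64) -> i64 {
        let (reply_tx, reply_rx) = channel();
        self.mailbox(key)
            .send(CounterMsg::Add(n, reply_tx))
            .expect("activation thread is alive");
        reply_rx.recv().expect("grain replied")
    }

    /// Send `Get` to the grain identified by `key` and wait for the reply.
    fn get(&mut self, key: &str) -> i64 {
        let (reply_tx, reply_rx) = channel();
        self.mailbox(key)
            .send(CounterMsg::Get(reply_tx))
            .expect("activation thread is alive");
        reply_rx.recv().expect("grain replied")
    }
}
```

Callers never construct a counter. They call `rt.add("room-42", 2)` on a `ToyRuntime` and the activation appears on demand; a key that has never been addressed simply starts from its default state.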
Features
Core Runtime
- Turn-based execution -- Each grain processes one message at a time via a mailbox loop. Handlers are `async fn` with exclusive `&mut State`.
- Reentrant grains -- Opt-in concurrent message dispatch via `#[grain(reentrant)]`. Multiple handlers run concurrently, with state access serialized by an async mutex.
- Stateless workers -- Pool of identical grain instances for compute-heavy workloads. Round-robin dispatch via `#[grain(stateless_worker)]`.
- Typed grain references -- `GrainRef<G>` is a cheap, cloneable handle. `.ask(msg).await` sends a message and returns the reply.
- Grain call filters -- Cross-cutting interceptors (logging, metrics, auth) on every `ask()` call via the `GrainCallFilter` trait.
- Request context propagation -- Key-value context (trace IDs, tenant IDs) flows automatically through grain-to-grain call chains, including cross-silo.
- Backpressure -- `try_ask()` fails immediately if the mailbox is full. `mailbox_pressure()` reports utilization (0.0–1.0). `max_activations` caps total grains per silo.
- Deadlock detection -- Circular grain call chains (A calls B calls A) are detected and return `GrainError::DeadlockDetected` instead of hanging.
- Cancellation tokens -- Handlers can check `ctx.is_cancelled()` for cooperative shutdown during drain/rebalance.
- Silo lifecycle hooks -- `on_startup` / `on_shutdown` callbacks on `SiloBuilder`.
- Proc macros -- `#[grain]`, `#[message]`, `#[grain_handler]` eliminate boilerplate.
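The backpressure item above describes a fail-fast send and a utilization reading between 0.0 and 1.0. A minimal sketch of that idea, assuming a bounded FIFO queue, is shown below. `BoundedMailbox`, `MailboxFull`, `try_send`, and `pressure` are hypothetical names for this example and are not Orlando's types.

```rust
use std::collections::VecDeque;

/// Error returned when a message is rejected because the queue is at capacity.
#[derive(Debug, PartialEq)]
struct MailboxFull;

/// Hypothetical bounded mailbox: a FIFO queue with a hard capacity.
struct BoundedMailbox<M> {
    queue: VecDeque<M>,
    capacity: usize,
}

impl<M> BoundedMailbox<M> {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        BoundedMailbox {
            queue: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Enqueue without waiting; reject immediately when the mailbox is full.
    fn try_send(&mut self, msg: M) -> Result<(), MailboxFull> {
        if self.queue.len() >= self.capacity {
            return Err(MailboxFull);
        }
        self.queue.push_back(msg);
        Ok(())
    }

    /// Fraction of the capacity currently in use, from 0.0 (empty) to 1.0 (full).
    fn pressure(&self) -> f64 {
        self.queue.len() as f64 / self.capacity as f64
    }

    /// Dequeue the oldest message, freeing one slot.
    fn recv(&mut self) -> Option<M> {
        self.queue.pop_front()
    }
}
```

A caller that gets `Err(MailboxFull)` can shed load or retry later instead of queueing unbounded work, and a pressure reading close to 1.0 is an early signal to slow producers down.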
Persistence
- Automatic state persistence -- Grain state is loaded on activation and saved on deactivation via pluggable backends.
- Configurable persistence strategy -- `WriteOnDeactivate` (default), `WriteThrough` (save after every message), `WriteBack(Duration)` (periodic save).
- Transactional grains -- Automatic rollback on handler failure via `TransactionalGrainRef`.
- State versioning / migration -- `VersionedGrain` with migration chains (v0 -> v1 -> v2) for schema evolution.
- Event sourcing -- `JournaledGrain` appends events to a journal and replays them on activation. Automatic snapshots.
- Optimistic concurrency -- ETags on persisted state detect concurrent writes (`EtagMismatch` error).
- Backends -- `InMemoryStateStore`, `FileStateStore`, `SqliteStateStore`, `PostgresStateStore`, `RedisStateStore`.
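The optimistic concurrency item above can be illustrated with a toy in-memory store. This is a sketch under assumptions, not Orlando's store API: `ToyStore`, `StoreError`, and the choice of a numeric ETag that increments on every write (with 0 meaning "no record yet") are all made up for the example. The point is that a writer must present the ETag it last read, and a write based on a stale read is rejected.

```rust
use std::collections::HashMap;

/// Error for a write whose ETag no longer matches the stored record.
#[derive(Debug, PartialEq)]
enum StoreError {
    EtagMismatch { expected: u64, actual: u64 },
}

/// Hypothetical state store keeping (etag, bytes) per grain key.
#[derive(Default)]
struct ToyStore {
    records: HashMap<String, (u64, Vec<u8>)>,
}

impl ToyStore {
    /// Read the current ETag and state, if a record exists.
    fn load(&self, key: &str) -> Option<(u64, Vec<u8>)> {
        self.records.get(key).cloned()
    }

    /// Write `state` only if `expected_etag` matches the stored ETag.
    /// An ETag of 0 stands for "no record yet" in this sketch.
    fn save(&mut self, key: &str, expected_etag: u64, state: Vec<u8>) -> Result<u64, StoreError> {
        let actual = self.records.get(key).map(|(etag, _)| *etag).unwrap_or(0);
        if actual != expected_etag {
            return Err(StoreError::EtagMismatch {
                expected: expected_etag,
                actual,
            });
        }
        let next = actual + 1;
        self.records.insert(key.to_string(), (next, state));
        Ok(next)
    }
}
```

If two writers both load a record at ETag 1, the first save succeeds and moves the record to ETag 2; the second save still presents ETag 1 and fails, so it has to reload before trying again.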
Timers and Reminders
- Volatile timers -- Periodic messages into a grain's mailbox. Cancelled on deactivation or handle drop.
- Durable reminders -- Persisted schedules that survive restarts. `InMemoryReminderStore` and `SqliteReminderStore`.
Clustering
- gRPC transport -- Silo-to-silo communication via tonic. Dual encoding: bincode (internal) + protobuf (external clients).
- Consistent hashing -- Deterministic grain placement via FNV-1a hash ring with configurable virtual nodes.
- SWIM failure detection -- Suspicion protocol with direct pings, indirect pings, configurable timeouts, and gossip piggybacking.
- Automatic rebalancing -- Grains migrate gracefully on node join/leave (`on_deactivate` runs, state persists).
- Distributed grain directory -- Cluster-wide activation lookup prevents duplicate activations during ring transitions.
- Gateway forwarding -- Any silo can accept a grain call and route it to the correct owner transparently.
- Placement strategies -- `HashBasedPlacement` (default), `PreferLocalPlacement`, `RandomPlacement`. Per-grain hints via `#[grain(placement = "prefer_local")]`.
- Message versioning -- Versioned message types for safe rolling deploys across silos.
- Retry policy -- Configurable exponential backoff on transient remote call failures. Application errors are never retried.
- TLS and authentication -- `ServerTlsConfig` / `ClientTlsConfig` for encrypted transport. Pluggable `ClusterAuth` trait with `SharedSecretAuth` included.
- Service discovery -- `MembershipProvider` trait with `StaticSeedProvider` and `DnsMembershipProvider` (Kubernetes headless services).
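The consistent hashing item above names an FNV-1a hash ring with virtual nodes. The sketch below shows how such a ring can be built; it is an illustration under assumptions, not Orlando's code. The `HashRing` type, the `silo#index` labelling of virtual nodes, and the `type/key` string that gets hashed are all hypothetical choices. Only the 64-bit FNV-1a constants are standard.

```rust
use std::collections::BTreeMap;

/// 64-bit FNV-1a over a byte slice (standard offset basis and prime).
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical hash ring: each silo owns `vnodes` points on a u64 circle.
struct HashRing {
    points: BTreeMap<u64, String>,
    vnodes: u32,
}

impl HashRing {
    fn new(vnodes: u32) -> Self {
        HashRing {
            points: BTreeMap::new(),
            vnodes,
        }
    }

    /// Place `vnodes` points for this silo on the ring.
    fn add_silo(&mut self, silo: &str) {
        for i in 0..self.vnodes {
            let point = fnv1a_64(format!("{}#{}", silo, i).as_bytes());
            self.points.insert(point, silo.to_string());
        }
    }

    /// Remove every point that belongs to this silo.
    fn remove_silo(&mut self, silo: &str) {
        self.points.retain(|_, owner| owner.as_str() != silo);
    }

    /// Owner of a grain: the first ring point at or after the grain's hash,
    /// wrapping around to the smallest point when none is found.
    fn owner(&self, grain_type: &str, key: &str) -> Option<&str> {
        let point = fnv1a_64(format!("{}/{}", grain_type, key).as_bytes());
        self.points
            .range(point..)
            .next()
            .or_else(|| self.points.iter().next())
            .map(|(_, silo)| silo.as_str())
    }
}
```

Because a lookup depends only on the ring contents, every silo that sees the same membership computes the same owner. Removing a silo moves only the grains whose nearest point belonged to that silo; all other grains keep their owner.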
Multi-Cluster
- Global Single Instance (GSI) -- Cross-cluster directory ensures one activation per grain globally. Epoch-based CAS fencing prevents split-brain.
- Cross-cluster forwarding -- Grain calls are transparently routed to the owning cluster via gRPC gateway.
- Replication -- Primary streams state to secondaries via `ReplicationLog` + `ReplicationSink`. `ReplicaStore` serves stale reads within a configurable staleness window.
- Failover -- `FailoverManager` monitors peer health and promotes grains via epoch increment on cluster failure. Graceful drain notifications skip the grace period.
- Data residency -- Pin grain types to specific clusters. The transport layer enforces constraints automatically.
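Epoch-based CAS fencing, mentioned in the GSI and failover items above, can be shown with a small in-memory directory. Everything named here (`ToyGlobalDirectory`, `Ownership`, `FenceError`, `claim`, `accepts_write`) is hypothetical and exists only for this sketch; a real cross-cluster directory would sit behind a shared, consistent store. The idea is that ownership changes only through a compare-and-set on the epoch, and anything carrying an older epoch is refused.

```rust
use std::collections::HashMap;

/// Which cluster currently owns a grain, and at which epoch.
#[derive(Clone, Debug, PartialEq)]
struct Ownership {
    cluster: String,
    epoch: u64,
}

/// Returned when a claim is based on an outdated view of the directory.
#[derive(Debug, PartialEq)]
enum FenceError {
    StaleEpoch { current: u64 },
}

/// Hypothetical cross-cluster directory (in memory for the sketch).
#[derive(Default)]
struct ToyGlobalDirectory {
    owners: HashMap<String, Ownership>,
}

impl ToyGlobalDirectory {
    /// Compare-and-set: the claim succeeds only if `expected_epoch` equals the
    /// epoch currently stored (0 when the grain has no owner yet).
    fn claim(&mut self, grain: &str, cluster: &str, expected_epoch: u64) -> Result<u64, FenceError> {
        let current = self.owners.get(grain).map(|o| o.epoch).unwrap_or(0);
        if current != expected_epoch {
            return Err(FenceError::StaleEpoch { current });
        }
        let next = current + 1;
        self.owners.insert(
            grain.to_string(),
            Ownership {
                cluster: cluster.to_string(),
                epoch: next,
            },
        );
        Ok(next)
    }

    /// A write is accepted only from the current owner at the current epoch,
    /// which fences out a primary that was replaced during failover.
    fn accepts_write(&self, grain: &str, cluster: &str, epoch: u64) -> bool {
        self.owners
            .get(grain)
            .map_or(false, |o| o.cluster == cluster && o.epoch == epoch)
    }
}
```

After a failover from one cluster to another, the old primary may still believe it owns the grain at the old epoch. Its writes fail the `accepts_write` check, and two clusters racing to promote the same grain cannot both win the `claim`.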
Observability
- Metrics -- `MetricsFilter` records `calls_total`, `call_duration_seconds`, `errors_total` per grain type, plus an `activations_active` gauge. Uses the `metrics` crate (backend-agnostic: wire in Prometheus, Datadog, etc.).
- Structured tracing -- Every activation, deactivation, message dispatch, and failure is logged via `tracing`.
- Health endpoints -- `ClusterSiloBuilder::health_port(p)` exposes `GET /healthz` (liveness) and `GET /readyz` (readiness, with optional store probe) for Kubernetes probes.
Prometheus exporter example
A runnable example wires MetricsFilter to a Prometheus scrape endpoint:
# in another shell:
|
It installs metrics-exporter-prometheus as the global recorder, builds a Silo with MetricsFilter, and drives a counter grain to emit orlando_grain_* series.
External Clients
- Client SDK -- `orlando-client` crate for non-silo processes. Connects to any silo, discovers the cluster, routes via local hash ring. Typed (bincode) and untyped (protobuf) message support. Automatic retry with membership refresh on stale ring.
Crate Layout
| Crate | Purpose |
|---|---|
| `orlando-core` | Grain/Message/GrainHandler traits, mailbox loop, filters, observers, streams, request context, cancellation |
| `orlando-runtime` | Silo, grain directory, activation management, metrics filter, lifecycle hooks |
| `orlando-macros` | `#[grain]`, `#[message]`, `#[grain_handler]` proc macros |
| `orlando-persistence` | Persistent/transactional/versioned/journaled grains, state stores, ETags |
| `orlando-timers` | Volatile timers and durable reminders |
| `orlando-cluster` | Multi-silo clustering, gRPC transport, SWIM, placement, TLS, auth, retry, discovery, multi-cluster geo-replication |
| `orlando-client` | External client SDK for non-silo processes |
Quick Start
use GrainContext;
use ;
use Silo;
;
;
async
async
async
Persistence
use ;
let store = new.await?;
let silo = builder.store.build;
// Write-on-deactivate (default)
let counter = silo.;
// Write-through (save after every message)
let counter = silo.;
Available backends: InMemoryStateStore, FileStateStore, SqliteStateStore, PostgresStateStore, RedisStateStore.
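The three persistence strategies differ only in when a save is triggered. The decision can be sketched as a small pure function. This is an illustration under assumptions, not Orlando's scheduler: `Strategy` mirrors the strategy names from this README, while `Moment`, `should_save`, and the `dirty` flag are hypothetical. The sketch also simplifies write-back by checking elapsed time after each message instead of running a background timer.

```rust
use std::time::Duration;

/// Hypothetical mirror of the three strategies named in this README.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Strategy {
    WriteOnDeactivate,
    WriteThrough,
    WriteBack(Duration),
}

/// Points in a grain's life at which a save could happen.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Moment {
    AfterMessage,
    Deactivating,
}

/// Decide whether state should be written now.
/// `dirty` (an assumption of this sketch) means state changed since the last save.
fn should_save(strategy: Strategy, moment: Moment, since_last_save: Duration, dirty: bool) -> bool {
    if !dirty {
        return false; // nothing new to persist
    }
    match (strategy, moment) {
        // Every strategy flushes pending changes on deactivation.
        (_, Moment::Deactivating) => true,
        // Write-through saves after each handled message.
        (Strategy::WriteThrough, Moment::AfterMessage) => true,
        // Write-back saves once the configured interval has elapsed.
        (Strategy::WriteBack(interval), Moment::AfterMessage) => since_last_save >= interval,
        // Write-on-deactivate never saves between messages.
        (Strategy::WriteOnDeactivate, Moment::AfterMessage) => false,
    }
}
```

The trade-off is durability against write volume: write-through loses nothing on a crash but writes on every message, write-on-deactivate writes least but can lose everything since activation, and write-back bounds the loss to roughly one interval.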
Clustering
use ;
let silo = builder
.host
.port
.silo_id
.
.
.auth
.auth_token
.retry_policy
.build;
spawn;
silo.join_cluster.await?;
// Calls are transparently routed to the owning silo
let counter = silo.;
counter.ask.await?;
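The feature list describes the retry policy as exponential backoff on transient remote call failures, with application errors never retried. A minimal sketch of that behaviour follows. `RetryPolicy`, its fields, `CallError`, and `call_with_retry` are hypothetical names for this example and do not reflect the real builder's parameters; the sleep function is injected so the logic can be exercised without waiting.

```rust
use std::time::Duration;

/// Two failure classes: only `Transient` is worth retrying.
#[derive(Debug, PartialEq)]
enum CallError {
    Transient(String),
    Application(String),
}

/// Hypothetical policy: delay = base_delay * factor^retry, capped at max_delay.
struct RetryPolicy {
    base_delay: Duration,
    factor: u32,
    max_delay: Duration,
    max_attempts: u32,
}

impl RetryPolicy {
    /// Delay before retry number `retry` (0 = first retry).
    fn delay_for(&self, retry: u32) -> Duration {
        let multiplier = self.factor.saturating_pow(retry);
        self.base_delay
            .saturating_mul(multiplier)
            .min(self.max_delay)
    }
}

/// Run `call` until it succeeds, fails with an application error,
/// or the attempt budget is used up. `call` receives the attempt index.
fn call_with_retry<T>(
    policy: &RetryPolicy,
    mut call: impl FnMut(u32) -> Result<T, CallError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, CallError> {
    let mut attempt: u32 = 0;
    loop {
        match call(attempt) {
            // Transient failure with attempts left: back off, then try again.
            Err(CallError::Transient(_)) if attempt + 1 < policy.max_attempts => {
                sleep(policy.delay_for(attempt));
                attempt += 1;
            }
            // Success, application error, or budget exhausted: return as is.
            outcome => return outcome,
        }
    }
}
```

With a base delay of 100 ms, a factor of 2, and a cap of 1 s, the delays run 100 ms, 200 ms, 400 ms, 800 ms, and then stay at 1 s. In production code, `std::thread::sleep` or an async timer would take the place of the injected `sleep` closure.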
External Clients
use OrlandoClient;
let client = connect.await?;
let counter = client.grain;
// Typed (Rust clients sharing message types)
let result: i64 = counter.ask.await?;
// Untyped (any language via protobuf)
let response_bytes = counter.ask_proto.await?;
Examples
Multi-Cluster / Geo-Replication
use ;
let multi_cluster = new
.peer;
let silo = builder
.host
.port
.silo_id
.multi_cluster
.failover_config
.
.build;
- Global Single Instance (GSI) -- One activation per grain across all clusters. Cross-cluster directory tracks ownership with epoch-based fencing.
- Cross-cluster forwarding -- Any cluster can accept a grain call and forward it to the owning cluster via gRPC gateway.
- Epoch-based failover -- When a cluster becomes unreachable, `FailoverManager` promotes grains to healthy clusters via CAS with monotonically increasing epochs. Stale primaries are fenced out.
- Replication -- Primary clusters stream grain state to secondaries via `ReplicationLog`. Secondaries maintain a `ReplicaStore` for serving stale reads within a configurable staleness window.
- Data residency -- Pin grain types to specific clusters via `#[grain(allowed_clusters = &["eu-west"])]`. Enforced at the transport layer -- requests are forwarded to allowed clusters automatically.
- Graceful drain -- `DrainNotification` skips failover grace periods during planned shutdowns.
Orleans Feature Comparison
| Feature | Orleans | Orlando | Notes |
|---|---|---|---|
| Virtual actor model | Yes | Yes | Grains with identity-based addressing |
| Turn-based execution | Yes | Yes | Single-threaded mailbox loop |
| Reentrancy | At await points | Concurrent dispatch | Different model, same goal |
| Stateless workers | Yes | Yes | Pooled activations, round-robin |
| Persistent state | Yes | Yes | Pluggable backends, write-through/write-back |
| Transactional state | Yes | Single-grain | No distributed transactions |
| State versioning | Yes | Yes | Migration chains |
| Event sourcing | JournaledGrain | JournaledGrain | Append events, replay, snapshots |
| Timers | Yes | Yes | Volatile timers |
| Reminders | Yes | Yes | Durable, persist to SQLite |
| Observers / pub-sub | Yes | Yes | ObserverSet, fire-and-forget |
| Streaming | Yes | Yes | StreamProducer/StreamItem |
| Clustering | Yes | Yes | gRPC, consistent hashing, SWIM |
| Grain directory | Distributed | Cluster-wide lookup | Prevents duplicate activations |
| Failure detection | Yes | Yes | SWIM with suspicion, indirect pings |
| Placement strategies | Yes | Yes | Hash, prefer-local, random, per-grain hints |
| Gateway/forwarding | Yes | Yes | Any silo routes to owner |
| Call filters | Yes | Yes | Before/after interceptors |
| Request context | Yes | Yes | Cross-silo propagation |
| Deadlock detection | Yes | Yes | Call chain tracking |
| Metrics | Dashboard | metrics crate | No built-in dashboard |
| TLS | Yes | Yes | mTLS, server/client certs |
| Authentication | Yes | Yes | Pluggable trait, shared-secret included |
| Service discovery | Azure, K8s, Consul | DNS, static seeds | No cloud-specific providers yet |
| Retry policies | Yes | Yes | Exponential backoff, transient-only |
| Client SDK | Orleans.Client | orlando-client | Typed + protobuf |
| Multi-cluster | Yes | Yes | GSI directory, cross-cluster forwarding, epoch fencing |
| Geo-replication | Yes | Yes | Replication log, replica store, stale reads |
| Failover | Yes | Yes | Epoch-based CAS promotion, graceful drain |
| Data residency | Yes | Yes | Per-grain cluster pinning, transport-enforced |
| Grain extensions | Yes | No | |
| Distributed transactions | Yes | No | Single-grain only |
| Streaming providers | Kafka, EventHub | In-process only | No external stream adapters yet |
| Dashboard UI | Yes | No | Use Prometheus + Grafana |
Testing
License
MIT