# lucidity

A distributed execution engine built upon lunatic.
## Motivation

Basically, lunatic by itself is a set of "low-level" features: runtime, syscalls, and language wrappers. However, the `Process` architecture is a bit harder to use when trying to keep code readable. This library provides a proc macro, and, eventually, some helpers for common platforms like fly.io, to make it easier to write distributed code on top of the excellent lunatic runtime.
## Example

Here is a simple example:
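```rust
// Illustrative job; the exact signature is a stand-in.
#[lucidity::job]
fn square(x: u32) -> u32 {
    x * x
}

fn main() {
    // `square_remote` is generated by the macro (see below): it runs `square`
    // in a Process on a random distributed node and blocks until it returns.
    let squared = square_remote(4);
    println!("4 squared is {squared}");
}
```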
For each function you place the proc macro (`lucidity::job`) on, we generate a few other methods:

- `{name}_local`: when called, spawns the function in a node-local `Process`, and blocks the calling `Process`.
- `{name}_remote`: when called, spawns the function in a `Process` on a random distributed node, and blocks the calling `Process`.
- `{name}_local_async`: when called, spawns the function in a node-local `Process`, handing back a wrapped reference to the `Process`, which can be polled or blocked upon.
- `{name}_remote_async`: when called, spawns the function in a `Process` on a random distributed node, handing back a wrapped reference to the `Process`, which can be polled or blocked upon.
- `{name}_remote_fanout`: takes a `Vec` of argument tuples and round-robin distributes calls to the function with those arguments, polling all of the `Process`es and blocking until all are complete, returning a `Vec` of the results.
The above example uses the `lucidity::job` proc macro to generate a few of those functions, and they can be "called" like any other function. The goal here is to use the excellent architecture of lunatic while cutting down on the boilerplate required to write distributed code correctly: setting up the `Process`es, the `Mailbox`es, and so on is all handled for you.

The tradeoff is that this library is opinionated about how you write your code and what you can do with it (open to suggestions, though). In addition, this library introduces simple timeout-based wait loops to avoid possible deadlocks, which adds some overhead.
## Library Usage

First, install lunatic:
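```bash
# Installs the lunatic runtime and CLI via Cargo.
cargo install lunatic-runtime
```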
Add this to your Cargo.toml:
```toml
[dependencies]
lucidity = "*" # choose a version
```
In your .cargo/config.toml:
```toml
[build]
target = "wasm32-wasi"

[target.wasm32-wasi]
runner = "lunatic run"
```
## Distributed Setup
To use this library in a distributed setup, you will need to do a few things. This example could easily be used locally as well, by just using the loopback address.
First, you need to run the control node somewhere:
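```bash
# Illustrative; check `lunatic control --help` for the exact flags in your version.
lunatic control
```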
And, on any other machines where you want the remote methods to run, you will need to set up nodes that connect to it:
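```bash
# Illustrative; point each node at the control node's address, and check
# `lunatic node --help` for the exact flags in your version.
lunatic node <control-node-address>
```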
### Local Testing

For testing, you would then build your code and run it inside a lunatic node, something like this:

```bash
# Adjust the artifact path to match your crate name.
cargo build --release && lunatic run target/wasm32-wasi/release/<your_app>.wasm
```
### Production Setup
In a more production-like setup, you would probably use something like fly.io to deploy your code (use the `fly` feature), and you may want to build and run your code in a container. The easiest way is a simple Docker container that runs both the control node and the application node. Your entry point would look something like this.
NOTE: Due to UDP issues on fly.io, the "automatic" fly.io setup feature does not work, but will be enabled when I get the issues resolved.
```bash
#!/bin/bash
# Illustrative: background the control node, then run the application node.
lunatic control &
lunatic run target/wasm32-wasi/release/<your_app>.wasm
```
Then, within your built WASM, you would spawn some nodes that connect to the control node on other machines before running any of the distributed methods.
## lunatic Primer
This library is built on top of lunatic, so it is important to understand the basics of lunatic before using this library.
### Processes
lunatic is built around the concept of Processes. A Process is a lightweight thread of execution that is spawned
by the runtime. Each Process has its own stack, and is isolated from other Processes. Processes communicate
with each other via Mailboxes, which are essentially queues that can be used to send messages between Processes.
In the case of this library, you can totally use Processes directly, but the point of the lucidity library is to
make it easier to write distributed code, so we will focus on that.
### Mailboxes
Mailboxes are the primary way that Processes communicate with each other. A Mailbox is a queue used to send messages between Processes, and each Process has one through which it can receive messages. For the purposes of this library, you don't need to worry about Mailboxes, as they are handled for you. However, it is important to know that they exist, since the "syntactic sugar" provided by this library abstracts away these message queues.
This is not like "async Rust", or any other "async/await" style language. These Processes, and their Mailboxes, are more like the coroutines or goroutines of other languages.

As such, this library adds some overhead: it "feels" sort of like async Rust or blocking Rust, but it achieves that feel by using timeouts with wait loops. As this project is meant more for "fanning out" rigorous work to other nodes, this overhead is acceptable, but it is important to understand that this is not "async Rust".
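For illustration, using Processes and Mailboxes directly looks roughly like this (a sketch against lunatic's `Process::spawn`/`Mailbox` API; check the lunatic documentation for the exact signatures):

```rust
use lunatic::{Mailbox, Process};

#[lunatic::main]
fn main(mailbox: Mailbox<u32>) {
    // Grab a handle to this Process so the child can message us back.
    let parent = mailbox.this();

    // Spawn an isolated child Process; the only way it can share its
    // result is by sending a message to the parent's Mailbox.
    Process::spawn(parent, |parent, _: Mailbox<()>| {
        parent.send(3 * 3);
    });

    // Block on this Process's Mailbox until the child's message arrives.
    let squared = mailbox.receive();
    assert_eq!(squared, 9);
}
```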
### WASM
lunatic is built around WebAssembly (WASM), a binary format meant to be run in a sandboxed environment. lunatic scales so well to a distributed model because every node ships with a WASM runtime, and the WASM code it executes makes certain "runtime syscalls" for communication. The WASM abstracts away the machine code, so each node can function properly with just the WASM from another node.

Theoretically, multiple nodes could each be initialized with their own WASM, and the lunatic runtime would be able to spawn Processes on any of those nodes, as each node would send its WASM to the others.
### Remote Processes
Processes that are spawned remotely take advantage of the fact that your executable is WASM. Basically, lunatic sends a copy of your WASM executable to the remote node, and then spawns a Process there, essentially using function pointers to call the functions in your WASM executable. This is why you need to build your code as WASM, and why you need to run the control node and the application node with the same executable.
However, you don't need to worry about getting your code onto other nodes. The lunatic runtime handles this automatically.
This also means that your "bare" functions "just work". That function is in the WASM, so if a process calls that function,
it will be called on the node where the process is running since that node has the WASM.
Pretty cool, right?
## Examples

Let's look at a few examples to understand when you would use specific types of methods.

For all of these examples, assume we have declared the `square` function like this (signature illustrative):
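```rust
#[lucidity::job]
fn square(x: u32) -> u32 {
    x * x
}
```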
"No Process"
Even if you mark a function with the lucidity::job proc macro, you can still call it like a normal function.
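```rust
// A direct call: no Process is spawned and no messages are sent.
let n = square(3);
assert_eq!(n, 9);
```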
### Local / Remote Process

If you want to spawn the function in a `Process` and block on the result, use the `{name}_local` method (or `{name}_remote` for a random distributed node):
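```rust
// Spawns `square` in a node-local Process and blocks until it returns.
let local = square_local(3);

// Same, but the Process runs on a random distributed node.
let remote = square_remote(3);

assert_eq!(local, remote);
```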
### Local / Remote Async Process

If you want more fine-grained control over when to block, you can use the `{name}_local_async` and `{name}_remote_async` methods.
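For instance (a sketch; `get` is a hypothetical name for the blocking getter on the wrapped `Process` reference):

```rust
// Spawn without blocking; we get back a wrapped reference to the Process.
let job = square_remote_async(7);

// ... do other work while the remote Process runs ...

// Block until the result is ready (method name hypothetical).
let squared = job.get();
assert_eq!(squared, 49);
```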
### Remote Fanout

If you essentially want to do the same operation, but with different arguments, and you want to block on all of them, you can use the `{name}_remote_fanout` method:
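```rust
// One tuple of arguments per call; calls are distributed across nodes
// round-robin, and this blocks until every Process has completed.
let results = square_remote_fanout(vec![(1,), (2,), (3,), (4,)]);
assert_eq!(results, vec![1, 4, 9, 16]);
```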
## Job Attribute Options

The `lucidity::job` proc macro has a few options that can be used to customize the behavior of the generated methods. Generally, these do not need to be used, but they are available if you need them. A usage sketch follows the list below.
- `init_retry_interval_ms`: the number of milliseconds to wait between retries when trying to initialize a `Process`. Defaults to `100`.
- `sync_retry_interval_ms`: the number of milliseconds to wait between retries when trying to get a blocking result (e.g., `{name}_local` or `{name}_remote`) from a `Process`. Defaults to `100`.
- `async_init_retry_interval_ms`: the number of milliseconds to wait between retries when trying to initialize a `Process` asynchronously (e.g., `{name}_local_async` or `{name}_remote_async`). Defaults to `100`.
- `async_get_retry_interval_ms`: the number of milliseconds to wait between retries when trying to get a non-blocking result (e.g., `{name}_local_async` or `{name}_remote_async`) from a `Process`. Defaults to `100`.
- `async_set_retry_interval_ms`: the number of milliseconds to wait between retries when the execution `Process` attempts to set a non-blocking result (e.g., `{name}_local_async` or `{name}_remote_async`). Defaults to `100`.
- `shutdown_retry_interval_ms`: the number of milliseconds to wait between retries when trying to shut down a `Process`. Defaults to `100`.
- `memory`: the maximum memory allowed for the `Process`. Defaults to `100 * 1024 * 1024` (100 MB).
- `fuel`: the maximum fuel allowed for the `Process`. Defaults to `10` (each unit of fuel is approximately 100,000 WASM instructions).
- `fanout`: the scheme to use when fanning out. Defaults to `roundrobin`. The other option is `random`.
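For example (the exact attribute syntax here is illustrative; consult the crate docs):

```rust
#[lucidity::job(memory = 512 * 1024 * 1024, fuel = 20, fanout = "random")]
fn square(x: u32) -> u32 {
    x * x
}
```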
## Feature Flags

- `fly`: enables the `fly` feature, which allows you to use the fly.io platform to automatically set up nodes from the main lunatic node. This is not enabled by default, as it requires a fly.io account and a bit of setup. See the fly.io documentation for more information. NOTE: this functionality is also (currently) rendered useless by limitations with UDP on fly.io; I am working with the fly.io team in the forums to resolve this issue.
## Thanks

Special thanks to lunatic's authors and contributors for their excellent work, and special thanks to the primary author, bkolobara.
## License
MIT