Noria: a high-performance web applications backend
Noria is a new streaming data-flow system, based on this paper from OSDI'18, designed to act as a fast storage backend for read-heavy web applications. It acts like a database, but pre-computes and caches relational query results so that reads are blazingly fast. Noria automatically keeps cached results up-to-date as the underlying data, stored in persistent base tables, changes. Noria uses partially-stateful data-flow to reduce memory overhead, and supports dynamic, runtime data-flow and query changes.
Noria comes with a MySQL adapter that implements the binary MySQL protocol. This lets any application that currently talks to MySQL or MariaDB switch to Noria with minimal effort. For example, running a Lobsters-like workload that issues the same SQL queries as the real Lobsters website, Noria improves the supported throughput by up to 5x:
At a high level, Noria takes a set of parameterized SQL queries (think prepared statements), and produces a data-flow program that maintains materialized views for the output of those queries. Reads now become fast lookups directly into these materialized views, as if the value had been directly cached in memcached. The views are then kept up-to-date incrementally through the data-flow, which yields high write throughput.
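The incremental-maintenance idea can be illustrated with a toy sketch in plain Rust. This is not Noria's actual code, and the names (VoteCountView, insert_vote) are made up for illustration: the point is that the write path updates the cached aggregate, so the read path is a plain lookup rather than a query re-execution.

```rust
use std::collections::HashMap;

/// A toy "materialized view" for a query like
/// `SELECT story_id, COUNT(*) FROM votes GROUP BY story_id`.
/// Noria maintains views like this incrementally through its data-flow graph;
/// this sketch only captures the idea, not the implementation.
struct VoteCountView {
    counts: HashMap<u64, u64>, // story_id -> vote count
}

impl VoteCountView {
    fn new() -> Self {
        VoteCountView { counts: HashMap::new() }
    }

    /// Write path: recording a vote also updates the view,
    /// so the cached result never goes stale.
    fn insert_vote(&mut self, story_id: u64) {
        *self.counts.entry(story_id).or_insert(0) += 1;
    }

    /// Read path: a plain lookup, as if the value were cached in memcached.
    fn vote_count(&self, story_id: u64) -> u64 {
        self.counts.get(&story_id).copied().unwrap_or(0)
    }
}

fn main() {
    let mut view = VoteCountView::new();
    view.insert_vote(42);
    view.insert_vote(42);
    view.insert_vote(7);
    println!("story 42 has {} votes", view.vote_count(42)); // prints "story 42 has 2 votes"
}
```

Noria generalizes this pattern to arbitrary joins and aggregations by compiling the SQL into a graph of data-flow operators, each of which updates its (possibly partial) state as writes stream through.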
Running Noria
You need nightly Rust to run this code. This will be arranged for automatically if you're using rustup.rs.
You build the Noria library and its associated worker binary with
$ cargo build --release
To start a long-running Noria worker, ensure that ZooKeeper is running, and then run:
$ cargo r --release --bin souplet -- --deployment myapp --no-reuse --address 172.16.0.19 --shards 0
myapp here is a deployment name. Many Noria workers can operate in a single deployment at the same time, and will share the workload between them. Workers in the same deployment automatically elect a leader and discover each other via ZooKeeper.
Once the Noria worker is running, you can discover its REST API port through ZooKeeper via this command:
$ cargo run --manifest-path=consensus/Cargo.toml --bin zk-util -- \
--show --deployment testing
| grep external | cut -d' ' -f4
A basic graphical UI runs at http://IP:PORT/graph.html and shows the running data-flow graph.
You can now start the MySQL adapter, and it should automatically locate the running worker through ZooKeeper (use -z if ZooKeeper is not running on localhost:2181). You should then be able to point your application at localhost:3306 to send queries to Noria. If your application crashes, this is a bug, and we would appreciate it if you open an issue. You may also want to try disabling automatic query reuse (with --no-reuse) or sharding (with --shards 0) in case those are misbehaving.
You can manually inspect the database using the mysql CLI, or by using the Noria web interface.
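Through the adapter, inspection looks like talking to any other MySQL server. A hypothetical session might look like the following (the ArticleWithVoteCount view and its columns are illustrative names, not part of any fixed schema):

```sql
-- Reads against a parameterized view become lookups into Noria's materialized state.
SELECT * FROM ArticleWithVoteCount WHERE aid = 1;

-- Writes go into base tables and propagate through the data-flow to the views.
INSERT INTO Vote (aid, uid) VALUES (1, 1001);
```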
You can also run the self-contained Noria example, which does not require ZooKeeper:
$ cargo run --example basic-recipe
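For a sense of what Noria is given as input, a "recipe" bundles base table definitions and named queries; parameterized queries (marked QUERY) become externally readable views. The fragment below is a rough sketch in the spirit of the basic-recipe example, not a verbatim copy of it:

```sql
-- Base tables: writes land here and are persisted.
CREATE TABLE Article (aid int, title varchar(255), url text, PRIMARY KEY(aid));
CREATE TABLE Vote (aid int, uid int);

-- Internal view: vote counts, maintained incrementally as votes arrive.
VoteCount: SELECT Vote.aid, COUNT(uid) AS votes FROM Vote GROUP BY Vote.aid;

-- Externally readable, parameterized view; the ? is the lookup key
-- supplied at read time, like a prepared-statement parameter.
QUERY ArticleWithVoteCount:
    SELECT Article.aid, title, url, VoteCount.votes AS votes
    FROM Article LEFT JOIN VoteCount ON (Article.aid = VoteCount.aid)
    WHERE Article.aid = ?;
```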
Noria development
Noria is a large piece of software that spans many sub-crates and external tools (see links in the text above). Each sub-crate is responsible for a component of Noria's architecture, such as the external API (api), mapping SQL to data-flow (mir), and executing data-flow operators (dataflow). The code in src/ is the glue that ties these pieces together by establishing materializations, scheduling data-flow work, orchestrating Noria program changes, handling failovers, etc.
src/lib.rs has a pretty extensive comment at the top of it that goes through how the Noria internals fit together at an implementation level. While it occasionally lags behind, especially following larger changes, it should serve to get you familiarized with the basic building blocks relatively quickly.
The sub-crates each serve a distinct role:
api/: everything that an external program communicating with Noria needs. This includes types used in RPCs as arguments/return types, as well as code for discovering Noria workers through ZooKeeper, establishing a connection to Noria through ZooKeeper, and invoking the various RPCs exposed by the Noria controller (src/controller/inner.rs).

basics/: core data structures and types used throughout Noria, including DataType (Noria's "value" type), node addresses, base table operations, etc.

benchmarks/: various Noria benchmarks, one in each folder. These will likely move into noria-benchmarks in the near future. The most frequently used one is vote, which runs the vote benchmark from §8.2 of the OSDI paper. You can run it in a bunch of different ways (--help should be useful), and with a bunch of different backends. The localsoup backend is the one that's easiest to get up and running with.

channel/: a wrapper around TCP channels that Noria uses to communicate between clients and servers, and inside the data-flow graph. At this point, this is mostly a thin wrapper around async-bincode, and it might go away in the long run.

consensus/: code for interacting with ZooKeeper to determine which Noria worker acts as the controller, and for detecting failed controllers which necessitate a controller changeover.

dataflow/: the code that implements the internals of the data-flow graph. This includes implementations of the different operators (ops/), "special" operators like leaf views and sharders (node/special/), implementations of view storage (state/), and the code that coordinates execution of control, data, and backfill messages within a thread domain (domain/).

mir/: the code that implements Noria's SQL-to-dataflow mapping. This includes resolving columns and keys, creating data-flow operators, detecting reuse opportunities, and triggering migrations to make changes after new SQL queries have been added. @ms705 is the primary author of this particular sub-crate, and it builds largely upon nom-sql.

src/: the "high-level" components of Noria such as RPC handling, domain scheduling, connection management, and all the controller operations (listening for heartbeats, handling failed workers, etc.).
To run the test suite, use:
$ cargo test
Build and open the documentation with:
$ cargo doc --open