# Distributed sccache
Background:
- You should read about JSON Web Tokens - https://jwt.io/.
- HS256 in short: you can sign a piece of (typically unencrypted)
data with a key. Verification involves signing the data again
with the same key and comparing the result. As a result, if you
want two parties to verify each other's messages, the key must be
shared beforehand.
- The 'secure tokens' referenced below should be generated with a CSPRNG
(your OS random number generator should suffice).
For example, on Linux this is accessible with: `openssl rand -hex 64`.
- When relying on random number generators (for generating keys or
tokens), be aware that a lack of entropy is possible in cloud or
virtualized environments in some scenarios.
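
To make the HS256 point above concrete, here is a minimal Python sketch (not sccache code) that signs a message with a shared key using HMAC-SHA256 and verifies it by signing again and comparing. The message and the names `sign`, `verify` and `shared_key` are made up for illustration; the key is drawn from the OS CSPRNG, comparable to `openssl rand -hex 64`.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, data: bytes) -> bytes:
    """Return the HMAC-SHA256 signature of `data` under `key`."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, signature: bytes) -> bool:
    """Sign `data` again with the same key and compare in constant time."""
    return hmac.compare_digest(sign(key, data), signature)


# 64 random bytes from the OS CSPRNG, hex-encoded (128 hex characters).
shared_key = secrets.token_hex(64).encode()

message = b"hello from the scheduler"  # illustrative payload
signature = sign(shared_key, message)
```

Both parties need `shared_key` to call `verify`, which is why the key has to be distributed beforehand.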
## Overview
Distributed sccache consists of three parts:
- the client, an sccache binary that wishes to perform a compilation on
remote machines
- the scheduler (`sccache-dist` binary), responsible for deciding where
a compilation job should run
- the server (`sccache-dist` binary), responsible for actually executing
a build
All servers must run 64-bit Linux or FreeBSD. Clients
may request compilation from Linux, Windows or macOS. Linux compilations will
attempt to automatically package the compiler in use, while Windows and macOS
users will need to specify a toolchain for cross-compilation ahead of time.
## Communication
The HTTP implementation of sccache has the following API, where all HTTP body content is encoded using [`bincode`](http://docs.rs/bincode):
- scheduler
  - `POST /api/v1/scheduler/alloc_job`
    - Called by a client to submit a compilation request.
    - Returns information on where the job has been allocated and should run.
  - `GET /api/v1/scheduler/server_certificate`
    - Called by a client to retrieve the (dynamically created) HTTPS
      certificate for a server, for use in communication with that server.
    - Returns a digest and PEM for the temporary server HTTPS certificate.
  - `POST /api/v1/scheduler/heartbeat_server`
    - Called (repeatedly) by servers to register as available for jobs.
  - `POST /api/v1/scheduler/job_state`
    - Called by servers to inform the scheduler of the state of the job.
  - `GET /api/v1/scheduler/status`
    - Returns information about the scheduler.
- server
  - `POST /api/v1/distserver/assign_job`
    - Called by the scheduler to inform of a new job being assigned to this server.
    - Returns whether the toolchain is already on the server or needs submitting.
  - `POST /api/v1/distserver/submit_toolchain`
    - Called by the client to submit a toolchain.
  - `POST /api/v1/distserver/run_job`
    - Called by the client to run a job.
    - Returns the compilation stdout along with files created.
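
The endpoint list above can also be read as a lookup table of who calls what. The Python sketch below only restates that list as data; the table names, the `request_line` helper and the example hostname are hypothetical, and the caller of `status` is marked `unspecified` because the list does not name one.

```python
from typing import Dict, Tuple

# name -> (HTTP method, path, caller), copied from the endpoint list.
SCHEDULER_API: Dict[str, Tuple[str, str, str]] = {
    "alloc_job": ("POST", "/api/v1/scheduler/alloc_job", "client"),
    "server_certificate": ("GET", "/api/v1/scheduler/server_certificate", "client"),
    "heartbeat_server": ("POST", "/api/v1/scheduler/heartbeat_server", "server"),
    "job_state": ("POST", "/api/v1/scheduler/job_state", "server"),
    "status": ("GET", "/api/v1/scheduler/status", "unspecified"),
}

SERVER_API: Dict[str, Tuple[str, str, str]] = {
    "assign_job": ("POST", "/api/v1/distserver/assign_job", "scheduler"),
    "submit_toolchain": ("POST", "/api/v1/distserver/submit_toolchain", "client"),
    "run_job": ("POST", "/api/v1/distserver/run_job", "client"),
}


def request_line(base_url: str, api: Dict[str, Tuple[str, str, str]], name: str) -> Tuple[str, str]:
    """Return the (method, full URL) pair for a named endpoint."""
    method, path, _caller = api[name]
    return method, base_url.rstrip("/") + path


# Hypothetical scheduler address, for illustration only.
method, url = request_line("https://scheduler.example.com", SCHEDULER_API, "alloc_job")
```

Read this way, a client talks to the scheduler only for `alloc_job` and `server_certificate`, and then talks directly to the allocated server for `submit_toolchain` and `run_job`.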
There are three axes of security in this setup:
1. Can the scheduler trust the servers?
2. Is the client permitted to submit and run jobs?
3. Can third parties see and/or modify traffic?
### Server Trust
If a server is malicious, it can return malicious compilation output to a user.
To protect against this, servers must be authenticated to the scheduler. You have three
means for doing this, and the scheduler and all servers must use the same mechanism.
Once a server has registered itself using the selected authentication, the scheduler
will trust the registered server address and use it for builds.
#### JWT HS256 (preferred)
This method uses a secret key to create a per-IP-and-port token for each server.
Acquiring a token will only allow participation as a server if the attacker can
additionally impersonate the IP and port the token was generated for.
You *must* keep the secret key safe.
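
The idea of a token bound to one address can be sketched in Python as follows. This is an illustration of the technique, not sccache's implementation: the claim name `server_id` and the function names are assumptions, and a real deployment should use `sccache-dist auth` to create tokens.

```python
import base64
import hashlib
import hmac
import json


def b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url by restoring the padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_server_token(secret_key: bytes, server_addr: str) -> str:
    """Create an HS256-signed token that names one IP and port."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    # "server_id" is a hypothetical claim name used only in this sketch.
    payload = b64url(json.dumps({"server_id": server_addr}).encode())
    sig = hmac.new(secret_key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(sig)}"


def check_server_token(secret_key: bytes, token: str, claimed_addr: str) -> bool:
    """Accept the token only if it is validly signed AND names `claimed_addr`."""
    try:
        header, payload, sig = token.split(".")
        expected = hmac.new(secret_key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig)):
            return False
        claims = json.loads(b64url_decode(payload))
    except ValueError:  # malformed token, bad base64 or bad JSON
        return False
    return claims.get("server_id") == claimed_addr
```

A stolen token fails `check_server_token` for any other address, which is why an attacker would also have to impersonate the IP and port it was issued for.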
*To use it*:
Create a scheduler key with `sccache-dist auth generate-jwt-hs256-key` (which will
use your OS random number generator) and put it in your scheduler config file as
follows:
```
server_auth = { type = "jwt_hs256", secret_key = "YOUR_KEY_HERE" }
```
Now generate a token for the server, giving the IP and port the scheduler and clients can
connect to the server on (address `192.168.1.10:10501` here):
```
sccache-dist auth generate-jwt-hs256-server-token \
--secret-key YOUR_KEY_HERE \
--server 192.168.1.10:10501
```
*or:*
```
sccache-dist auth generate-jwt-hs256-server-token \
--config /path/to/scheduler-config.toml \
--server 192.168.1.10:10501
```
This will output a token (you can examine it with https://jwt.io if you're
curious) that you should add to your server config file as follows:
```
scheduler_auth = { type = "jwt_token", token = "YOUR_TOKEN_HERE" }
```
Done!
#### Token
This method simply shares a token between the scheduler and all servers. A token
leak from anywhere allows any attacker to participate as a server.
*To use it*:
Choose a 'secure token' you can share between your scheduler and all servers.
Put the following in your scheduler config file:
```
server_auth = { type = "token", token = "YOUR_TOKEN_HERE" }
```
Put the following in your server config file:
```
scheduler_auth = { type = "token", token = "YOUR_TOKEN_HERE" }
```
Done!
#### Insecure (bad idea)
*This route is not recommended*
This method uses a hardcoded token that effectively disables authentication and
provides no security at all.
*To use it*:
Put the following in your scheduler config file:
```
server_auth = { type = "DANGEROUSLY_INSECURE" }
```
Put the following in your server config file:
```
scheduler_auth = { type = "DANGEROUSLY_INSECURE" }
```
Done!
### Client Trust
If a client is malicious, they can cause a DoS of distributed sccache servers or
explore ways to escape the build sandbox. To protect against this, clients must
be authenticated.
Each client will use an authentication token for the initial job allocation request
to the scheduler. A successful allocation will return a job token that is used
to authorise requests to the appropriate server for that specific job.
This job token is a JWT containing the job id, signed (HS256) with a server key.
The key for each server is randomly generated on server startup and given to
the scheduler during registration. This means that the server can verify users
without either a) adding client authentication to every server or b) needing
secret transfer between scheduler and server on every job allocation.
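
The flow described above can be sketched roughly as follows. This is a simplified Python model, not sccache code: it uses a bare HMAC-SHA256 tag instead of a full JWT, and the class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


def _tag(key: bytes, job_id: int) -> bytes:
    """HMAC-SHA256 over the job id (stand-in for a signed JWT)."""
    return hmac.new(key, str(job_id).encode(), hashlib.sha256).digest()


class Server:
    def __init__(self) -> None:
        # Fresh random key on every startup.
        self.job_key = secrets.token_bytes(32)

    def authorise(self, job_id: int, job_token: bytes) -> bool:
        """Check a client's job token without contacting the scheduler."""
        return hmac.compare_digest(_tag(self.job_key, job_id), job_token)


class Scheduler:
    def __init__(self) -> None:
        self.server_keys: Dict[str, bytes] = {}
        self.next_job_id = 0

    def register(self, server_id: str, job_key: bytes) -> None:
        """Called during server registration; stores that server's key."""
        self.server_keys[server_id] = job_key

    def alloc_job(self, server_id: str) -> Tuple[int, bytes]:
        """After authenticating the client, hand back a job id and job token."""
        job_id = self.next_job_id
        self.next_job_id += 1
        return job_id, _tag(self.server_keys[server_id], job_id)
```

In this model the key crosses the network once, at registration, and a server restart invalidates every job token signed with its previous key.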
#### OAuth2
This is a group of similar methods for achieving the same thing - the client
retrieves a token from an OAuth2 service and then submits it to the scheduler,
which has a few different options for performing validation on that token.
*To use it*:
Put one of the following settings in your scheduler config file to determine how
the scheduler will validate tokens from the client:
```
# Use the known settings for Mozilla OAuth2 token validation
client_auth = { type = "mozilla" }
# Forward the client's JWT to another URL in the `Bearer` header; a success
# response from that URL indicates the token is valid. The optional `cache_secs`
# sets how long (in seconds) to cache a successful authentication for.
client_auth = { type = "proxy_token", url = "...", cache_secs = 60 }
```
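The effect of `cache_secs` can be pictured with a small time-to-live cache. The Python sketch below is an assumption about the general technique rather than a description of the scheduler's internals; `CachingValidator` is a hypothetical name, and `validate` stands in for the request that forwards the token to the configured URL.

```python
import time
from typing import Callable, Dict


class CachingValidator:
    """Remember successful validations for `cache_secs` seconds."""

    def __init__(self, validate: Callable[[str], bool], cache_secs: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._validate = validate
        self._cache_secs = cache_secs
        self._clock = clock
        self._valid_until: Dict[str, float] = {}

    def is_valid(self, token: str) -> bool:
        now = self._clock()
        expiry = self._valid_until.get(token)
        if expiry is not None and now < expiry:
            return True  # cached success, no upstream request needed
        if self._validate(token):
            self._valid_until[token] = now + self._cache_secs
            return True
        # Failures are never cached, and a stale entry is dropped.
        self._valid_until.pop(token, None)
        return False
```

A longer cache period reduces load on the validation URL, at the cost of a revoked token being accepted for up to that many seconds.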
Additionally, each client should set up an OAuth2 configuration in the client config file with one of
the following settings (as appropriate for your OAuth service):
```
# Use the known settings for Mozilla OAuth2 authentication
auth = { type = "mozilla" }
# Use the Authorization Code with PKCE flow. This requires a client id,
# an initial authorize URL (which may have parameters like 'audience' depending
# on your service) and the URL for retrieving a token after the browser flow.
auth = { type = "oauth2_code_grant_pkce", client_id = "...", auth_url = "...", token_url = "..." }
# Use the Implicit flow (typically not recommended due to security issues). This requires
# a client id and an authorize URL (which may have parameters like 'audience' depending
# on your service).
auth = { type = "oauth2_implicit", client_id = "...", auth_url = "..." }
```
The client should then run `sccache --dist-auth` and follow the instructions to retrieve
a token. The token is cached locally automatically until it expires, after which the
client must authenticate again manually.
#### Token
This method simply shares a token between the scheduler and all clients. A token
leak from anywhere allows any attacker to participate as a client.
*To use it*:
Choose a 'secure token' you can share between your scheduler and all clients.
Put the following in your scheduler config file:
```
client_auth = { type = "token", token = "YOUR_TOKEN_HERE" }
```
Put the following in your client config file:
```
auth = { type = "token", token = "YOUR_TOKEN_HERE" }
```
Done!
#### Insecure (bad idea)
*This route is not recommended*
This method uses a hardcoded token that effectively disables authentication and
provides no security at all.
*To use it*:
Put the following in your scheduler config file:
```
client_auth = { type = "DANGEROUSLY_INSECURE" }
```
Remove any `auth =` setting under the `[dist]` heading in your client config file
(it will default to this insecure mode).
Done!
### Eavesdropping and Tampering Protection
If third parties can see traffic to the servers, source code can be leaked. If third
parties can modify traffic to and from the servers or the scheduler, they can cause
the client to receive malicious compiled objects.
Securing communication with the scheduler is the responsibility of the sccache cluster
administrator - it is recommended to put a webserver with an HTTPS certificate in front
of the scheduler and instruct clients to configure their `scheduler_url` with the
appropriate `https://` address. The scheduler will verify the server's IP in this
configuration by inspecting the `X-Real-IP` header's value, if present. The webserver
used in this case should be configured to set this header to the appropriate value.
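
The `X-Real-IP` check can be sketched as follows. This is a hedged Python illustration of the idea, not the scheduler's code: the function names are hypothetical, and only IPv4-style `ip:port` addresses are handled.

```python
import ipaddress
from typing import Mapping


def effective_ip(peer_ip: str, headers: Mapping[str, str]):
    """Prefer X-Real-IP (set by the fronting webserver) over the socket peer."""
    lowered = {name.lower(): value for name, value in headers.items()}
    real_ip = lowered.get("x-real-ip")
    return ipaddress.ip_address(real_ip.strip() if real_ip else peer_ip)


def heartbeat_ip_matches(claimed_addr: str, peer_ip: str,
                         headers: Mapping[str, str]) -> bool:
    """True if the server's claimed address has the IP the request came from."""
    claimed_host, _port = claimed_addr.rsplit(":", 1)
    return ipaddress.ip_address(claimed_host) == effective_ip(peer_ip, headers)
```

Behind a reverse proxy the socket peer is the proxy itself, so without the header every server would appear to come from the proxy's address. The header is only trustworthy if the webserver always sets it, overwriting any value supplied by the sender.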
Securing communication with the server is performed automatically - HTTPS certificates
are generated dynamically on server startup and communicated to the scheduler during
the heartbeat. If a client does not have the appropriate certificate for communicating
securely with a server (after receiving a job allocation from the scheduler), the
certificate will be requested from the scheduler.
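
The client-side behaviour described above amounts to a fetch-on-miss cache. The Python sketch below illustrates that pattern under stated assumptions: `CertificateCache` and `fetch` are hypothetical names, `fetch` stands in for the scheduler's `server_certificate` request, and treating the digest as SHA-256 over the PEM bytes is an assumption made for the example only.

```python
import hashlib
from typing import Callable, Dict, Tuple


class CertificateCache:
    """Keep server certificates locally and ask the scheduler only on a miss."""

    def __init__(self, fetch: Callable[[str], Tuple[bytes, bytes]]) -> None:
        # `fetch(server_id)` returns (digest, pem) as supplied by the scheduler.
        self._fetch = fetch
        self._certs: Dict[str, bytes] = {}

    def certificate_for(self, server_id: str) -> bytes:
        pem = self._certs.get(server_id)
        if pem is None:
            digest, pem = self._fetch(server_id)
            # Assumed digest scheme for this sketch: SHA-256 of the PEM bytes.
            if hashlib.sha256(pem).digest() != digest:
                raise ValueError("certificate does not match its digest")
            self._certs[server_id] = pem
        return pem
```

Because certificates are regenerated on server startup, a real client would also need to drop a cached entry when a server reappears with a new certificate; that step is left out here.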
# Building the Distributed Server Binaries
Until these binaries [are included in releases](https://github.com/mozilla/sccache/issues/393), I've put together a Docker container that can be used to easily build a release binary:
```
docker run -ti --rm -v $PWD:/sccache luser/sccache-musl-build:0.1 /bin/bash -c "cd /sccache; cargo build --release --target x86_64-unknown-linux-musl --features=dist-server && strip target/x86_64-unknown-linux-musl/release/sccache-dist && cd target/x86_64-unknown-linux-musl/release/ && tar czf sccache-dist.tar.gz sccache-dist"
```