# Scyros
A framework for designing sound, reproducible, and scalable repository mining studies on GitHub.
### Scyros is...
- 🧪 **Reproducibility-first**: declarative configuration and deterministic execution to enable repeatable experiments.
- 📈 **Scalable**: designed for large-scale repository mining studies on GitHub.
- 🧱 **Soundness-focused**: encourages transparent, bias-aware, and methodologically explicit study design.
- ⚙️ **Modular**: independent, reusable modules that can be composed into custom data-processing pipelines.
## Table of Contents
- [Installation](#installation)
- [Tutorial](#tutorial)
- [Usage](#usage)
- [Authentication and Rate Limits](#authentication-and-rate-limits)
- [Citing Scyros](#citing-scyros)
- [License](#license)
- [Change Log](#change-log)
## Installation
### Prebuilt binaries
Prebuilt binaries for macOS, Linux, and Windows are available on the project's [GitHub Releases page](https://github.com/fxpl/scyros/releases), along with installer scripts.
### Using a package manager
Scyros is available through several package managers.
#### Cargo
Scyros is published on [crates.io](https://crates.io/crates/scyros) and can be installed with Cargo:
```bash
cargo install scyros
```
#### Nix
If you use Nix with flakes enabled, you can install Scyros directly from GitHub:
```bash
nix profile install github:fxpl/scyros
```
### Build from source
Install Rust (version 1.94 or newer) by following the instructions on the [official website](https://rust-lang.org/tools/install/).
Then clone the repository and build:
```bash
git clone git@github.com:fxpl/scyros.git
cd scyros
cargo build --release
```
The binary is produced at `target/release/scyros`. You can optionally move it to a directory in your PATH for easier access.
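As a minimal sketch of that optional step, the snippet below copies the binary into a per-user directory. It assumes `~/.local/bin` is on your `PATH`; the `BIN_DIR` variable is a name introduced here for illustration, not something Scyros reads.

```bash
# Assumption: ~/.local/bin is on your PATH; set BIN_DIR to use another directory.
BIN_DIR="${BIN_DIR:-$HOME/.local/bin}"
mkdir -p "$BIN_DIR"

# Copy the release binary only if the build step above has produced it.
if [ -x target/release/scyros ]; then
  install -m 755 target/release/scyros "$BIN_DIR/scyros"
else
  echo "target/release/scyros not found; run 'cargo build --release' first" >&2
fi
```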
## Tutorial
If you'd like to see how to use Scyros in practice, check out the [interactive tutorial](https://github.com/fxpl/scyros-tutorial)!
## Usage
To discover available commands and modules:
```bash
scyros --help
```
Each module provides its own usage documentation. For example, to inspect the module used to sample random repositories from GitHub:
```bash
scyros ids --help
```
## Authentication and Rate Limits
Some modules interact with the GitHub API and require personal access tokens (PATs). Tokens can be created by following [GitHub’s documentation](https://docs.github.com/en/github/authenticating-to-github/creating-a-personal-access-token).
⚠️ Never commit or share your tokens publicly.
Tokens must be provided as a CSV file passed via a command-line argument. The file must contain a single column named `token`, with one token per line:
```csv
token
fa56454....
hj73647....
```
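One possible way to create such a file from the shell is sketched below. The file name `tokens.csv` and the placeholder values are assumptions made for illustration; check each module's `--help` output for the argument that takes the file path.

```bash
# Hypothetical file name; any path works as long as you pass it to the module.
TOKENS_FILE="tokens.csv"

# Create an empty file and restrict it to the current user before writing secrets.
: > "$TOKENS_FILE"
chmod 600 "$TOKENS_FILE"

# Write the required header, then one token per line.
# The values below are placeholders, not real tokens.
printf 'token\n' >> "$TOKENS_FILE"
printf '%s\n' "<your-first-token>" "<your-second-token>" >> "$TOKENS_FILE"

# Sanity check: the first line must be exactly "token".
head -n 1 "$TOKENS_FILE" | grep -qx 'token' && echo "header ok"
```

If the file lives inside a Git working tree, adding its name to `.gitignore` helps keep it from being committed by accident.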
GitHub enforces API rate limits. Using multiple tokens from the same account does not increase these limits. Users are expected to comply with GitHub’s API terms and rate-limit policies:
- [Rate limits for the REST API](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28)
- [Terms of Service](https://docs.github.com/en/site-policy/github-terms/github-terms-of-service)
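To see how much quota a token has left before starting a long run, you can query GitHub's REST API `/rate_limit` endpoint directly. The helper below is a sketch that is independent of Scyros: `check_rate_limit` is a hypothetical function name, and it assumes `curl` is installed.

```bash
# Print the current rate-limit status for one token as JSON.
# Usage: check_rate_limit <token>
check_rate_limit() {
  curl -sS \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer $1" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    https://api.github.com/rate_limit
}

# Example (performs a network request), using the first token in tokens.csv:
#   check_rate_limit "$(sed -n '2p' tokens.csv)"
```

In the JSON response, `resources.core.remaining` gives the number of REST requests left in the current window and `resources.core.reset` gives the time at which the window resets, as a Unix timestamp.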
## Citing Scyros
Scyros is introduced and described in the following large-scale empirical study. If you use Scyros in academic work, please cite:
```bibtex
@article{gilot2026largescalestudyfloatingpointusage,
author = {Gilot, Andrea and Wrigstad, Tobias and Darulova, Eva},
title = {Floating-Point Usage on GitHub: A Large-Scale Study of Statically Typed Languages},
year = {2026},
issue_date = {April 2026},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {10},
number = {OOPSLA1},
url = {https://doi.org/10.1145/3798203},
doi = {10.1145/3798203},
journal = {Proc. ACM Program. Lang.},
month = apr,
articleno = {95},
numpages = {28},
keywords = {floating-point arithmetic, large-scale code analysis, repository mining}
}
```
> Andrea Gilot, Tobias Wrigstad, and Eva Darulova. 2026. Floating-Point Usage on GitHub: A Large-Scale Study of Statically Typed Languages. Proc. ACM Program. Lang. 10, OOPSLA1, Article 95 (April 2026), 28 pages. https://doi.org/10.1145/3798203
## License
This project is licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.
## Change Log
See [CHANGELOG.md](CHANGELOG.md) for a detailed list of changes and updates.