# Toraniko: Institutional-Scale Risk Model
A high-performance Rust implementation of a characteristic factor model suitable for quantitative and systematic trading at institutional scale. In particular, it is a characteristic factor model in the same vein as Barra and Axioma (in fact, given the same datasets, it approximately reproduces Barra's estimated factor returns).
[](https://crates.io/crates/toraniko)
[](https://docs.rs/toraniko)
[](https://opensource.org/licenses/MIT)
## Overview
Using this library, you can create new custom factors and estimate their returns. Then you can estimate a factor covariance matrix suitable for portfolio optimization with factor exposure constraints (e.g. to maintain a market neutral portfolio).
The library supports market, sector and style factors; three styles are included: value, size and momentum. It also comes with generalizable math and data cleaning utility functions you'd want to have for constructing more style factors (or custom fundamental factors of any kind).
### Mathematical Model
The factor model decomposes asset returns as:
```text
r_asset = beta_market * r_market + sum(beta_sector * r_sector) + sum(beta_style * r_style) + epsilon
```
Where:
- `r_market` is the market factor return (cap-weighted average)
- `r_sector` are sector factor returns (constrained to sum to zero, Barra-style)
- `r_style` are style factor returns (momentum, value, size)
- `epsilon` is the idiosyncratic/residual return
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
toraniko = "0.1"
```
Or with specific features:
```toml
[dependencies]
toraniko = { version = "0.1", default-features = false, features = ["model"] }
factors = "0.1" # For style factor implementations
```
### Available Features
| `full` (default) | All features enabled |
| `primitives` | Core type definitions |
| `traits` | Trait abstractions (Factor, StyleFactor, ReturnsEstimator) |
| `math` | Mathematical operations (WLS, winsorization, weights) |
| `model` | Factor return estimation |
| `utils` | Data utilities (fill, smooth, rank) |
Note: Style factor implementations (Momentum, Size, Value) are now provided by the separate `factors` crate.
## Quick Start with Real Market Data
The `yahoo_finance` example demonstrates fetching real market data and running the full factor model pipeline:
```bash
cargo run --package toraniko --example yahoo_finance
```
This fetches ~500 days of daily data for 30 stocks across Technology, Healthcare, and Finance sectors directly from Yahoo Finance.
### Example Output (Real Data from Yahoo Finance)
```text
=== Toraniko Factor Model with Real Yahoo Finance Data ===
Fetching data from 2024-08-06 to 2025-12-23
AAPL - 347 quotes fetched
MSFT - 347 quotes fetched
GOOGL - 347 quotes fetched
META - 347 quotes fetched
...
Successfully fetched data for 30 stocks
```
**Returns DataFrame (real stock returns):**
```text
┌────────────┬────────┬────────────┬───────────────┐
│ date ┆ symbol ┆ sector ┆ asset_returns │
│ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ str ┆ str ┆ f64 │
╞════════════╪════════╪════════════╪═══════════════╡
│ 2024-08-07 ┆ UNH ┆ Healthcare ┆ -0.003994 │
│ 2024-08-08 ┆ UNH ┆ Healthcare ┆ 0.000283 │
│ 2024-08-09 ┆ UNH ┆ Healthcare ┆ -0.01321 │
│ 2024-08-12 ┆ UNH ┆ Healthcare ┆ 0.011687 │
│ 2024-08-13 ┆ UNH ┆ Healthcare ┆ 0.015832 │
└────────────┴────────┴────────────┴───────────────┘
```
**Computed Style Scores (cross-sectionally standardized):**
```text
┌────────────┬────────┬───────────┬───────────┐
│ date ┆ symbol ┆ sze_score ┆ val_score │
│ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ str ┆ f64 ┆ f64 │
╞════════════╪════════╪═══════════╪═══════════╡
│ 2024-08-07 ┆ UNH ┆ 0.165245 ┆ -1.065573 │
│ 2024-08-08 ┆ UNH ┆ 0.25773 ┆ -0.047803 │
│ 2024-08-09 ┆ UNH ┆ 0.212605 ┆ 0.13597 │
│ 2024-08-12 ┆ UNH ┆ 0.218459 ┆ -0.306406 │
│ 2024-08-13 ┆ UNH ┆ 0.25663 ┆ -0.153037 │
└────────────┴────────┴───────────┴───────────┘
sze_score | mean: -0.0000 | std: 0.9832 | min: -5.1927 | max: 0.6928
val_score | mean: -0.0000 | std: 0.4858 | min: -1.6171 | max: 3.1439
```
**Annualized Factor Performance (real market data, Aug 2024 - Dec 2025):**
```text
market | 21.26% | 18.64% | 1.14
sector_Finance | 21.86% | 13.20% | 1.66
sector_Healthcare | -4.83% | 14.85% | -0.33
sector_Technology | -17.03% | 12.17% | -1.40
sze_score | -44.93% | 6.65% | -6.75
val_score | -191.69% | 19.64% | -9.76
```
## Required Data Inputs
You'll need the following data to run a complete model estimation:
### 1. Daily Asset Returns and Market Data
Symbol-by-symbol daily data including returns, market caps, and valuation metrics:
```text
┌────────────┬────────┬────────────┬───────────────┐
│ date ┆ symbol ┆ sector ┆ asset_returns │
│ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ str ┆ str ┆ f64 │
╞════════════╪════════╪════════════╪═══════════════╡
│ 2024-08-07 ┆ AAPL ┆ Technology ┆ -0.004912 │
│ 2024-08-07 ┆ MSFT ┆ Technology ┆ 0.012543 │
│ 2024-08-07 ┆ JNJ ┆ Healthcare ┆ 0.003211 │
│ 2024-08-07 ┆ JPM ┆ Finance ┆ 0.008734 │
└────────────┴────────┴────────────┴───────────────┘
```
### 2. Sector Memberships (One-Hot Encoded)
One-hot encoded sector classifications for the factor model regression:
```text
┌────────────┬────────┬───────────────────┬───────────────────┬────────────────┐
│ date ┆ symbol ┆ sector_Technology ┆ sector_Healthcare ┆ sector_Finance │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ date ┆ str ┆ f64 ┆ f64 ┆ f64 │
╞════════════╪════════╪═══════════════════╪═══════════════════╪════════════════╡
│ 2024-08-07 ┆ AAPL ┆ 1.0 ┆ 0.0 ┆ 0.0 │
│ 2024-08-07 ┆ MSFT ┆ 1.0 ┆ 0.0 ┆ 0.0 │
│ 2024-08-07 ┆ JNJ ┆ 0.0 ┆ 1.0 ┆ 0.0 │
│ 2024-08-07 ┆ JPM ┆ 0.0 ┆ 0.0 ┆ 1.0 │
└────────────┴────────┴───────────────────┴───────────────────┴────────────────┘
```
## Usage Examples
### Computing Style Factor Scores
```rust,ignore
use factors::{Factor, FactorRegistry, FactorConfig};
use polars::prelude::*;
use chrono::NaiveDate;
// Create registry with default factors
let registry = FactorRegistry::with_defaults();
// Get factors by name
let momentum = registry.get("long_term_momentum").unwrap();
let size = registry.get("log_market_cap").unwrap();
let value = registry.get("book_to_market").unwrap();
// Or create a custom registry with custom configuration
let mut custom_registry = FactorRegistry::new();
let custom_config = FactorConfig {
trailing_days: 252, // 1 year lookback
half_life: 63, // 3 month exponential decay
lag: 20, // Skip most recent month
winsor_factor: 0.01, // 1% winsorization
..Default::default()
};
custom_registry.register_with_config("custom_momentum", custom_config);
// Compute scores from your data
let date = NaiveDate::from_ymd_opt(2024, 1, 1).unwrap();
let mom_scores = momentum.compute(&data.lazy(), date)?;
let sze_scores = size.compute(&data.lazy(), date)?;
let val_scores = value.compute(&data.lazy(), date)?;
```
### Estimating Factor Returns
```rust,ignore
use toraniko::model::{FactorReturnsEstimator, EstimatorConfig};
use toraniko::traits::ReturnsEstimator;
use toraniko::utils::top_n_by_group;
// Select top 3000 stocks by market cap for each date
let universe = top_n_by_group(
merged_df.lazy(),
3000, // top N
"market_cap", // ranking variable
&["date"], // group by
true, // filter (vs. add mask column)
);
// Configure the estimator
let config = EstimatorConfig {
winsor_factor: Some(0.05), // 5% winsorization on returns
residualize_styles: true, // Orthogonalize styles to sectors
};
let estimator = FactorReturnsEstimator::with_config(config);
// Estimate factor returns
let (factor_returns, residuals) = estimator.estimate(
returns_df.lazy(),
mkt_cap_df.lazy(),
sector_df.lazy(),
style_df.lazy(),
)?;
```
### Data Utilities
```rust,ignore
use toraniko::utils::{fill_features, smooth_features, top_n_by_group};
// Forward-fill missing values within each symbol
let filled = fill_features(
df.lazy(),
&["price", "volume"], // columns to fill
"date", // sort column
"symbol", // partition column
);
// Smooth with rolling mean
let smoothed = smooth_features(
df.lazy(),
&["returns"], // columns to smooth
"date", // sort column
"symbol", // partition column
5, // window size
);
```
## Performance
On an M1 MacBook, this estimates 10+ years of daily market, sector and style factor returns in under a minute.
Single-day WLS factor estimation benchmarks:
| 1,000 | ~74 microseconds |
| 3,000 | ~222 microseconds |
| 5,000 | ~538 microseconds |
Run benchmarks with:
```bash
cargo bench
```
## Running the Examples
The crate includes several examples demonstrating key functionality:
```bash
# Fetch real market data from Yahoo Finance and run the full pipeline
cargo run --package toraniko --example yahoo_finance
# Style factor computation
cargo run --package toraniko --example style_factors
# Factor return estimation
cargo run --package toraniko --example factor_returns
# Data utility functions
cargo run --package toraniko --example data_utilities
# Generate synthetic market data (for offline testing)
cargo run --package toraniko --example generate_data
# Full end-to-end pipeline with synthetic data
cargo run --package toraniko --example full_pipeline
```
## Attribution
This is a Rust port of the original Python [toraniko](https://github.com/0xfdf/toraniko) library by [0xfdf](https://github.com/0xfdf). The original implementation provides the mathematical foundation and algorithmic approach that this crate follows.
## License
MIT License - see [LICENSE](../../LICENSE) for details.