stochastic-rs 2.2.0

# stochastic-rs documentation site — plan

A maintainable, Vercel-deployed documentation site for the `stochastic-rs`
workspace. Living document — update as the site is built and the project
evolves.

## 0. Locked-in decisions

| Item       | Choice                                                       |
|------------|--------------------------------------------------------------|
| Framework  | **Fumadocs** (Next.js 15 App Router, MDX v10)                |
| Location   | `website/` at the workspace root                             |
| Language   | English only                                                  |
| Scope      | Full — guides + per-feature catalog + tutorials + Python parity + bench dashboard |
| Hosting    | Vercel (GitHub integration, preview deploys per PR)          |

Rationale: Fumadocs has the best out-of-the-box DX for math-heavy technical
docs (KaTeX wiring, MDX components like `Cards`/`Tabs`/`Steps`, fast Orama
search). `website/` keeps the site in-tree without colliding with the
existing audit notes under `docs/`. English-only matches the international
crates.io / PyPI audience.

## 1. Tech stack

- `fumadocs-ui` + `fumadocs-mdx` (latest)
- Next.js 15 (App Router, RSC)
- Tailwind v4
- KaTeX via `remark-math` + `rehype-katex` (wired in `source.config.ts`)
- Shiki for code highlighting (presets: `rust`, `python`, `bash`, `toml`)
- Orama search (built into Fumadocs; in-memory index)
- Optional: Mermaid via `@theguild/remark-mermaid` for the rare diagram

Custom React components (under `website/components/`):

- `BenchDashboard` — renders criterion JSON as sortable table with sparklines
- `PythonParityTable` — Rust-vs-Python feature parity matrix
- `ProcessCatalog` — auto-grids cards for a category from frontmatter
- `SDEFormula` — consistent SDE rendering wrapper around KaTeX
- `PaperRef` — reference card with DOI / arXiv link
- `RustExample` — `include`-s a file from `examples/` or `tests/`

## 2. Information architecture

Top-level sections, with **page counts** scoped to what already exists in the
Rust workspace today (counts auditable from `find … -name '*.rs'`):

| § | Section                | Pages | Priority |
|---|------------------------|-------|----------|
| 1 | Getting Started        | 4     | P0       |
| 2 | Concepts               | 8     | P0       |
| 3 | Stochastic Processes   | ~80   | P1       |
| 4 | Distributions          | 22    | P1       |
| 5 | Copulas                | 10    | P2       |
| 6 | Statistics & Estimators| ~30   | P2       |
| 7 | Quant Finance          | ~80   | P1       |
| 8 | AI Surrogates          | 5     | P2       |
| 9 | Visualization          | 2     | P3       |
| 10| Python Bindings        | 6     | P1       |
| 11| Tutorials              | 8–10  | P0       |
| 12| Benchmarks             | 1     | P2       |
| 13| API Reference          | 2     | P3       |
| 14| Migration (v1 → v2)    | 2     | P0       |
| 15| Contributing           | 5     | P2       |

Total ≈ **270 pages**. The vast majority follow strict templates (§3) so
authoring is mechanical.

### Concrete catalog inventory

Audited from current `src/` trees (2026-05-10):

- **Diffusion processes** (31): `ait_sahalia`, `cev`, `cfou`, `cir`, `ckls`,
  `fcir`, `feller`, `feller_root`, `fgbm`, `fjacobi`, `fou`, `fouque`, `gbm`,
  `gbm_ih`, `gbm_log`, `gompertz`, `hyperbolic`, `hyperbolic2`, `jacobi`,
  `kimura`, `linear_sde`, `logistic`, `modified_cir`, `nonlinear_sde`, `ou`,
  `pearson`, `quadratic`, `radial_ou`, `regime_switching`, `three_half`,
  `verhulst`
- **Jump processes** (16): `bates`, `bilateral_gamma`, `cgmy`, `cts`,
  `hawkes_jd`, `ig`, `jump_fou`, `jump_fou_custom`, `kobol`, `kou`,
  `levy_diffusion`, `merton`, `mjd_log`, `nig`, `rdts`, `vg`
- **Volatility processes** (11): `bates_svj`, `bergomi`, `double_heston`,
  `fbates_svj`, `fheston`, `heston`, `heston_log`, `hkde`, `rbergomi`,
  `sabr`, `svcgmy`
- **Interest-rate processes** (14): `adg`, `bgm`, `cir`, `cir_2f`,
  `duffie_kan`, `duffie_kan_jump_exp`, `fractional_vasicek`, `hjm`, `ho_lee`,
  `hull_white`, `hull_white_2f`, `lmm`, `vasicek`, `wu_zhang`
- **Rough processes** (6): `kernel`, `markov_lift`, `rl_bs`, `rl_fbm`,
  `rl_fou`, `rl_heston`
- **Noise** (5): `cfgns`, `cgns`, `fgn`, `gn`, `wn`
- **Distributions** (22 files; 19 user-facing types): `alpha_stable`, `beta`,
  `binomial`, `cauchy`, `chi_square`, `complex`, `exp`, `gamma`, `geometric`,
  `hypergeometric`, `inverse_gauss`, `lognormal`, `non_central_chi_squared`,
  `normal`, `normal_inverse_gauss`, `pareto`, `poisson`, `studentt`,
  `uniform`, `weibull` + traits / float impls / special funcs
- **Copulas — bivariate** (4): `clayton`, `frank`, `gumbel`, `independence`
- **Copulas — multivariate** (3): `gaussian`, `tree`, `vine`
- **Pricers** (~30): `asian`, `autocallable`, `barrier`, `basket`, `bermudan`,
  `bjerksund_stensland`, `breeden_litzenberger`, `bsm`, `cgmysv`, `chooser`,
  `cliquet`, `compound`, `digital`, `dupire`, `engines`, `execution_cost`,
  `finite_difference`, `fourier`, `heston`, `heston_stoch_corr`, `kirk`,
  `lookback`, `malliavin_gbm`, `malliavin_greeks`, `merton_jump`, `pnl`,
  `rainbow`, `rbergomi`, `regime_switching`, `sabr`, `slv`, `snell_envelope`,
  `spread`, `variance_swap`
- **Calibrators** (12): `bsm`, `cgmysv`, `double_heston`, `heston`,
  `heston_stoch_corr`, `hkde`, `hw_swaption`, `levy`, `rbergomi`, `sabr`,
  `sabr_caplet`, `svj`
- **Stats**: MLE (3 files: `density`, `fit`, `process_impls`), realized
  (`bipower`, `har`, `kernel`, `pre_averaging`, `two_scale`, `variance`),
  econometrics (`changepoint`, `cointegration`, `granger`, `hmm`),
  filtering (`mcmc`, `particle`, `ukf`), normality (`anderson_darling`,
  `jarque_bera`, `shapiro_francia`), stationarity (`adf`, `ers_dfgls`,
  `kpss`, `leybourne_mccabe`, `phillips_perron`), plus standalone
  (`fukasawa_hurst`, `gaussian_kde`, `heston_mle`, `heston_nml_cekf`,
  `tail_index`, `spectral`, `leverage`, `fou_estimator`, `cir`,
  `double_exp`)
- **AI surrogates** (3 model types under `volatility/`): `heston`,
  `one_factor` (Bergomi-1F), `rbergomi`
- **Quant — vol surface** (8): `analytics`, `arbitrage`, `implied`,
  `model_surface`, `pipeline`, `sabr_smile`, `ssvi`, `svi`
- **Quant — risk** (8): `credit`, `drawdown`, `execution`,
  `expected_shortfall`, `greeks`, `performance`, `scenario`, `var`
- **Quant — credit** (5): `bootstrap`, `cds`, `merton`, `migration`,
  `survival_curve`
- **Quant — curves** (8): `bootstrap`, `discount_curve`, `interpolation`,
  `linalg`, `multi_curve`, `nelson_siegel`, `svensson`, `types`
- **Quant — bonds** (3): `cir`, `hull_white`, `vasicek`
- **Quant — cashflows** (4): `coupon`, `engine`, `leg`, `types`
- **Quant — portfolio** (6): `covariance`, `data`, `engine`, `momentum`,
  `optimizers`, `types`
- **Quant — microstructure**: Almgren-Chriss, Kyle, Bouchaud propagator,
  Roll / Corwin-Schultz, order book
- **Quant — factors / strategies**: PCA, Fama-MacBeth, Ledoit-Wolf,
  pairs trading, regime engine

## 3. Page templates

Eight templates cover ~95% of the catalog. Each is enforced by a frontmatter
schema (zod).

### 3.1 Process page (used ~80×)

```mdx
---
title: Ornstein-Uhlenbeck (OU)
description: Mean-reverting Gaussian diffusion with constant volatility.
category: process
subcategory: diffusion
crate: stochastic-rs-stochastic
module_path: stochastic_rs::stochastic::diffusion::ou
since: 2.0.0
status: stable
references:
  - { author: "Uhlenbeck & Ornstein", year: 1930, title: "On the theory of the Brownian motion" }
---

# Ornstein-Uhlenbeck (OU)

> One-line summary.

## SDE

$$ dX_t = \theta(\mu - X_t)\,dt + \sigma\,dW_t,\quad X_0 = x_0 $$

Closed-form transition density: $X_t \mid X_s \sim \mathcal N(\dots)$.

## Constructor

| Parameter | Type | Description    |
|-----------|------|----------------|
| `theta`   | `T`  | Mean-reversion |
| `mu`      | `T`  | Long-run mean  |
| `sigma`   | `T`  | Volatility     |
| `n`       | `usize` | Steps       |
| `x0`      | `Option<T>` | Initial value |
| `t`       | `Option<T>` | Horizon |

## Examples

<Tabs items={['Rust', 'Python']}>
<Tab value="Rust">
<RustExample path="examples/ou_quickstart.rs" />
</Tab>
<Tab value="Python">
~~~python
from stochastic_rs import OU
p = OU(theta=2.0, mu=0.0, sigma=1.0, n=1000, x0=0.0, t=1.0)
path = p.sample()
~~~
</Tab>
</Tabs>

## Properties

- **Stationary**: yes (Gaussian, mean $\mu$, variance $\sigma^2/(2\theta)$)
- **Markov**: yes
- **Closed-form transition**: yes
- **GPU**: no (CPU SIMD via `f64x4`)

## Calibration / estimation

See [MLE for OU](/docs/stats/mle/process-impls#ou) and
[Fukasawa Hurst](/docs/stats/hurst/fukasawa) for fOU variants.

## See also

- [fOU](/docs/processes/diffusion/fou) — fractional OU
- [CIR](/docs/processes/diffusion/cir) — square-root OU variant
- [Vasicek](/docs/processes/interest/vasicek) — interest-rate parameterisation

## References

<PaperRef doi="..." title="..." />
```

### 3.2 Distribution page (used 19×)

Sections: PDF / CDF / characteristic function (with $\LaTeX$ math), constructor,
Rust + Python example, **closed-form moments** from `DistributionExt` (mean,
variance, skewness, kurtosis), KS-test note, references. Mark explicitly when
a moment is `unimplemented!` per
`memory/project_distribution_ext_status.md`.

### 3.3 Pricer page (~30×)

Sections: model, payoff, method (closed-form / Fourier / MC / FD / lattice),
constructor signature, Rust + Python example, Greeks support
(✅ via `GreeksExt` / ❌), calibration link if applicable, complexity
(O(steps · paths) etc.), references. For MC pricers note variance-reduction
support.

### 3.4 Calibrator page (~12×)

Sections: model, market data needed (strikes / IVs / quote types),
optimiser (LM / DE / NMLE-CEKF / Cui analytic Jacobian), `CalibrationResult`
fields, success criteria (typical RMSE), Rust example, references. Always
link back to the corresponding pricer page.

### 3.5 Estimator page (~30×)

Sections: estimand, method, result struct fields, asymptotic distribution
or bootstrap-CI note, p-value semantics, Rust + Python example, paper
reference. **Required**: cite the paper the implementation follows
verbatim (per `feedback_implementation.md`).

### 3.6 Copula page (10×)

Sections: $C(u, v; \theta)$ formula, dependence parameter range, Kendall's
$\tau$ / Spearman's $\rho$ closed forms, sampling algorithm, tail-dependence
coefficients, Rust + Python example. Per `copula-bivariate` SKILL.

### 3.7 AI surrogate page (3×)

Sections: model spec (`StochVolModelSpec`), input scaler / output scaler,
training-set source (gzip-npy path), `train_save_load` test reference,
`predict_surface` integration with `ImpliedVolSurface::from_flat_iv_grid`,
benchmark vs analytical / Fourier baseline. Per `vol-surrogate-nn` SKILL.

### 3.8 Concept page (8×)

Free-form: `prelude`, trait deep-dives (`FloatExt`, `ProcessExt`,
`DistributionExt`, `PricerExt`, `Calibrator`), feature flags, design
philosophy. Cross-link to `dev-rules` SKILL where appropriate.

## 4. File tree

```
website/
├── package.json
├── pnpm-lock.yaml
├── next.config.mjs
├── source.config.ts        # MDX + KaTeX config
├── tailwind.config.ts
├── tsconfig.json
├── README.md               # build / preview / deploy
├── public/
│   ├── bench/*.json        # criterion exports
│   ├── python-parity.json  # generated by scripts/python-parity.ts
│   └── og/*.png            # OpenGraph images
├── app/
│   ├── (home)/page.tsx
│   ├── docs/[[...slug]]/page.tsx
│   ├── docs/layout.tsx
│   ├── api/search/route.ts
│   ├── layout.config.tsx
│   ├── layout.tsx
│   └── global.css
├── content/docs/
│   ├── meta.json
│   ├── index.mdx
│   ├── getting-started/
│   ├── concepts/
│   ├── processes/{diffusion,jump,volatility,interest,rough,noise}/
│   ├── distributions/
│   ├── copulas/{bivariate,multivariate}/
│   ├── stats/{mle,realized,econometrics,filtering,normality,stationarity,hurst}/
│   ├── quant/{pricing,calibration,instruments,vol-surface,risk,credit,curves,bonds,cashflows,portfolio,microstructure,factors}/
│   ├── ai/
│   ├── viz/
│   ├── python/
│   ├── tutorials/
│   ├── benchmarks/
│   ├── api/
│   ├── migration/
│   └── contributing/
├── components/
│   ├── BenchDashboard.tsx
│   ├── PythonParityTable.tsx
│   ├── ProcessCatalog.tsx
│   ├── SDEFormula.tsx
│   ├── PaperRef.tsx
│   └── RustExample.tsx
└── scripts/
    ├── python-parity.ts    # generate parity-table JSON from py crate AST
    ├── docs-audit.ts       # diff Rust public API vs docs pages
    └── lint-mdx.ts         # frontmatter schema check
```

## 5. Source-to-doc consistency

Two rails to prevent rot:

### 5.1 Doctest-backed examples (preferred for hot pages)

Each Rust example for landing / quickstart / top-N processes is also a
file under `tests/doctest_*.rs` (or `examples/`). The MDX `<RustExample>`
component reads the file at build time and inlines the contents inside a
fenced `rust` block. Result: `cargo test --workspace` validates the
examples; broken examples fail CI before the docs deploy.

### 5.2 Inline + sha-pinned (fallback for the long tail)

Pages that can't be doctested annotate each fence with
`// last-checked: <commit-sha>`. The `scripts/docs-audit.ts` quarterly job
flags pages whose pinned sha is older than 90 days while the corresponding
Rust source has changed.
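
The staleness rule above combines two conditions: the pin is older than 90 days, and the Rust source changed after the pin. A minimal sketch of that check follows. The `PinnedFence` shape and its field names are hypothetical, and the sketch assumes the commit dates have already been resolved from git.

```ts
// Hypothetical record for one `// last-checked: <commit-sha>` fence.
interface PinnedFence {
  page: string;
  pinnedAt: Date;        // commit date of the pinned sha
  sourceChangedAt: Date; // date of the latest commit touching the Rust source
}

const STALE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A fence is flagged only when BOTH conditions hold:
// the pin is older than 90 days AND the source moved after the pin.
function isStaleFence(fence: PinnedFence, now: Date): boolean {
  const ageDays = (now.getTime() - fence.pinnedAt.getTime()) / MS_PER_DAY;
  const sourceMoved = fence.sourceChangedAt.getTime() > fence.pinnedAt.getTime();
  return ageDays > STALE_AFTER_DAYS && sourceMoved;
}
```

An old pin on untouched source is deliberately not flagged, which keeps the quarterly report short.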

Recommendation: start everything with §5.2 (fast), promote any page that
graduates into the top-50-traffic set to §5.1.

## 6. Per-page metadata schema

Strict zod schema in `source.config.ts`:

```ts
{
  title: z.string().min(1),
  description: z.string().min(20).max(160),
  category: z.enum(['process', 'distribution', 'copula', 'estimator',
                    'pricer', 'calibrator', 'concept', 'tutorial',
                    'reference']),
  subcategory: z.string().optional(),
  crate: z.string().regex(/^stochastic-rs(-\w+)?$/),
  module_path: z.string(),                // rustdoc path
  since: z.string().regex(/^\d+\.\d+/),   // semver; only the major.minor prefix is checked
  status: z.enum(['stable', 'experimental', 'deprecated']),
  features: z.array(z.string()).default([]),  // required Cargo features
  references: z.array(z.object({
    author: z.string(),
    year: z.number(),
    title: z.string(),
    doi: z.string().optional(),
    arxiv: z.string().optional(),
    url: z.string().url().optional(),
  })).default([]),
}
```

The `module_path` lets us auto-render a "View on docs.rs" button on every
catalog page.
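
One way the button target could be derived is sketched below. It rests on two assumptions that the plan does not state: `module_path` names a module (as in the OU template), and its first segment is the lib name of a crate published under the same name with hyphens. The function name `docsRsUrl` is illustrative.

```ts
// Build a docs.rs link from a Rust module path such as
// `stochastic_rs::stochastic::diffusion::ou`.
function docsRsUrl(modulePath: string, version: string = 'latest'): string {
  const segments = modulePath.split('::');
  const isIdent = (s: string): boolean => /^[A-Za-z_][A-Za-z0-9_]*$/.test(s);
  if (!segments.every(isIdent)) {
    throw new Error('not a Rust module path: ' + modulePath);
  }
  // Assumption: lib name `stochastic_rs` maps to package name `stochastic-rs`.
  const packageName = (segments[0] ?? '').replace(/_/g, '-');
  // docs.rs serves module pages at <package>/<version>/<module path>/index.html.
  return 'https://docs.rs/' + packageName + '/' + version + '/' + segments.join('/') + '/index.html';
}
```

Pages whose `module_path` points at a type rather than a module would need a different final path segment, which this sketch does not handle.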

## 7. Custom components (sketch)

### `<BenchDashboard />`

```
Inputs:  /public/bench/*.json (criterion exports, one per bench)
Output:  Sortable table, columns: bench / ns/iter / throughput /
         delta vs `rc2` baseline / sparkline (last 10 commits)
Update:  GitHub Action `bench-publish.yml` runs `cargo bench --bench X --
         --output-format bencher`, post-processes to JSON, commits to
         website/public/bench/. Triggered weekly + on tag.
```
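
The post-processing step could look roughly like the sketch below. It assumes the bencher-style lines have the usual `test <name> ... bench: <ns> ns/iter (+/- <dev>)` shape. The `BenchRow` fields and the baseline lookup table are hypothetical, not the dashboard's actual JSON schema.

```ts
interface BenchRow { bench: string; nsPerIter: number; deviation: number; }
interface BenchDelta extends BenchRow { deltaPct: number | null; }

const BENCHER_LINE =
  /^test\s+(\S+)\s+\.\.\.\s+bench:\s+([\d,]+)\s+ns\/iter\s+\(\+\/-\s+([\d,]+)\)/;

// Keep only lines in bencher format; thousands separators are stripped.
function parseBencher(output: string): BenchRow[] {
  const rows: BenchRow[] = [];
  for (const line of output.split('\n')) {
    const m = BENCHER_LINE.exec(line.trim());
    if (!m) continue;
    const [, bench = '', ns = '0', dev = '0'] = m;
    rows.push({
      bench,
      nsPerIter: Number(ns.replace(/,/g, '')),
      deviation: Number(dev.replace(/,/g, '')),
    });
  }
  return rows;
}

// Percentage change against a baseline run; null when no baseline exists.
function withDelta(rows: BenchRow[], baseline: Record<string, number>): BenchDelta[] {
  return rows.map((row) => {
    const base = baseline[row.bench];
    const deltaPct = base !== undefined && base > 0
      ? ((row.nsPerIter - base) * 100) / base
      : null;
    return { bench: row.bench, nsPerIter: row.nsPerIter, deviation: row.deviation, deltaPct };
  });
}
```

A positive `deltaPct` means the bench got slower than the baseline, so the table can sort on it directly to surface regressions.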

### `<PythonParityTable />`

```
Inputs:  /public/python-parity.json (generated by scripts/python-parity.ts)
Status:  ✅ exposed | ⚠️ partial | ❌ Rust-only | 🔮 planned
Source:  Scan stochastic-rs-py/src/ for py_distribution! / py_process! /
         #[pyclass] macros, cross-reference against the catalog index.
Update:  `pnpm python:parity` runs the script. Commit on PR boundaries.
```
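
A rough sketch of the scan and cross-reference follows. The macro grammar is an assumption: it treats the first identifier inside `py_process!(...)` or `py_distribution!(...)` as the exposed name, which may not match the real macros. It also derives only two of the four statuses, since "partial" and "planned" need human input.

```ts
type ParityStatus = 'exposed' | 'rust-only';

// Assumption: `py_process!(Name, ...)`; the real macro arguments may differ.
const PY_MACRO = /\bpy_(?:process|distribution)!\s*[({\[]\s*([A-Za-z_]\w*)/;
// `#[pyclass]` (optionally with arguments or further attributes) on a struct.
const PYCLASS_STRUCT =
  /#\[pyclass[^\]]*\]\s*(?:#\[[^\]]*\]\s*)*(?:pub\s+)?struct\s+([A-Za-z_]\w*)/;

// Collect capture group 1 of every match of `pattern` in `source`.
function collectNames(source: string, pattern: RegExp): string[] {
  const re = new RegExp(pattern.source, 'g');
  const names: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    if (m[1]) names.push(m[1]);
  }
  return names;
}

// Cross-reference catalog entries against the names found in the bindings.
function parityMatrix(catalog: string[], rustSource: string): Record<string, ParityStatus> {
  const exposed = collectNames(rustSource, PY_MACRO)
    .concat(collectNames(rustSource, PYCLASS_STRUCT));
  const matrix: Record<string, ParityStatus> = {};
  for (const entry of catalog) {
    matrix[entry] = exposed.indexOf(entry) !== -1 ? 'exposed' : 'rust-only';
  }
  return matrix;
}
```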

### `<ProcessCatalog category="diffusion" />`

```
Reads:   frontmatter from all MDX in content/docs/processes/<category>/
Renders: card grid (name, one-liner, 'View →' link)
```
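
The card-building step can be sketched without any framework code, assuming the frontmatter has already been parsed. The `CatalogPage` and `CatalogCard` shapes are hypothetical, and the sketch assumes the component's `category` prop equals the page's `subcategory` field and the directory name.

```ts
interface PageFrontmatter {
  title: string;
  description: string;
  category: string;
  subcategory?: string;
}
interface CatalogPage { slug: string; frontmatter: PageFrontmatter; }
interface CatalogCard { title: string; oneLiner: string; href: string; }

// Select the process pages of one subcategory and turn them into cards,
// sorted by title so the grid order is stable across builds.
function catalogCards(pages: CatalogPage[], category: string): CatalogCard[] {
  return pages
    .filter((p) => p.frontmatter.category === 'process'
      && p.frontmatter.subcategory === category)
    .sort((a, b) => a.frontmatter.title.localeCompare(b.frontmatter.title))
    .map((p) => ({
      title: p.frontmatter.title,
      oneLiner: p.frontmatter.description,
      href: '/docs/processes/' + category + '/' + p.slug,
    }));
}
```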

### `<RustExample path="..." />`

```
Reads:   file at <workspace_root>/<path> (e.g. tests/doctest_ou.rs)
Renders: shiki-highlighted rust block
Errors:  build fails if the path doesn't exist (catches drift)
```
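
The read-or-fail behaviour can be sketched as a pure function. Here the file reader is injected so the logic is testable in isolation; in the real component it would presumably wrap Node's `fs.readFileSync`. The `..` guard is an addition of this sketch, not something the plan requires.

```ts
// Returns the file contents, or undefined when the file does not exist.
type ReadFile = (absolutePath: string) => string | undefined;

const FENCE = '```';

function renderRustExample(workspaceRoot: string, relPath: string, readFile: ReadFile): string {
  // Reject absolute paths and `..` so a page cannot read outside the workspace.
  if (relPath.charAt(0) === '/' || relPath.split('/').indexOf('..') !== -1) {
    throw new Error('RustExample: path must stay inside the workspace: ' + relPath);
  }
  const absolutePath = workspaceRoot.replace(/\/+$/, '') + '/' + relPath;
  const source = readFile(absolutePath);
  if (source === undefined) {
    // Throwing here fails the build, which is what catches drift.
    throw new Error('RustExample: file not found: ' + absolutePath);
  }
  // Emit a fenced rust block; trailing whitespace is trimmed.
  return FENCE + 'rust\n' + source.replace(/\s+$/, '') + '\n' + FENCE;
}
```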

## 8. Deployment

### Vercel project

- Repo: `dancixx/stochastic-rs`
- Root directory: `website/`
- Framework preset: **Next.js**
- Build command: `pnpm build`
- Install command: `pnpm install --frozen-lockfile`
- Output: `.next` (default)
- Node: 20 LTS

### CI (`.github/workflows/website.yml`)

Triggered on `push` to `main` (paths-filter: `website/**`, `examples/**`,
`*/src/**`, `tests/**`):

1. `pnpm tsc --noEmit`
2. `pnpm lint` (eslint + frontmatter schema check via `scripts/lint-mdx.ts`)
3. `pnpm build` (Fumadocs build, fails on broken internal links)
4. `cargo test --test 'doctest_*'` — validates `<RustExample>` files
5. Vercel auto-deploys on green main.

PR previews: handled by the Vercel GitHub App (no extra workflow needed).

### Domain

- Primary candidate: **`stochastic-rs.dev`** (Google-managed `.dev` TLD,
  forced HTTPS, on-brand)
- Fallbacks: `stochastic-rs.org`, `stochastic-rs.com`
- Until registered: `stochastic-rs.vercel.app`

## 9. Phased rollout

Ten phases (0–9), each shippable. Phase 0+1 = MVP.

| Phase | Effort     | Deliverable                                                   | Pages |
|-------|------------|---------------------------------------------------------------|-------|
| 0     | ½ day      | Scaffold (Fumadocs init, Vercel link, KaTeX, Tailwind, theme) | 0     |
| 1     | 2 days     | Landing + Getting Started + Concepts (8) + 2 hero tutorials   | ~15   |
| 2     | 1 week     | Processes: diffusion (31) + jump (16) + noise (5)             | ~50   |
| 3     | 4 days     | Processes: volatility (11) + interest (14) + rough (6)        | ~35   |
| 4     | 4 days     | Distributions (19) + copulas (10)                             | ~30   |
| 5     | 5 days     | Quant: pricing (~30) + calibration (~12)                      | ~45   |
| 6     | 3 days     | Quant: vol surface, risk, credit, curves, bonds, instruments  | ~35   |
| 7     | 3 days     | Stats catalog                                                  | ~30   |
| 8     | 2 days     | AI + viz + Python parity table + remaining tutorials          | ~15   |
| 9     | 2 days     | Bench dashboard + contributing + polish + OG images           | ~10   |

**Cumulative**: ~30.5 working days summed across the table, i.e. ~6 weeks
full-time (proportionally longer part-time). **MVP** (phase 0+1) shippable
in 2.5 days.

## 10. Maintenance workflow

### 10.1 Per-PR rule

Any PR that adds a public type in
`stochastic-rs-{stochastic,distributions,copulas,quant,stats,ai}` MUST add
a corresponding MDX page. Enforced by `scripts/docs-audit.ts` running in
CI:

1. Parse diff for `pub struct` / `pub enum` additions in `src/`
2. Look up `module_path` in the existing MDX frontmatter set
3. Fail with a "missing docs page" error if any addition has no page
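
The three steps above can be sketched as follows, working on unified diff text. The mapping from a file path to a `module_path` suffix is an assumption of this sketch (`.../src/diffusion/ou.rs` is taken to document as a path ending in `diffusion::ou`), and all function names are illustrative.

```ts
interface AddedType { file: string; kind: 'struct' | 'enum'; typeName: string; }

// Step 1: collect `pub struct` / `pub enum` lines added under a `src/` tree.
function addedPublicTypes(unifiedDiff: string): AddedType[] {
  const found: AddedType[] = [];
  let currentFile = '';
  for (const line of unifiedDiff.split('\n')) {
    const header = /^\+\+\+ b\/(.+)$/.exec(line);
    if (header) { currentFile = header[1] ?? ''; continue; }
    const decl = /^\+\s*pub\s+(struct|enum)\s+([A-Za-z_]\w*)/.exec(line);
    if (decl && /(^|\/)src\//.test(currentFile)) {
      found.push({
        file: currentFile,
        kind: decl[1] as 'struct' | 'enum',
        typeName: decl[2] ?? '',
      });
    }
  }
  return found;
}

// Assumed mapping: strip everything up to `src/`, drop `.rs` and `/mod`.
function moduleSuffix(file: string): string {
  return file.replace(/^.*?src\//, '').replace(/\.rs$/, '')
    .replace(/\/mod$/, '').split('/').join('::');
}

// Steps 2 and 3: keep additions with no matching frontmatter `module_path`.
function missingDocs(added: AddedType[], documentedModulePaths: string[]): AddedType[] {
  return added.filter((t) => {
    const suffix = '::' + moduleSuffix(t.file);
    return !documentedModulePaths.some((p) => p.slice(-suffix.length) === suffix);
  });
}
```

A CI wrapper would exit non-zero and print a "missing docs page" error whenever `missingDocs` returns a non-empty list.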

The audit is also a quarterly job (cron) to catch drift from edits that
slipped past the diff check.

### 10.2 SKILL: docs-writing

Add `.claude/skills/docs-writing/SKILL.md` with the eight page templates
above so a future Claude session can stub a page in seconds when a new
process / distribution / pricer ships. Include:

- Template per category (copy-paste ready)
- Frontmatter required fields
- KaTeX gotchas (escaping, `$$` vs `\\(`)
- Where to add the entry in `meta.json` (sidebar order)

### 10.3 Quarterly content audit

`scripts/docs-audit.ts` also produces a markdown report:

- Pages whose `module_path` no longer resolves (Rust source moved/removed)
- Pages whose pinned sha (§5.2) is stale + Rust source changed
- Public types with no documentation page (cross-checked against §10.1)
- `references` entries with broken DOI / arXiv links

Output: `docs/DOCSITE_AUDIT_<date>.md`, mirroring the existing audit
naming convention.
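
A minimal sketch of the report writer follows. The `AuditFindings` shape is hypothetical, and rendering `<date>` as ISO `YYYY-MM-DD` is an assumption, since the plan does not fix the date format.

```ts
interface AuditFindings {
  unresolvedModulePaths: string[];
  stalePins: string[];
  undocumentedTypes: string[];
  brokenReferences: string[];
}

// Assumption: <date> is ISO YYYY-MM-DD in UTC.
function auditReportPath(date: Date): string {
  return 'docs/DOCSITE_AUDIT_' + date.toISOString().slice(0, 10) + '.md';
}

// One section per finding type, each with a count and a bullet list.
function renderAuditReport(findings: AuditFindings, date: Date): string {
  const sections: Array<[string, string[]]> = [
    ['Pages whose module_path no longer resolves', findings.unresolvedModulePaths],
    ['Pages with a stale pinned sha', findings.stalePins],
    ['Public types with no documentation page', findings.undocumentedTypes],
    ['References with broken DOI / arXiv links', findings.brokenReferences],
  ];
  const lines: string[] = ['# Docsite audit ' + date.toISOString().slice(0, 10), ''];
  for (const [heading, items] of sections) {
    lines.push('## ' + heading + ' (' + items.length + ')', '');
    if (items.length === 0) {
      lines.push('None.');
    } else {
      for (const item of items) lines.push('- ' + item);
    }
    lines.push('');
  }
  return lines.join('\n');
}
```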

## 11. Open questions

- **Versioning**: defer until after `2.0.0` stable is shipped. For now,
  `main` only. Old releases reachable via git tags
  (`git checkout v1.x -- website && pnpm dev`). Revisit when `3.x` work
  starts.
- **Domain**: register `stochastic-rs.dev` (or `.org`) before Phase 0
  finishes — DNS propagation lag.
- **Doctest sync** (§5): start §5.2 across the board; promote
  hero pages to §5.1 in Phase 9.
- **Bench dashboard data freshness**: weekly cron sufficient, or
  on-tag only? Default to weekly + on-tag.
- **Search relevance**: default to Fumadocs Orama. If users complain,
  evaluate Pagefind (build-time index) or Algolia DocSearch (free
  for OSS).

## 12. Explicit non-goals

- **Auto-generated rustdoc-style pages**: docs.rs already handles that.
  We hand-curate per the templates in §3 — narrative > exhaustive.
- **i18n**: locked English-only. Reconsider only if a >10% non-English
  user base shows up (very unlikely for a Rust quant lib).
- **Notebook/Jupyter**: out of scope. Tutorials are MDX with `<Tabs>`
  for Rust + Python code; no interactive runtime.
- **Dark-mode-only or light-mode-only**: ship both, system preference
  default (Fumadocs handles this).
- **Marketing site / blog**: this is a docs site. A blog can be added
  later under `/blog/` with `fumadocs-mdx` content collections, but is
  not in this plan.

## 13. References

Internal:

- `CLAUDE.md` — workspace layout, traits, prelude
- `docs/V1_TO_V2.md` — port to `migration/v1-to-v2.mdx`
- `docs/WORKSPACE_MIGRATION.md` — context for the workspace section
- `docs/BENCH_BASELINE.md` — feeds the `<BenchDashboard />`
- `docs/PYTHON_BINDING_UPDATE_PLAN.md` — Python catalog scope
- `.claude/skills/*` — author hints; the docs site links out to selected
  SKILLs from the `contributing/` section

External (framework docs):

- [Fumadocs](https://fumadocs.dev)
- [Fumadocs MDX math setup](https://www.fumadocs.dev/docs/markdown/math)
- [Vercel: Next.js project setup](https://vercel.com/docs/frameworks/nextjs)