//! A Rust library for performing GLM regression with data represented in
//! [`ndarray`](https://docs.rs/ndarray/)s.
//! The [`ndarray-linalg`](https://docs.rs/ndarray-linalg/) crate is used to allow
//! optimization of linear algebra operations with BLAS.
//!
//! Numerical accuracy cannot be 100% guaranteed. The tests include several comparisons
//! with R's `glm` and `glmnet` packages, but some cases may not be covered directly or
//! may involve inherent ambiguities or imprecisions.
//!
//! # Feature summary:
//!
//! * Many common exponential families (see table below for full list)
//! * Generic over floating-point type
//! * L1 (lasso), L2 (ridge), and elastic net regularization
//! * Statistical tests of fit result
//! * Weighted observations (both frequency and variance)
//! * Alternative and custom link functions
//!
//! # Setting up BLAS backend
//!
//! See the [backend features of
//! `ndarray-linalg`](https://github.com/rust-ndarray/ndarray-linalg#backend-features)
//! for a description of the available BLAS configurations. You do not need to
//! include `ndarray-linalg` in your crate; simply provide the feature you need to
//! `ndarray-glm` and it will be forwarded to `ndarray-linalg`.
//!
//! Examples using OpenBLAS are shown here. In principle you should also be able to use
//! Netlib or Intel MKL, although these backends are untested.
//!
//! ## System OpenBLAS (recommended)
//!
//! Ensure that the development OpenBLAS library is installed on your system. In
//! Debian/Ubuntu, for instance, this means installing `libopenblas-dev`. Then, put the
//! following into your crate's `Cargo.toml`:
//! ```text
//! ndarray = { version = "0.17", features = ["blas"] }
//! ndarray-glm = { version = "0.1", features = ["openblas-system"] }
//! ```
//!
//! ## Compile OpenBLAS from source
//!
//! This option does not require OpenBLAS to be installed on your system, but the
//! initial compile time will be very long. Use the following lines in your crate's
//! `Cargo.toml`:
//! ```text
//! ndarray = { version = "0.17", features = ["blas"] }
//! ndarray-glm = { version = "0.1", features = ["openblas-static"] }
//! ```
//!
//! # Examples:
//!
//! Basic linear regression:
//! ```
//! use ndarray_glm::{array, Linear, ModelBuilder};
//!
//! let data_y = array![0.3, 1.3, 0.7];
//! let data_x = array![[0.1, 0.2], [-0.4, 0.1], [0.2, 0.4]];
//! let model = ModelBuilder::<Linear>::data(&data_y, &data_x).build().unwrap();
//! let fit = model.fit().unwrap();
//! // The result is a flat array with the first term as the intercept.
//! println!("Fit result: {}", fit.result);
//! ```
//!
//! L2 regularization:
//! ```
//! use ndarray_glm::{array, Linear, ModelBuilder};
//!
//! let data_y = array![0.3, 1.3, 0.7];
//! let data_x = array![[0.1, 0.2], [-0.4, 0.1], [0.2, 0.4]];
//! // The model builder standardizes the design matrix internally by default, so the penalty
//! // is applied uniformly across features regardless of their scale. The returned coefficients
//! // are always in the original (un-standardized) coordinate system.
//! let model = ModelBuilder::<Linear>::data(&data_y, &data_x).build().unwrap();
//! // L2 (ridge) regularization can be applied with l2_reg().
//! let fit = model.fit_options().l2_reg(1e-5).fit().unwrap();
//! println!("Fit result: {}", fit.result);
//! ```
//!
//! Logistic regression with a non-canonical link function. The fit options may need adjusting,
//! as these fits are typically more difficult to converge:
//! ```
//! use ndarray_glm::{array, Logistic, logistic_link::Cloglog, ModelBuilder};
//!
//! let data_y = array![true, false, false, true, true];
//! let data_x = array![[0.5, 0.2], [0.1, 0.3], [0.2, 0.6], [0.6, 0.3], [0.4, 0.4]];
//! let model = ModelBuilder::<Logistic<Cloglog>>::data(&data_y, &data_x).build().unwrap();
//! let fit = model.fit_options().max_iter(256).l2_reg(0.1).fit().unwrap();
//! println!("Fit result: {}", fit.result);
//! ```
//!
//! # Generalized linear models
//!
//! A generalized linear model (GLM) relates the expected value of a response
//! variable $`y`$, transformed through a *link function* $`g`$, to a linear
//! combination of $`K-1`$ feature variables $`x_k`$ (plus an intercept term):
//!
//! ```math
//! g(\text{E}[y]) = \beta_0 + \beta_1 x_1 + \ldots + \beta_{K-1} x_{K-1}
//! ```
//!
//! The right-hand side is the *linear predictor* $`\omega = \mathbf{x}^\top \boldsymbol{\beta}`$.
//! With $`N`$ observations the data matrix $`\mathbf{X}`$ has a leading column of
//! ones (for the intercept) and the model relates the response vector
//! $`\mathbf{y}`$ to $`\mathbf{X}`$ and parameters $`\boldsymbol{\beta}`$.
//!
//! ## Exponential family
//!
//! GLMs assume the response follows a distribution from the exponential family.
//! In canonical form the density is
//!
//! ```math
//! f(y;\eta) = \exp\!\bigl[\eta\, y - A(\eta) + B(y)\bigr]
//! ```
//!
//! where $`\eta`$ is the *natural parameter*, $`A(\eta)`$ is the *log-partition
//! function*, and $`B(y)`$ depends only on the observation.
//! The expected value and variance of $`y`$ follow directly from $`A`$:
//!
//! ```math
//! \text{E}[y] = A'(\eta), \qquad \text{Var}[y] = \phi\, A''(\eta)
//! ```
//!
//! where $`\phi`$ is the *dispersion parameter* (fixed to 1 for some families and estimated from
//! the data for others).
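//!
//! As a concrete example, the Poisson distribution with mean $`\mu`$ takes this form
//! with $`\eta = \log \mu`$:
//!
//! ```math
//! f(y;\eta) = \frac{\mu^y e^{-\mu}}{y!} = \exp\!\bigl[\eta\, y - e^{\eta} - \log y!\bigr]
//! ```
//!
//! so $`A(\eta) = e^{\eta}`$ and $`B(y) = -\log y!`$, giving
//! $`\text{E}[y] = A'(\eta) = e^{\eta} = \mu`$ and $`\text{Var}[y] = A''(\eta) = \mu`$
//! with $`\phi = 1`$.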
//!
//! ## Link and variance functions
//!
//! The *link function* $`g`$ maps the expected response $`\mu = \text{E}[y]`$ to
//! the linear predictor:
//!
//! ```math
//! g(\mu) = \omega, \qquad \mu = g^{-1}(\omega)
//! ```
//!
//! The *canonical link* is the one for which $`\eta = \omega`$, i.e. the natural
//! parameter equals the linear predictor. In general, in terms of an arbitrary link function $`g`$
//! and canonical link function $`g_0`$, we have $`\eta(\omega) = g_0(g^{-1}(\omega))`$. Non-canonical
//! links are supported in principle (see [`link`]).
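//!
//! For example, with the Bernoulli family the canonical link is the logit,
//! $`g_0(\mu) = \log[\mu/(1-\mu)]`$. Choosing the complementary log-log link
//! $`g(\mu) = \log[-\log(1-\mu)]`$ instead gives $`g^{-1}(\omega) = 1 - \exp(-e^{\omega})`$, so
//!
//! ```math
//! \eta(\omega) = g_0\bigl(g^{-1}(\omega)\bigr)
//! = \log\frac{1 - \exp(-e^{\omega})}{\exp(-e^{\omega})}
//! ```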
//!
//! The *variance function* $`V(\mu)`$ characterizes how the variance of $`y`$
//! depends on the mean, independent of the choice of link:
//!
//! ```math
//! \text{Var}[y] = \phi\, V(\mu)
//! ```
//!
//! ## Supported families
//!
//! | Family | Canonical link | $`V(\mu)`$ | $`\phi`$ |
//! |--------|---------------|-----------|---------|
//! | [`Linear`] (Gaussian) | Identity | $`1`$ | estimated |
//! | [`Logistic`] (Bernoulli) | Logit | $`\mu(1-\mu)`$ | $`1`$ |
//! | [`Poisson`] | Log | $`\mu`$ | $`1`$ |
//! | [`Binomial`] (fixed $`n`$) | Logit | $`\mu(1-\mu/n)`$ | $`1`$ |
//! | [`Exponential`] | $`-1/\mu`$ | $`\mu^2`$ | $`1`$ |
//! | [`Gamma`] | $`-1/\mu`$ | $`\mu^2`$ | estimated |
//! | [`InvGaussian`] (inverse Gaussian) | $`-1/(2\mu^2)`$ | $`\mu^3`$ | estimated |
//!
//! ## Fitting via IRLS
//!
//! Parameters are estimated by maximum likelihood using iteratively reweighted
//! least squares (IRLS). Each step solves for the update
//! $`\Delta\boldsymbol{\beta}`$:
//!
//! ```math
//! -\mathbf{H}(\boldsymbol{\beta}) \cdot \Delta\boldsymbol{\beta} = \mathbf{J}(\boldsymbol{\beta})
//! ```
//!
//! where $`\mathbf{J}`$ and $`\mathbf{H}`$ are the gradient and Hessian of the
//! log-likelihood. With the canonical link and a diagonal variance matrix
//! $`\mathbf{S}`$ ($`S_{ii} = \text{Var}[y^{(i)} | \eta]`$) this simplifies to:
//!
//! ```math
//! (\mathbf{X}^\top \mathbf{S} \mathbf{X})\, \Delta\boldsymbol{\beta}
//! = \mathbf{X}^\top \bigl[\mathbf{y} - g^{-1}(\mathbf{X}\boldsymbol{\beta})\bigr]
//! ```
//!
//! Observation weights $`\mathbf{W}`$ (inverse dispersion per observation)
//! generalize this to:
//!
//! ```math
//! (\mathbf{X}^\top \mathbf{W} \mathbf{S} \mathbf{X})\, \Delta\boldsymbol{\beta}
//! = \mathbf{X}^\top \mathbf{W} \bigl[\mathbf{y} - g^{-1}(\mathbf{X}\boldsymbol{\beta})\bigr]
//! ```
//!
//! The iteration converges when the log-likelihood is concave, which is
//! guaranteed with the canonical link.
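//!
//! For logistic regression with the canonical (logit) link, for instance,
//! $`g^{-1}`$ is the sigmoid $`\sigma(\omega) = 1/(1 + e^{-\omega})`$ and
//! $`S_{ii} = \mu_i(1 - \mu_i)`$ with $`\mu_i = \sigma(\mathbf{x}_i^\top \boldsymbol{\beta})`$,
//! so each step solves
//!
//! ```math
//! (\mathbf{X}^\top \mathbf{S} \mathbf{X})\, \Delta\boldsymbol{\beta}
//! = \mathbf{X}^\top \bigl[\mathbf{y} - \sigma(\mathbf{X}\boldsymbol{\beta})\bigr]
//! ```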
//!
//! ## Regularization
//!
//! Optional penalty terms discourage large parameter values:
//!
//! - **L2 (ridge):** adds $`\frac{\lambda_2}{2} \sum |\beta_k|^2`$ to the
//! negative log-likelihood, equivalent to a Gaussian prior on the
//! coefficients.
//! - **L1 (lasso):** adds $`\lambda_1 \sum |\beta_k|`$, implemented via ADMM,
//! which drives small coefficients to exactly zero.
//! - **Elastic net:** combines both L1 and L2 penalties.
//!
//! In all cases the intercept ($`\beta_0`$) is excluded from the penalty.
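//!
//! Putting these together, the elastic net fit minimizes the penalized objective
//!
//! ```math
//! -\ell(\boldsymbol{\beta}) + \lambda_1 \sum_{k \geq 1} |\beta_k|
//! + \frac{\lambda_2}{2} \sum_{k \geq 1} |\beta_k|^2
//! ```
//!
//! where $`\ell`$ is the log-likelihood and the sums start at $`k = 1`$ to exclude
//! the intercept. Setting $`\lambda_1 = 0`$ or $`\lambda_2 = 0`$ recovers pure ridge
//! or lasso regression, respectively.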
//!
//! For a more complete mathematical reference, see the
//! [derivation notes](https://felix-clark.github.io/src/tex/glm-math/main.pdf).
// Import some common names into the top-level namespace
// (module paths shown here are illustrative; adjust them to the crate's layout)
pub use crate::{
    model::ModelBuilder,
    response::{Linear, Logistic},
};
// re-export common structs from ndarray
pub use ndarray::{array, Array1, Array2};