# Note: Overdispersion and Numerical Stability in Poisson Models
This note covers two distinct issues related to Poisson models: statistical overdispersion and numerical instability in the IRLS algorithm.
---
## Part 1: Statistical Modeling of Overdispersed Count Data
### The Issue: The Poisson Assumption
A common issue in Bayesian models using a Poisson likelihood is the distribution's built-in constraint that the mean of the count data equals its variance (E[Y] = Var(Y) = λ) [^1]. Real-world datasets frequently violate this constraint: the observed variance is substantially larger than the mean, a phenomenon known as **overdispersion** [^2].
### Consequences of Ignoring Overdispersion
Failing to account for overdispersion leads to several problems:
- The model will be a poor fit for the data.
- The model will underestimate the true variability, resulting in posterior distributions for parameters that are artificially narrow.
- This can lead to erroneously high confidence in parameter estimates and an increased rate of false positives (Type I errors) when conducting hypothesis tests or evaluating variable significance.
### The Solution: The Negative Binomial Model
The standard solution is to replace the Poisson likelihood with a **Negative Binomial (NB) likelihood** [^5]. In a Bayesian context, this is often formulated as a **Gamma-Poisson mixture model**: the Poisson rate λ is itself given a Gamma distribution, and the marginal distribution of the counts is Negative Binomial [^7]. Because the NB variance exceeds its mean (Var(Y) = μ + μ²/α for shape parameter α), the model can absorb the extra variability, providing more reliable parameter estimates and more accurate credible intervals for overdispersed count data [^9][^10].
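As a quick sanity check, the mixture can be simulated directly. The sketch below is not from the cited sources; it assumes the `rand` (0.8) and `rand_distr` (0.4) crates, and the target mean `mu` and shape `alpha` are illustrative values. It draws Poisson counts whose rates come from a Gamma distribution and confirms that the empirical variance exceeds the mean, matching the Negative Binomial prediction.

```rust
// A minimal sketch, assuming the `rand` and `rand_distr` crates.
use rand::thread_rng;
use rand_distr::{Distribution, Gamma, Poisson};

fn main() {
    let mut rng = thread_rng();
    // Hypothetical target mean `mu` and Gamma shape `alpha`
    // (smaller alpha = stronger overdispersion).
    let (mu, alpha) = (5.0_f64, 2.0_f64);

    // Gamma-Poisson mixture: lambda_i ~ Gamma(shape = alpha, scale = mu / alpha),
    // so E[lambda] = mu; then y_i | lambda_i ~ Poisson(lambda_i).
    // Marginally, y_i follows a Negative Binomial distribution.
    let gamma = Gamma::new(alpha, mu / alpha).unwrap();
    let samples: Vec<f64> = (0..100_000)
        .map(|_| {
            let lambda = gamma.sample(&mut rng);
            Poisson::new(lambda).unwrap().sample(&mut rng)
        })
        .collect();

    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let var = samples.iter().map(|y| (y - mean).powi(2)).sum::<f64>() / (n - 1.0);

    // Law of total variance: Var(Y) = E[lambda] + Var(lambda) = mu + mu^2 / alpha > mu.
    println!("empirical mean = {mean:.3}, empirical variance = {var:.3}");
    println!("theoretical NB variance = {:.3}", mu + mu * mu / alpha);
}
```

With these parameters the empirical variance comes out near 17.5 against a mean near 5, exactly the gap a plain Poisson likelihood cannot represent.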
---
## Part 2: Numerical Instability in GLM Poisson IRLS
### The Issue: IRLS Convergence Failure
The Iteratively Reweighted Least Squares (IRLS) algorithm used to fit the Poisson Generalized Linear Model (GLM) can fail to converge. This is a numerical optimization issue, separate from the statistical issue of overdispersion.
The problem arises when the predicted mean, `mu`, approaches zero during an iteration. For the Poisson GLM with the canonical log link, `mu = exp(X * beta)`; although this quantity is mathematically always positive, `exp` underflows to zero in floating point when the linear predictor is strongly negative. The calculation of the working response and weights involves division by `mu` (since the variance of a Poisson distribution equals `mu`). If `mu` becomes zero, or numerically indistinguishable from zero, the division produces `NaN` or `Inf` values that propagate through subsequent iterations and cause the algorithm to fail.
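Concretely, each IRLS iteration for this model solves a weighted least-squares problem. In standard GLM notation, with working weights $W$ and working response $z$, the update is:

$$
\eta = X\beta, \qquad \mu = \exp(\eta), \qquad W = \operatorname{diag}(\mu), \qquad z = \eta + \frac{y - \mu}{\mu},
$$

$$
\beta^{(t+1)} = \left(X^\top W X\right)^{-1} X^\top W z.
$$

The division by $\mu$ in the working response $z$ is exactly where an underflowed `mu` turns into `NaN` or `Inf`.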
### Suggested Fix: Epsilon-Smoothing for Numerical Stability
A standard and effective technique to prevent this failure is to add a small constant, or "epsilon" (e.g., `1e-8`), to the predicted mean `mu` within the IRLS loop.
**Proposed Change:**
Modify the calculation of `mu` inside the IRLS algorithm to be:
`let mu = eta.exp() + EPSILON;`
This technique, known as epsilon-smoothing, guarantees that `mu` stays strictly positive, preventing the division-by-zero that produces `NaN` and `Inf` values. The `EPSILON` value should be small enough not to introduce measurable bias in the final coefficient estimates, but large enough to maintain numerical stability; values around `1e-8` are a reasonable default for `f64` arithmetic. This simple change can dramatically improve the robustness and convergence reliability of the IRLS algorithm for the Poisson family.
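To make the proposed change concrete, here is a minimal IRLS sketch with epsilon-smoothing in place. It assumes the `nalgebra` crate; `irls_poisson` and its parameters are illustrative names, not from any particular library.

```rust
// A minimal sketch, assuming the `nalgebra` crate; `irls_poisson` and its
// parameters are illustrative names, not from any particular library.
use nalgebra::{DMatrix, DVector};

const EPSILON: f64 = 1e-8;

/// Fits a Poisson GLM with the canonical log link via IRLS,
/// with epsilon-smoothing applied to `mu` for numerical stability.
fn irls_poisson(x: &DMatrix<f64>, y: &DVector<f64>, max_iter: usize, tol: f64) -> DVector<f64> {
    let mut beta = DVector::<f64>::zeros(x.ncols());
    for _ in 0..max_iter {
        let eta = x * &beta; // linear predictor
        // Epsilon-smoothing: `exp` can underflow to 0.0 for strongly negative
        // eta; adding EPSILON keeps mu strictly positive so 1/mu stays finite.
        let mu = eta.map(|e| e.exp() + EPSILON);
        // Working response z = eta + (y - mu) / mu; this is where an
        // unsmoothed mu of zero would produce NaN/Inf.
        let z = &eta + (y - &mu).component_div(&mu);
        // IRLS weights for the Poisson family with log link: W = diag(mu).
        let w = DMatrix::from_diagonal(&mu);
        // Solve the weighted least-squares normal equations:
        // (X^T W X) beta_new = X^T W z.
        let xtw = x.transpose() * &w;
        let beta_new = (&xtw * x)
            .lu()
            .solve(&(&xtw * z))
            .expect("X^T W X is singular");
        let step = (&beta_new - &beta).norm();
        beta = beta_new;
        if step < tol {
            break; // converged
        }
    }
    beta
}
```

A common variant is to clamp instead of add (`e.exp().max(EPSILON)`), which leaves `mu` exactly unchanged whenever it is already above the threshold; either form restores finite working responses and weights.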
---
## Peer-Reviewed Sources