gam 0.3.55

Generalized penalized likelihood engine

gam · gamfit

A formula-first generalized additive model engine. Written in Rust, with a Python library on top.

Fits Gaussian, binomial, Poisson, and Gamma GLMs with smooth terms, random effects, bounded/constrained coefficients, location-scale extensions, survival likelihoods, and flexible/learnable link functions. Smoothing parameters are selected by REML (restricted maximum likelihood) or LAML (Laplace approximate marginal likelihood). Posterior sampling uses NUTS where supported and falls back to a Gaussian Laplace approximation for model classes that do not have an exact NUTS path.

Docs: https://gamfit.readthedocs.io/ · PyPI: https://pypi.org/project/gamfit/

3D Matérn fit on a noisy 2-D landscape

Two ways to use it

# Python library (gamfit)
import gamfit
model = gamfit.fit(train, "y ~ s(x) + group(site)")
preds = model.predict(test, interval=0.95)

# Rust CLI (gam)
gam fit data.csv 'y ~ smooth(x) + group(site)' --out model.json
gam predict model.json new_data.csv --uncertainty
gam report model.json data.csv

Both share one engine, one formula DSL, and one on-disk format. Train in the CLI and score in Python, or vice versa.

Install

Python. Wheels for Linux (x86_64, aarch64), macOS (Intel + Apple silicon), and Windows. No Rust toolchain required.

uv add gamfit
# or
pip install gamfit

Optional extras: gamfit[pandas], gamfit[plot], gamfit[sklearn], gamfit[torch], gamfit[all].

Rust CLI. One-liner installer for macOS, Linux, and Windows Git Bash:

curl -fsSL https://raw.githubusercontent.com/SauersML/gam/main/install.sh | bash

Or build from source: cargo build --release — the binary lands at ./target/release/gam.

What makes it different

Features that other GAM libraries don't combine in one place.

Three-part penalty structure

Each smooth gets independent penalties on magnitude, gradient, and curvature. Most libraries use one (curvature only) or two. Keeping them separate penalizes a flat-but-offset function differently from a wiggly one.

gamfit.fit(df, "z ~ duchon(pc1, pc2, pc3, pc4, centers=50)")
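
To see why keeping the three penalties separate matters, here is a minimal finite-difference sketch in plain Python. It is not gamfit's implementation (the engine penalizes basis coefficients, and the helper name `penalties` is hypothetical); it only illustrates that a flat-but-offset function and a wiggly one are picked up by different terms.

```python
import math

def penalties(values, step):
    """Discrete magnitude, gradient, and curvature penalties of a sampled function.

    Illustrative only: sums of squared values, squared first differences,
    and squared second differences, each scaled by the grid step.
    """
    magnitude = sum(v * v for v in values) * step
    first = [(b - a) / step for a, b in zip(values, values[1:])]
    gradient = sum(d * d for d in first) * step
    second = [(c - 2 * b + a) / step ** 2
              for a, b, c in zip(values, values[1:], values[2:])]
    curvature = sum(d * d for d in second) * step
    return magnitude, gradient, curvature

n = 101
step = 1.0 / (n - 1)
grid = [i * step for i in range(n)]

flat_offset = [2.0 for _ in grid]                    # constant, far from zero
wiggly = [math.sin(8 * math.pi * x) for x in grid]   # oscillates around zero

flat_pen = penalties(flat_offset, step)
wiggly_pen = penalties(wiggly, step)

print("flat-but-offset:", flat_pen)
print("wiggly:         ", wiggly_pen)
```

The constant function is charged only by the magnitude term, while the oscillating one is charged mostly by the gradient and curvature terms. A curvature-only penalty would treat the constant function as free.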

Adaptive per-axis anisotropy

Surface smooths learn how much to shrink each axis independently, so predictors on different scales, such as (latitude, age, log_income), don't have to share a single length-scale.

gamfit.fit(df, "z ~ matern(pc1, pc2, pc3, pc4)", scale_dimensions=True)
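
One way to picture per-axis anisotropy is that each coordinate is divided by its own length-scale before the kernel sees the distance. The sketch below uses the standard Matérn correlation with nu = 5/2 and hypothetical length-scales; how gamfit parameterizes and learns these scales is not shown here.

```python
import math

def scaled_distance(a, b, length_scales):
    """Euclidean distance after dividing each axis by its own length-scale."""
    return math.sqrt(sum(((ai - bi) / ls) ** 2
                         for ai, bi, ls in zip(a, b, length_scales)))

def matern52(r):
    """Matern correlation with nu = 5/2 at scaled distance r."""
    s = math.sqrt(5.0) * r
    return (1.0 + s + s * s / 3.0) * math.exp(-s)

origin = (0.0, 0.0, 0.0)
step_axis0 = (1.0, 0.0, 0.0)   # one unit along the first axis
step_axis2 = (0.0, 0.0, 1.0)   # one unit along the third axis

# Hypothetical scales: the surface varies quickly along axis 0, slowly along axis 2.
length_scales = (0.5, 1.0, 10.0)

corr_axis0 = matern52(scaled_distance(origin, step_axis0, length_scales))
corr_axis2 = matern52(scaled_distance(origin, step_axis2, length_scales))

print(f"correlation after a unit step on axis 0: {corr_axis0:.4f}")
print(f"correlation after a unit step on axis 2: {corr_axis2:.4f}")
```

With a single shared length-scale the two unit steps would produce the same correlation. With per-axis scales, the same step decorrelates the surface much more along the fast axis.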

Surface smooths in arbitrary dimension

P-spline, thin-plate, Matérn, and Duchon radial bases. Duchon uses triple-operator regularization (mass + tension + stiffness) and is scale-free by default. Kernels and scaling regimes can be mixed.

gamfit.fit(df, "y ~ matern(x1, x2, x3, nu=5/2)")
gamfit.fit(df, "y ~ duchon(x1, x2, x3, x4, centers=80)")
gamfit.fit(df, "y ~ te(space, time, k=10)")   # tensor product

Geometric / manifold smooths

When the predictor space is a circle, cylinder, torus, sphere, or Möbius strip, ordinary smooths produce visible seams and pole artefacts. gamfit's geometric smooths bake the wrap topology into both the basis and the penalty — periodic=[axes], period=[...] margins for tensor products, an intrinsic sphere(...) kernel (Wahba's reproducing kernel or spherical harmonics), and boundary-conditioned 1-D B-splines.

rotating recovery of six manifolds (trefoil knot, latent-free loop, wobbly cylinder, lumpy sphere, bumpy torus, Möbius strip) from noisy 3-D point clouds

Each pair shows a noisy 3-D point cloud (left) and the smooth manifold the geometric smooths recover (right). The full gallery, formulas, and reproduction script live in docs/manifold-smooths.md.

gamfit.fit(df, "y ~ s(theta, periodic=true, period=2*pi)")             # circle
gamfit.fit(df, "y ~ te(theta, h, periodic=[0], period=[2*pi, None])")  # cylinder
gamfit.fit(df, "y ~ te(u, v, periodic=[0,1], period=[2*pi, 2*pi])")    # torus
gamfit.fit(df, "y ~ sphere(lat, lon, radians=true)")                   # sphere
gamfit.fit(df, "y ~ s(x, bc=clamped)")                                 # zero-slope endpoints
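
The idea of putting the wrap into both the basis and the penalty can be sketched in a few lines of plain Python. The example uses a truncated Fourier basis and a cyclic second-difference penalty as stand-ins; gamfit's actual periodic bases and penalties may be constructed differently, and every function name here is hypothetical.

```python
import math

def fourier_basis(theta, period, n_harmonics):
    """Evaluate a truncated Fourier basis, which is periodic by construction."""
    omega = 2.0 * math.pi / period
    row = [1.0]
    for k in range(1, n_harmonics + 1):
        row.append(math.cos(k * omega * theta))
        row.append(math.sin(k * omega * theta))
    return row

def open_curvature_penalty(coefs):
    """Squared second differences on a line: the two ends never meet."""
    return sum((c - 2 * b + a) ** 2
               for a, b, c in zip(coefs, coefs[1:], coefs[2:]))

def cyclic_curvature_penalty(coefs):
    """Squared second differences on a circle: indices wrap around."""
    n = len(coefs)
    return sum((coefs[(i + 1) % n] - 2 * coefs[i] + coefs[i - 1]) ** 2
               for i in range(n))

period = 2.0 * math.pi
row_start = fourier_basis(0.0, period, n_harmonics=3)
row_end = fourier_basis(period, period, n_harmonics=3)

# A linear ramp is perfectly smooth on a line but has a jump at the seam of a circle.
ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
open_pen = open_curvature_penalty(ramp)
cyclic_pen = cyclic_curvature_penalty(ramp)

print("open penalty:  ", open_pen)
print("cyclic penalty:", cyclic_pen)
```

The basis rows at 0 and at one full period agree, so the fitted function cannot have a seam. The cyclic penalty additionally charges the ramp for the jump between its last and first coefficients, which the open penalty cannot see.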

Flexible / learnable link functions

A spline offset on top of a base link lets the data correct for link misspecification. blended(logit, probit) learns a mixture; sas and beta-logistic learn shape parameters from the data.

gamfit.fit(df, "case ~ s(age) + link(type=flexible(probit))"
                 " + linkwiggle(internal_knots=6)")
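
As a rough illustration of what a learned mixture of links could look like, the sketch below blends the inverse logit and inverse probit on the mean scale with a single weight. Whether gamfit's blended(logit, probit) mixes on this scale, and how the weight is parameterized, are assumptions; the function names are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse logit link (logistic CDF)."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link (standard normal CDF)."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def blended_inverse_link(eta, weight):
    """Convex mixture of the two inverse links; weight=1 is pure logit."""
    return weight * inv_logit(eta) + (1.0 - weight) * inv_probit(eta)

eta = 2.0
pure_logit = blended_inverse_link(eta, 1.0)
pure_probit = blended_inverse_link(eta, 0.0)
half_blend = blended_inverse_link(eta, 0.5)

print(f"logit:  {pure_logit:.4f}")
print(f"probit: {pure_probit:.4f}")
print(f"blend:  {half_blend:.4f}")
```

Because both component links pass through 0.5 at eta = 0 and are monotone, any convex blend is also a valid inverse link. The weight only changes how heavy the tails are.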

Marginal-slope models

For binary or survival outcomes with a calibrated risk score (e.g. a polygenic score), put baseline risk and score effect in separate formulas. The slope on the score is a smooth function of covariate space, so the baseline can't absorb signal that belongs to the score effect.

gamfit.fit(
    df,
    "case ~ matern(pc1, pc2, pc3)",
    family="bernoulli-marginal-slope",
    link="probit",
    z_column="pgs_z",
    logslope_formula="matern(pc1, pc2, pc3)",
)

two-surface marginal-slope viz over a joint Duchon smooth

Two predicted-probability surfaces over the same (pc1, pc2) plane — one at z = 0, one at z = +2. The vertical gap is the risk difference for that score contrast; the modeled score effect is the smooth probit-scale slope.
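
The risk difference for a score contrast can be sketched with a simplified probit model in which the linear predictor is a baseline plus a positive slope times the score. This form, with the slope written as the exponential of a log-slope, is an assumption made for illustration; gamfit's exact marginal-slope parameterization is not reproduced here, and the numbers are made up.

```python
import math

def probit_inverse(eta):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def risk(baseline_eta, log_slope, z):
    """Assumed form: probit-scale baseline plus exp(log_slope) times the score z."""
    return probit_inverse(baseline_eta + math.exp(log_slope) * z)

# Hypothetical values of the two smooths at two locations in the (pc1, pc2) plane.
locations = {
    "A": {"baseline_eta": -1.5, "log_slope": math.log(0.3)},
    "B": {"baseline_eta": -1.5, "log_slope": math.log(0.6)},
}

# Vertical gap between the z = +2 surface and the z = 0 surface at each location.
risk_difference = {
    name: risk(v["baseline_eta"], v["log_slope"], 2.0)
          - risk(v["baseline_eta"], v["log_slope"], 0.0)
    for name, v in locations.items()
}

for name, gap in risk_difference.items():
    print(f"location {name}: risk difference = {gap:.3f}")
```

Both locations share the same baseline, so the larger gap at B comes entirely from the larger slope there.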

Survival with on-demand surfaces

Combines Surv(entry, exit, event) with four likelihood modes (transformation, Weibull, location-scale, marginal-slope) and a SurvivalPrediction object that evaluates S(t), h(t), and H(t) on any time grid:

pred = model.predict(test_df)
S = pred.survival_at([1, 5, 10, 20])     # (n_rows, 4)
H = pred.cumulative_hazard_at([10])      # (n_rows, 1)

For population-scale cohorts, stream to CSV without materialising the full matrix: pred.write_survival_at_csv("surv.csv", times=[...]).
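
The relationship between these quantities, and the idea of streaming rows rather than building the whole matrix, can be sketched with a plain Weibull model. This is not the SurvivalPrediction implementation: the per-subject parameters are hypothetical and the helper names are invented for the example. It relies only on the standard identity S(t) = exp(-H(t)).

```python
import csv
import io
import math

def cumulative_hazard(t, shape, scale):
    """Weibull cumulative hazard H(t) = (t / scale) ** shape."""
    return (t / scale) ** shape

def survival_row(row, times):
    """S(t) = exp(-H(t)) for one subject on the requested time grid."""
    return [math.exp(-cumulative_hazard(t, row["shape"], row["scale"]))
            for t in times]

def write_survival_csv(rows, times, handle):
    """Write one CSV line per subject so the full matrix is never held in memory."""
    writer = csv.writer(handle)
    writer.writerow([f"S_{t}" for t in times])
    for row in rows:
        writer.writerow(survival_row(row, times))

# Hypothetical per-subject Weibull parameters.
rows = [{"shape": 1.0, "scale": 10.0}, {"shape": 1.5, "scale": 20.0}]
times = [1, 5, 10, 20]

# In-memory version: 2 rows x 4 times.
S = [survival_row(row, times) for row in rows]

# Streaming version: rows are consumed one at a time from an iterator.
buffer = io.StringIO()
write_survival_csv(iter(rows), times, buffer)
print(buffer.getvalue())
```

The streaming writer needs memory for one row at a time, which is the property that matters when the number of rows is very large.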

NUTS posteriors

model.sample(...) runs the No-U-Turn Sampler over the coefficient posterior conditional on the fitted smoothing parameters. Predictive bands stream in row chunks, so memory stays bounded on large test sets.

posterior = model.sample(train, seed=42)
bands = posterior.predict(test, level=0.95)
# eta_mean, eta_lower, eta_upper, mean, mean_lower, mean_upper
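
Chunked predictive bands can be sketched as follows: for each block of test rows, compute the linear predictor under every posterior draw, summarize, and discard the block before moving on. The draws here are simulated stand-ins for sampler output, and the function names are hypothetical rather than part of gamfit.

```python
import random

def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def predictive_bands(design_rows, draws, level=0.95, chunk_size=2):
    """Yield per-row summaries of the linear predictor, one chunk at a time."""
    tail = (1.0 - level) / 2.0
    for start in range(0, len(design_rows), chunk_size):
        chunk = design_rows[start:start + chunk_size]
        # Only chunk_size x n_draws linear predictors exist at any moment.
        chunk_etas = [[sum(x * b for x, b in zip(row, beta)) for beta in draws]
                      for row in chunk]
        for etas in chunk_etas:
            etas.sort()
            yield {
                "eta_mean": sum(etas) / len(etas),
                "eta_lower": quantile(etas, tail),
                "eta_upper": quantile(etas, 1.0 - tail),
            }

random.seed(42)
# Stand-in for sampler output: 500 draws of (intercept, slope).
draws = [[random.gauss(1.0, 0.1), random.gauss(2.0, 0.2)] for _ in range(500)]
design_rows = [[1.0, x / 4.0] for x in range(5)]

bands = list(predictive_bands(design_rows, draws))
for band in bands:
    print({key: round(value, 3) for key, value in band.items()})
```

Peak memory scales with chunk_size times the number of draws, not with the number of test rows, which is what keeps large test sets tractable.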

Bounded coefficients with informative priors

Hard interval transforms with optional Beta priors, for proportions, mixing weights, or any coefficient that must live in [a, b]:

gamfit.fit(df,
    "y ~ age + bounded(prop, min=0, max=1, target=0.5, strength=3)")
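
A hard interval transform with a Beta prior can be sketched like this. The logistic map and the reading of target as the prior mean and strength as the prior sample size are assumptions made for the example; gamfit's actual transform and prior parameterization may differ, and both function names are hypothetical.

```python
import math

def to_interval(theta, lower, upper):
    """Map an unconstrained value to (lower, upper) with a logistic transform."""
    return lower + (upper - lower) / (1.0 + math.exp(-theta))

def beta_log_prior(value, lower, upper, target, strength):
    """Log-density, up to a constant, of a Beta prior on the rescaled value.

    Assumed parameterization: a = target * strength and
    b = (1 - target) * strength, so the prior mean is `target`.
    """
    u = (value - lower) / (upper - lower)
    a = target * strength
    b = (1.0 - target) * strength
    return (a - 1.0) * math.log(u) + (b - 1.0) * math.log(1.0 - u)

# The optimizer works on theta; the model always sees a value inside [0, 1].
thetas = [-10.0, -1.0, 0.0, 1.0, 10.0]
coefs = [to_interval(t, 0.0, 1.0) for t in thetas]

prior_at_center = beta_log_prior(0.5, 0.0, 1.0, target=0.5, strength=3)
prior_near_edge = beta_log_prior(0.9, 0.0, 1.0, target=0.5, strength=3)

print("coefficients:", [round(c, 4) for c in coefs])
print("log prior at 0.5:", round(prior_at_center, 4))
print("log prior at 0.9:", round(prior_near_edge, 4))
```

The transform guarantees the bound for any unconstrained value, while the prior gently pulls the coefficient toward the target without forbidding values near the edges.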

scikit-learn drop-in

from gamfit.sklearn import GAMRegressor
est = GAMRegressor(formula="y ~ s(x)").fit(X, y)

Where to learn more

  • Python documentation: https://gamfit.readthedocs.io/ — getting started, the formula DSL, families and links, survival, marginal-slope, posterior sampling, scikit-learn integration, a runnable cookbook, and an auto-generated API reference.
  • CLI help: gam <command> --help (commands: fit, predict, report, diagnose, sample, generate).
  • Cookbook: docs/cookbook.md.
  • Manifold smooths gallery: docs/manifold-smooths.md — visual tour of the periodic / sphere / cylinder / torus / Möbius / BC smooths, recovered from noisy 3-D point clouds.

Repository layout

Path               Contents
src/               Rust engine: fitting, inference, smooth construction, survival, CLI.
crates/gam-pyffi/  PyO3 bindings (the gamfit._rust native extension).
gamfit/            Pure-Python public API on top of the bindings.
docs/              MkDocs/Material documentation sources (built to RTD).
tests/             Rust + Python integration tests.
bench/             Benchmark harness, scenario configs, datasets, plots.
scripts/           Runnable demo / diagnostic / utility scripts (the manifold smooths gallery lives here).

Development

# Rust
cargo fmt --all
cargo clippy --all-targets --all-features -- -A warnings -D clippy::correctness -D clippy::suspicious
cargo test --all-features

# Python docs (uses uv)
uv venv --python 3.12 .venv-docs
uv pip install --python .venv-docs/bin/python -r docs/requirements.txt
.venv-docs/bin/mkdocs serve

Benchmark suite: python3 bench/run_suite.py --help.

Issues, feedback, contributions

Open a GitHub issue for bug reports, feature requests, or questions — including "this doesn't work the way I expect."

License

AGPL-3.0-or-later. See LICENSE.