Crate gam

gam is a formula-first generalized additive model engine.

Models are specified with a Wilkinson-style formula DSL and fit by REML / LAML over Gaussian, binomial, Poisson, and Gamma GLMs, plus location-scale, survival, marginal-slope, and response-geometry extensions. Smoothing parameters are selected automatically. Posterior sampling uses NUTS where supported, with a Gaussian Laplace approximation as the fallback elsewhere.
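
As a rough illustration of what "Wilkinson-style" means, the sketch below splits a formula string into a response and its additive terms. It is a toy, not gam's parser: `split_formula` is a hypothetical name, and the real DSL's grammar (smooth terms, interactions, nested parentheses) is not covered here.

```rust
/// Toy illustration only: split `response ~ term + term` into its parts.
/// `split_formula` is a hypothetical helper, not part of gam's API.
/// It does not handle parentheses, interactions, or operators other than `+`.
fn split_formula(formula: &str) -> Option<(String, Vec<String>)> {
    // The response sits left of the first `~`, the terms sit right of it.
    let (lhs, rhs) = formula.split_once('~')?;
    let response = lhs.trim();
    if response.is_empty() {
        return None;
    }
    // Additive terms are separated by `+`; blank pieces are dropped.
    let terms: Vec<String> = rhs
        .split('+')
        .map(|t| t.trim().to_string())
        .filter(|t| !t.is_empty())
        .collect();
    if terms.is_empty() {
        return None;
    }
    Some((response.to_string(), terms))
}

fn main() {
    let (response, terms) = split_formula("y ~ x1 + x2").expect("valid formula");
    println!("response = {response}, terms = {terms:?}");
}
```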

§Two interfaces

  • Rust CLI (gam) — fit, predict, report, diagnose, sample, generate. Built from src/main.rs.
  • Python library (gamfit) — PyO3 bindings on top of this crate. See https://gamfit.readthedocs.io/.

§Smooth zoo

Univariate P-splines, multivariate thin-plate, Matérn, and Duchon radial bases, tensor products, and a family of geometric smooths for predictor spaces that are not flat ℝᵈ:

  • 1-D cyclic / periodic B-splines and periodic Duchon
  • Tensor products with one or more periodic margins (cylinder, torus, Möbius)
  • Intrinsic S² smooths (Wahba reproducing kernel + spherical harmonics)
  • Boundary-conditioned (clamped / anchored) 1-D B-splines
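
To make the periodic case concrete, the sketch below builds a cyclic second-difference penalty, one common construction for periodic P-splines in which coefficient differences wrap around the end of the basis. This page does not state how gam builds its periodic penalties, so the approach is an assumption, and `cyclic_second_difference` and `penalty` are hypothetical names.

```rust
/// Build the k x k cyclic second-difference matrix D. Row i encodes
/// beta[i] - 2*beta[i+1] + beta[i+2], with indices taken modulo k so the
/// last coefficients are tied back to the first ones.
/// Hypothetical helper for illustration; not gam's API.
fn cyclic_second_difference(k: usize) -> Vec<Vec<f64>> {
    assert!(k >= 3, "need at least three coefficients");
    let mut d = vec![vec![0.0; k]; k];
    for i in 0..k {
        d[i][i] += 1.0;
        d[i][(i + 1) % k] += -2.0;
        d[i][(i + 2) % k] += 1.0;
    }
    d
}

/// Penalty value ||D beta||^2, which equals beta' (D'D) beta.
fn penalty(d: &[Vec<f64>], beta: &[f64]) -> f64 {
    d.iter()
        .map(|row| {
            let r: f64 = row.iter().zip(beta).map(|(a, b)| a * b).sum();
            r * r
        })
        .sum()
}

fn main() {
    let d = cyclic_second_difference(6);
    // A constant coefficient vector is unpenalized.
    let flat = vec![2.5; 6];
    // An alternating vector is penalized in every row, including the wrapped ones.
    let wiggly = vec![1.0, -1.0, 1.0, -1.0, 1.0, -1.0];
    println!("flat penalty   = {}", penalty(&d, &flat));
    println!("wiggly penalty = {}", penalty(&d, &wiggly));
}
```

Because every row of D sums to zero, constant functions on the circle cost nothing, which is the same null space a periodic smooth is usually expected to leave unpenalized.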

scripts/geometric_shapes_demo.py showcases six topologies (trefoil knot, latent-free loop, wobbly cylinder, lumpy sphere, bumpy torus, Möbius strip) recovered from noisy 3-D point clouds, including a self-validating quality report against analytic truth.

§Crate layout

  • families — likelihoods + their analytic gradients / Hessians
  • solver — PIRLS, REML/LAML, and the joint blockwise optimiser
  • terms — formula terms, basis construction, smooth specs
  • inference — prediction, posterior sampling, diagnostics
  • linalg — faer ↔ ndarray bridges + numerics helpers
  • gpu — runtime CUDA dispatch for hot linear algebra paths

Re-exports§

pub use data::encode_records_with_inferred_schema;
pub use data::load_csv_with_inferred_schema;
pub use data::load_csv_with_schema;
pub use geometry::CircleManifold;
pub use geometry::EuclideanManifold;
pub use geometry::GeodesicIntegrator;
pub use geometry::GeometryError;
pub use geometry::GeometryResult;
pub use geometry::GrassmannManifold;
pub use geometry::ManifoldSpec;
pub use geometry::ProductManifold;
pub use geometry::RiemannianLBFGS;
pub use geometry::RiemannianManifold;
pub use geometry::RiemannianObjective;
pub use geometry::RiemannianTrustRegion;
pub use geometry::SpdManifold;
pub use geometry::SphereManifold;
pub use geometry::StiefelManifold;
pub use geometry::TorusManifold;
pub use gpu::GpuPolicy;
pub use inference::alo;
pub use inference::conformal;
pub use inference::data;
pub use inference::generative;
pub use inference::hmc;
pub use inference::polya_gamma;
pub use inference::predict;
pub use inference::probability;
pub use inference::psis;
pub use inference::quadrature;
pub use inference::sample;
pub use inference::smooth_test;
pub use linalg::faer_ndarray;
pub use linalg::matrix;
pub use linalg::utils;
pub use resource::ByteLruCache;
pub use resource::DerivativeStorageMode;
pub use resource::MaterializationPolicy;
pub use resource::MatrixMaterializationError;
pub use resource::ProblemHints;
pub use resource::ResidentBytes;
pub use resource::ResourcePolicy;
pub use solver::estimate;
pub use solver::gaussian_reml;
pub use solver::pirls;
pub use solver::seeding;
pub use solver::topology_selector;
pub use solver::visualizer;
pub use terms::basis;
pub use terms::construction;
pub use terms::hull;
pub use terms::layout;
pub use terms::smooth;
pub use terms::term_builder;
pub use families::bernoulli_marginal_slope;
pub use families::custom_family;
pub use families::gamlss;
pub use families::survival;
pub use families::survival_construction;
pub use families::survival_location_scale;
pub use families::survival_marginal_slope;
pub use families::survival_predict;
pub use families::transformation_normal;
pub use gpu::GpuDeviceInfo;
pub use solver::protocol::LatentScoreSemantics;
pub use solver::protocol::MarginalSlopeCalibrationProtocol;
pub use solver::protocol::SurvivalMarginalSlopeProtocol;

Modules§

cache
On-disk warm-start cache.
diagnostics
Identifiability-theorem diagnostics — pure numeric kernels.
families
geometry
gpu
GPU acceleration hardware-abstraction layer.
heartbeat
identifiability
Identifiability primitives — frontier 2026 unification.
inference
kernels
Reusable numerical kernels exposed across the gam engine and its Python bindings.
linalg
report
resource
solver
terms
test_support
Generic testing utilities.
types
util

Macros§

analytic_penalty_registry
assert_central_difference_array
Assert that a central difference of an array-producing function matches the analytical derivative.
bail_dim_basis
bail_dim_custom
bail_dim_gamlss
bail_dim_sls
bail_invalid_basis
bail_invalid_estim
bail_invalid_gamlss
bail_invalid_surv
bail_invalid_tnorm
gpu_bail
return Err(GpuError::DriverCallFailed { reason: format!(...) }).
gpu_err
Build a GpuError::DriverCallFailed { reason: format!(...) } value.
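
The one-line descriptions of gpu_err and gpu_bail above suggest a pair of macros along the following lines. This is a minimal sketch under assumptions: the `GpuError` enum here is a stand-in with a single variant, the macro bodies are reconstructed from the descriptions rather than copied from the crate, and `check_status` is a hypothetical caller.

```rust
/// Stand-in error type. The real `GpuError` lives in gam's `gpu` module and
/// may carry more variants; only the one named in the docs is modeled here.
#[derive(Debug, PartialEq)]
enum GpuError {
    DriverCallFailed { reason: String },
}

/// Build a `GpuError::DriverCallFailed` value from `format!`-style arguments.
macro_rules! gpu_err {
    ($($arg:tt)*) => {
        GpuError::DriverCallFailed { reason: format!($($arg)*) }
    };
}

/// Return early from the enclosing function with that error.
macro_rules! gpu_bail {
    ($($arg:tt)*) => {
        return Err(gpu_err!($($arg)*))
    };
}

/// Hypothetical caller: treat any nonzero status code as a driver failure.
fn check_status(code: i32) -> Result<(), GpuError> {
    if code != 0 {
        gpu_bail!("driver returned status {}", code);
    }
    Ok(())
}

fn main() {
    println!("{:?}", check_status(0));
    println!("{:?}", check_status(3));
    // gpu_err! alone yields a value, useful with `map_err` or `ok_or_else`.
    let e = gpu_err!("no device at index {}", 1);
    println!("{e:?}");
}
```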

Structs§

BernoulliMarginalSlopeFitRequest
BinomialLocationScaleFitRequest
CrossFitScoreCalibration
Out-of-fold Stage-1 latent score and its score-influence Jacobian for a CTN → marginal-slope chain. z_oof (length n) replaces the in-sample z the Stage-2 model consumes; jac_oof (n × p₁) is fed to the Stage-2 spec’s score_influence_jacobian so the joint solve absorbs the realized leakage directions Z_infl = diag(s_f·β̂₀)·J.
CtnStage1Recipe
Internal recipe describing the CTN Stage-1 fit that produced a Stage-2 z column. This is in-process plumbing — never a CLI flag, env var, or feature gate. The orchestration layer populates FitConfig::ctn_stage1 when (and only when) the marginal-slope z was generated by a transformation-normal Stage-1 fit; its presence is the sole auto-enable signal for cross-fitted orthogonalization (design §5). When absent, Stage-2 falls back to the free 1-D score_warp spline (which spans only the x-free leakage column).
FitConfig
Non-formula configuration for model fitting. All fields have sensible defaults.
GaussianLocationScaleFitRequest
LatentBinaryFitRequest
LatentSurvivalFitRequest
LinkWiggleConfig
MaterializedModel
The result of materializing a formula + config against a dataset.
PreparedSurvivalTimeStack
Canonical baseline-time stack shared by the workflow materializer and the CLI survival path (crate::bin::main-side run_survival). Both entry points build the survival time block identically — baseline offsets, derivative guard, optional baseline time-wiggle augmentation — so the assembly lives here once and the CLI consumes it through a thin re-export rather than reconstructing the same decision tree.
StandardBinomialWiggleConfig
Configuration for the second-stage binomial-mean wiggle fit appended to a standard pilot. The blockwise refit options live inside this struct so the pilot config (link_kind + wiggle) and its required refit_options can never disagree: either the whole standard-wiggle request is Some, or it is None. The previous shape had two sibling Option fields on StandardFitRequest, which allowed the materialize path to construct an inconsistent state (#320: linkwiggle config without blockwise options).
StandardFitRequest
StandardFitResult
SurvivalLocationScaleFitRequest
SurvivalLocationScaleFitResult
SurvivalMarginalSlopeFitRequest
SurvivalTransformationFitRequest
SurvivalTransformationFitResult
SurvivalTransformationTermSpec
TransformationNormalFitRequest
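
The StandardBinomialWiggleConfig entry above describes replacing two sibling Option fields with a single Option around a bundled struct. The sketch below shows why that removes the inconsistent state. All types and field sets are simplified stand-ins (`WiggleSpec`, `RefitOptions`, `LooseRequest`, `TightRequest`, and their fields are hypothetical), not gam's definitions.

```rust
// Simplified stand-ins for illustration; not gam's actual types.
#[derive(Debug, Clone, PartialEq)]
struct WiggleSpec {
    num_knots: usize,
}

#[derive(Debug, Clone, PartialEq)]
struct RefitOptions {
    max_iter: usize,
}

/// Old shape: two sibling options. `(Some, None)` compiles but is invalid,
/// so every consumer has to re-check the pairing at runtime.
struct LooseRequest {
    wiggle: Option<WiggleSpec>,
    refit_options: Option<RefitOptions>,
}

fn loose_is_consistent(r: &LooseRequest) -> bool {
    r.wiggle.is_some() == r.refit_options.is_some()
}

/// New shape: the refit options live inside the wiggle config, so the
/// request either has both or has neither.
struct WiggleConfig {
    wiggle: WiggleSpec,
    refit_options: RefitOptions,
}

struct TightRequest {
    wiggle: Option<WiggleConfig>,
}

/// With the bundled shape, reading the refit options needs no pairing check.
fn refit_iterations(r: &TightRequest) -> Option<usize> {
    r.wiggle.as_ref().map(|c| c.refit_options.max_iter)
}

fn main() {
    let loose = LooseRequest {
        wiggle: Some(WiggleSpec { num_knots: 10 }),
        refit_options: None,
    };
    println!("loose request consistent: {}", loose_is_consistent(&loose));

    let tight = TightRequest {
        wiggle: Some(WiggleConfig {
            wiggle: WiggleSpec { num_knots: 10 },
            refit_options: RefitOptions { max_iter: 50 },
        }),
    };
    println!("tight request refit iterations: {:?}", refit_iterations(&tight));
}
```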

Enums§

FitRequest
FitResult
WorkflowError
Typed error category for the solver::workflow materialization and fitting pipeline.

Functions§

fit_from_formula
Parse, materialize, and fit a model in one call.
fit_model
init_parallelism
Initialize faer’s global parallelism backend to a Rayon pool sized at rayon::current_num_threads(). Rayon’s pool itself honors the standard RAYON_NUM_THREADS environment variable on first use, so callers that need to constrain the worker count (e.g. the benchmark harnesses) set it once on the spawned subprocess and rayon picks it up natively.
is_binary_response
Detect whether a response column is binary (0/1 only).
materialize
Parse a formula, resolve it against a dataset, and produce a ready-to-fit FitRequest.
prepare_survival_time_stack
resolve_family
Resolve a family from an optional name, optional link choice, and response data.
resolve_offset_column
resolve_weight_column
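
The "0/1 only" rule described for is_binary_response above can be pictured with the check below. It is a sketch, not the crate's implementation: `looks_binary` is a hypothetical name, the real function's signature is not shown on this page, and treating an empty column as non-binary is an assumption.

```rust
/// Returns true when every value is exactly 0.0 or 1.0.
/// Hypothetical stand-in for illustration; gam's `is_binary_response` may
/// take a different input type and handle edge cases differently.
/// Assumption: an empty column is reported as not binary.
fn looks_binary(values: &[f64]) -> bool {
    // NaN compares unequal to both 0.0 and 1.0, so it fails the check.
    !values.is_empty() && values.iter().all(|&v| v == 0.0 || v == 1.0)
}

fn main() {
    println!("{}", looks_binary(&[0.0, 1.0, 1.0, 0.0]));
    println!("{}", looks_binary(&[0.0, 0.5, 1.0]));
    println!("{}", looks_binary(&[0.0, f64::NAN]));
}
```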