semicontinuous_models 0.1.1

Reusable models for semi-continuous outcomes (two-part, Tweedie, and log-normal)

semicontinuous_models

semicontinuous_models provides reusable model implementations for semi-continuous outcomes:

  • Two-part model (logit for the probability of a positive outcome + gamma GLM with log link for the positive values)
  • Tweedie GLM (quasi-likelihood)
  • Log-normal regression with smearing retransformation
  • Selection and comparison utilities (Park test, AIC/BIC, cross-validation)
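As a rough illustration of what the first and third models compute at prediction time, here is a minimal standalone sketch in plain Rust. It does not use this crate's API: the function names (`sigmoid`, `linear_predictor`, `two_part_mean`, `smearing_factor`) are hypothetical, and it assumes the textbook forms, i.e. a two-part mean of P(y > 0 | x) · E[y | y > 0, x] and a Duan-style smearing factor (the mean of the exponentiated log-scale residuals). The crate's internals may differ.

```rust
/// Logistic function: maps a linear predictor to a probability in (0, 1).
fn sigmoid(eta: f64) -> f64 {
    1.0 / (1.0 + (-eta).exp())
}

/// Dot product of one design-matrix row with a coefficient vector.
fn linear_predictor(row: &[f64], beta: &[f64]) -> f64 {
    row.iter().zip(beta).map(|(x, b)| x * b).sum()
}

/// Two-part mean (assumed textbook form):
/// P(y > 0 | x) from the logit part times E[y | y > 0, x] from the
/// gamma part, whose log link is inverted with exp().
fn two_part_mean(row: &[f64], beta_logit: &[f64], beta_gamma: &[f64]) -> f64 {
    let p_positive = sigmoid(linear_predictor(row, beta_logit));
    let mean_if_positive = linear_predictor(row, beta_gamma).exp();
    p_positive * mean_if_positive
}

/// Duan-style smearing factor (an assumption about what "smearing" means
/// here): the average of exp(residual) over the log-scale residuals.
/// A log-normal prediction is then exp(x'beta) * smearing_factor.
/// Returns NaN for an empty slice.
fn smearing_factor(log_residuals: &[f64]) -> f64 {
    let total: f64 = log_residuals.iter().map(|r| r.exp()).sum();
    total / log_residuals.len() as f64
}
```

With all coefficients at zero, `two_part_mean` returns 0.5: a 50% chance of a positive outcome times a conditional mean of exp(0) = 1.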

The crate was originally developed for a healthcare outcome study, but the API is domain-agnostic and can be reused for any non-negative semi-continuous response.

Installation

[dependencies]
semicontinuous_models = "0.1"

Quick start

use semicontinuous_models::{FitOptions, ModelInput, fit_two_part_input};
use faer::Mat;

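// Converts an index to f64 without a lossy `as` cast; saturates at u32::MAX.
fn idx_to_f64(idx: usize) -> f64 {
    f64::from(u32::try_from(idx).unwrap_or(u32::MAX))
}

let n = 40;
// Column 0 is the explicit intercept; column 1 is a single covariate (i / 10).
let design_matrix = Mat::from_fn(n, 2, |i, j| if j == 0 { 1.0 } else { idx_to_f64(i) / 10.0 });
// Every fourth observation is zero; the rest are strictly positive.
let outcome = Mat::from_fn(n, 1, |i, _| if i % 4 == 0 { 0.0 } else { 1.0 + 0.1 * idx_to_f64(i) });
let input = ModelInput::new(design_matrix, outcome);

// Fit the two-part model, then predict on the training design matrix.
let (model, report) = fit_two_part_input(&input, FitOptions::default()).expect("fit");
let prediction = model.predict(&input.design_matrix);

// One expected outcome per row, and the logit part ran at least one iteration.
assert_eq!(prediction.expected_outcome.nrows(), n);
assert!(report.iterations_logit > 0);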

For more numerically stable defaults on observational data:

use semicontinuous_models::{FitOptions, LogNormalOptions};

let two_part_options = FitOptions::stable_defaults();
let lognormal_options = LogNormalOptions::stable_defaults();

Public entry points

  • fit_two_part_input
  • fit_two_part_weighted_input
  • fit_two_part_clustered_input
  • fit_tweedie_input
  • fit_lognormal_smearing_input
  • select_models_input
  • cross_validate_models_input
  • compare_models_input

Preprocessing helpers:

  • drop_constant_columns
  • column_has_variation
  • outcome_diagnostics
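To show why a variation check matters (a constant covariate is collinear with an explicit intercept and makes the fit ill-conditioned), here is a small standalone sketch in plain Rust. It is not the crate's implementation: `has_variation`, `columns_to_keep`, the column-major `Vec<Vec<f64>>` layout, the tolerance, and the option to protect an intercept column are all assumptions made for illustration. Check the crate docs for the real signatures of `column_has_variation` and `drop_constant_columns`.

```rust
/// Returns true when the column's range (max - min) exceeds `tol`.
/// An empty column is treated as having no variation.
fn has_variation(column: &[f64], tol: f64) -> bool {
    let min = column.iter().copied().fold(f64::INFINITY, f64::min);
    let max = column.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    max - min > tol
}

/// Indices of the columns to keep: every column that varies, plus an
/// optional protected column (for example an explicit intercept).
/// Whether the crate protects the intercept is not stated in this README.
fn columns_to_keep(columns: &[Vec<f64>], tol: f64, protected: Option<usize>) -> Vec<usize> {
    columns
        .iter()
        .enumerate()
        .filter(|(j, col)| Some(*j) == protected || has_variation(col, tol))
        .map(|(j, _)| j)
        .collect()
}
```

For columns `[1, 1, 1]`, `[0.0, 0.1, 0.2]`, and `[5, 5, 5]`, protecting index 0 keeps columns 0 and 1; with no protected column only column 1 survives.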

Data contract

  • Provide an intercept column explicitly if you want one.
  • Outcomes are expected to be non-negative and shaped as n x 1.
  • Weights (if provided) must be strictly positive and shaped as n x 1.
  • Cluster labels (if provided) must have length n.
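The rules above can be checked before calling any fit function. The following is a minimal standalone sketch in plain Rust, not part of the crate: `check_data_contract` is a hypothetical helper, it works on slices rather than `faer::Mat`, the `u64` cluster-label type is an assumption, and the finiteness checks go slightly beyond what the contract states.

```rust
/// Hypothetical pre-flight check mirroring the data contract.
/// `n_rows` is the number of rows in the design matrix.
fn check_data_contract(
    n_rows: usize,
    outcome: &[f64],
    weights: Option<&[f64]>,
    cluster_labels: Option<&[u64]>,
) -> Result<(), String> {
    // Outcome: one value per row, non-negative (and, by assumption, finite).
    if outcome.len() != n_rows {
        return Err(format!("outcome has {} rows, expected {}", outcome.len(), n_rows));
    }
    if let Some(i) = outcome.iter().position(|y| !y.is_finite() || *y < 0.0) {
        return Err(format!("outcome[{}] is negative or non-finite", i));
    }
    // Weights: optional, one per row, strictly positive.
    if let Some(w) = weights {
        if w.len() != n_rows {
            return Err(format!("weights has {} rows, expected {}", w.len(), n_rows));
        }
        if let Some(i) = w.iter().position(|v| !v.is_finite() || *v <= 0.0) {
            return Err(format!("weights[{}] is not strictly positive", i));
        }
    }
    // Cluster labels: optional, one per row.
    if let Some(c) = cluster_labels {
        if c.len() != n_rows {
            return Err(format!("cluster labels has length {}, expected {}", c.len(), n_rows));
        }
    }
    Ok(())
}
```

A zero weight or a negative outcome yields an `Err` with the offending index, which tends to be easier to act on than a failure deep inside an iterative fit.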

Examples

From the crate root:

cargo run --example two_part
cargo run --example tweedie
cargo run --example selection
cargo run --example compare_models