
Crate ferric


Ferric is a small probabilistic programming language embedded in Rust.

You write a model with make_model!, using ordinary Rust expressions for deterministic dependencies and Ferric distributions for stochastic variables. The macro expands to a Rust module containing a Model type, query sample types, and samplers.

§Minimal Example

use ferric::make_model;

make_model! {
    name coin;
    use ferric::distributions::Bernoulli;

    const draws : u64;

    let fair : bool ~ Bernoulli::new(0.5);
    let draw[trial of draws] : bool ~ if fair {
        Bernoulli::new(0.5)
    } else {
        Bernoulli::new(0.8)
    };
    let heads : u64 = draw.iter().filter(|&&is_head| is_head).count() as u64;

    observe heads;
    query fair;
}

let model = coin::Model {
    draws: 6,
    heads: 5,
};
let num_samples = 100;
let mut fair_count = 0;
for sample in model.sample_iter().take(num_samples) {
    if sample.fair {
        fair_count += 1;
    }
}
let prob_fair = fair_count as f64 / num_samples as f64;
assert!((0.0..=1.0).contains(&prob_fair));
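
For a model this small the posterior can also be computed in closed form, which gives a reference value for the sampled estimate. The sketch below is plain Rust and does not use Ferric; binomial_pmf and posterior_fair are hypothetical helper names introduced only for illustration.

```rust
/// Binomial probability of `k` successes in `n` trials with success probability `p`.
/// Hypothetical helper, not part of Ferric.
fn binomial_pmf(n: u64, k: u64, p: f64) -> f64 {
    if k > n {
        return 0.0; // impossible outcome
    }
    // Build the binomial coefficient incrementally to avoid factorial overflow.
    let mut coeff = 1.0_f64;
    for i in 0..k {
        coeff *= (n - i) as f64 / (i + 1) as f64;
    }
    coeff * p.powi(k as i32) * (1.0 - p).powi((n - k) as i32)
}

/// Posterior probability that the coin is fair, by Bayes' rule.
/// Mirrors the `coin` model: P(fair) = 0.5, and the heads probability
/// is 0.5 when fair and 0.8 otherwise.
fn posterior_fair(draws: u64, heads: u64) -> f64 {
    let prior_fair = 0.5;
    let like_fair = binomial_pmf(draws, heads, 0.5);
    let like_biased = binomial_pmf(draws, heads, 0.8);
    prior_fair * like_fair / (prior_fair * like_fair + (1.0 - prior_fair) * like_biased)
}
```

With draws = 6 and heads = 5, posterior_fair(6, 5) is roughly 0.19, so a sampled estimate of prob_fair should move towards that value as num_samples grows.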

§Language Overview

A model starts with name model_name;, optional use statements, optional constants, then variable declarations, observations, and queries.

make_model! {
    name my_model;
    use ferric::distributions::Normal;

    const known_value : f64;

    let latent : f64 ~ Normal::new(0.0, 1.0);
    let measured : f64 ~ Normal::new(latent, known_value);

    observe measured;
    query latent;
}

Constants become public fields on the generated Model. Observed variables also become public fields and must be supplied when constructing the model:

let model = my_model::Model {
    known_value: 0.25,
    measured: 1.2,
};

§Stochastic And Deterministic Variables

Use ~ for a stochastic variable drawn from a distribution:

let x : f64 ~ Normal::new(0.0, 1.0);

Use = for a deterministic variable:

let shifted : f64 = x + 3.0;

Dependencies are Rust expressions. Earlier variables and constants may be referenced by name; Ferric rewrites those references into generated model evaluation calls. Distribution constructors usually return Result, so Ferric-generated code unwraps them after your model expression is evaluated.
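
The unwrapping step can be illustrated with a stand-in type. This is a sketch only: ToyBernoulli and heads_probability are hypothetical names, and the error type is an assumption; the real constructors live in ferric::distributions. The point is that a branch of the model expression yields a Result, and the unwrap happens once, after the branch has been chosen.

```rust
/// Hypothetical stand-in for a distribution whose constructor validates its parameter.
#[derive(Debug)]
struct ToyBernoulli {
    p: f64,
}

impl ToyBernoulli {
    /// Returns an error for probabilities outside [0, 1] (NaN included).
    fn new(p: f64) -> Result<Self, String> {
        if (0.0..=1.0).contains(&p) {
            Ok(ToyBernoulli { p })
        } else {
            Err(format!("invalid probability: {}", p))
        }
    }
}

/// The model expression evaluates to a `Result`; it is unwrapped afterwards,
/// so an invalid parameter would panic at this point rather than inside the branch.
fn heads_probability(fair: bool) -> f64 {
    let dist = (if fair {
        ToyBernoulli::new(0.5)
    } else {
        ToyBernoulli::new(0.8)
    })
    .unwrap();
    dist.p
}
```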

§Observations And Queries

observe variable; conditions on a value supplied in the generated Model. query variable; includes a variable in each returned sample.
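
Conditioning on a discrete observation can be pictured as rejection: simulate the model forward, keep a world only when the simulated value equals the observed one, and read the queried variable from the kept worlds. The sketch below does this by hand for the coin model from the minimal example. It is not Ferric's generated code; XorShift64 and rejection_estimate are hypothetical names, and the small generator is there only to avoid external crates.

```rust
/// Minimal xorshift64 generator so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    /// Next pseudo-random float in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Draw `true` with probability `p`.
    fn bernoulli(&mut self, p: f64) -> bool {
        self.next_f64() < p
    }
}

/// Estimate P(fair | heads == observed_heads) by rejection.
/// Returns `None` when no simulated world matches the observation.
fn rejection_estimate(draws: u64, observed_heads: u64, worlds: usize, seed: u64) -> Option<f64> {
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let mut accepted = 0usize;
    let mut fair_count = 0usize;
    for _ in 0..worlds {
        // Forward simulation of the model: fair, then each draw, then heads.
        let fair = rng.bernoulli(0.5);
        let p_head = if fair { 0.5 } else { 0.8 };
        let heads = (0..draws).filter(|_| rng.bernoulli(p_head)).count() as u64;
        // `observe heads;` keeps only the worlds that agree with the data.
        if heads != observed_heads {
            continue;
        }
        accepted += 1;
        if fair {
            fair_count += 1; // `query fair;` reads the value from accepted worlds
        }
    }
    if accepted == 0 {
        None
    } else {
        Some(fair_count as f64 / accepted as f64)
    }
}
```

Calling rejection_estimate(6, 5, 200_000, seed) with any non-zero seed returns Some(estimate); an observation the model cannot produce, such as 7 heads in 6 draws, returns None because every world is rejected.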

Rejection sampling is available through sample_iter(). It is only appropriate when all observations are discrete, because a simulated continuous value matches an observed value exactly with probability zero. Self-normalised importance sampling is available through weighted_sample_iter() when every observed variable is stochastic:

use ferric::{make_model, weighted_mean};

make_model! {
    name noisy_coin;
    use ferric::distributions::Bernoulli;

    let fair : bool ~ Bernoulli::new(0.5);
    let reported : bool ~ if fair {
        Bernoulli::new(0.9)
    } else {
        Bernoulli::new(0.1)
    };

    observe reported;
    query fair;
}

let model = noisy_coin::Model { reported: true };
let mut values = Vec::new();
let mut weights = Vec::new();
for sample in model.weighted_sample_iter().take(100) {
    values.push(sample.sample.fair as u8 as f64);
    weights.push(sample.log_weight);
}
let posterior_mean = weighted_mean(&values, &weights);
let ess = ferric::effective_sample_size(&weights);
assert!((0.0..=1.0).contains(&posterior_mean));
assert!(ess > 0.0);
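
To show what the two summary functions compute from log weights, here is a dependency-free sketch. The function names are hypothetical, and the ESS formula used (the reciprocal of the sum of squared normalised weights) is an assumption; this page does not state which estimator Ferric implements. The sketch also assumes a non-empty slice with at least one finite log weight.

```rust
/// Log of the sum of exponentials, computed stably by factoring out the maximum.
fn log_sum_exp(log_weights: &[f64]) -> f64 {
    let max = log_weights.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    if max == f64::NEG_INFINITY {
        return f64::NEG_INFINITY; // empty input or all weights are zero
    }
    max + log_weights.iter().map(|lw| (lw - max).exp()).sum::<f64>().ln()
}

/// Convert unnormalised log weights to weights that sum to one.
fn normalised_weights(log_weights: &[f64]) -> Vec<f64> {
    let total = log_sum_exp(log_weights);
    log_weights.iter().map(|lw| (lw - total).exp()).collect()
}

/// Self-normalised importance-weighted mean: sum_i w_i * x_i with sum_i w_i = 1.
fn sketch_weighted_mean(values: &[f64], log_weights: &[f64]) -> f64 {
    normalised_weights(log_weights)
        .iter()
        .zip(values)
        .map(|(w, v)| w * v)
        .sum()
}

/// Effective sample size as 1 / sum_i w_i^2 over normalised weights (assumed formula).
fn sketch_effective_sample_size(log_weights: &[f64]) -> f64 {
    let sum_sq: f64 = normalised_weights(log_weights).iter().map(|w| w * w).sum();
    1.0 / sum_sq
}
```

Under this formula, equal weights give an ESS equal to the number of samples, and a single dominant weight drives it towards 1. Adding a constant to every log weight leaves both results unchanged, which is why unnormalised log weights can be passed in directly.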

User-proposal importance sampling is available through importance_sampler::<P>(). Each generated model module includes an ObservedData struct, a Proposal struct, and a Proposer<R> trait. Ferric calls Proposer::new(&ObservedData) once before sampling so the proposer can build proposal distributions from constants and observations.

Each call to Proposer::propose returns proposed latent stochastic values and their joint proposal log_prob; omitted proposal fields are sampled from the model prior. Ferric then computes log p_model(proposed values) - log q(proposed values) and adds the usual observation log likelihoods.

For diagnostics, generated models also provide importance_sampler_debug::<P>(n), which prints the proposal, prior terms for proposed values, observed likelihood terms, sampled stochastic values, and final log weight for the first n worlds. Use effective_sample_size on the collected log weights to monitor weight degeneracy. The rats example wires debugging to FERRIC_DEBUG_IMPORTANCE; for example, FERRIC_DEBUG_IMPORTANCE=1 cargo run -p ferric --example rats traces the first importance sample in each rats experiment.
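
The log weight arithmetic can be written out by hand for the my_model example from the language overview. This is a sketch, not Ferric's generated code: normal_log_pdf and importance_log_weight are hypothetical helpers, and treating the second Normal parameter as a standard deviation is an assumption made only for this illustration.

```rust
/// Log density of a normal distribution parameterised by mean and standard deviation.
fn normal_log_pdf(x: f64, mean: f64, std_dev: f64) -> f64 {
    let z = (x - mean) / std_dev;
    -0.5 * z * z - std_dev.ln() - 0.5 * (2.0 * std::f64::consts::PI).ln()
}

/// Log importance weight for one proposed value of `latent`:
///   log p_model(latent) - log q(latent) + log p(measured | latent)
/// with prior latent ~ Normal(0, 1), likelihood measured ~ Normal(latent, noise_std),
/// and a normal proposal q with the given mean and standard deviation.
fn importance_log_weight(
    latent: f64,
    measured: f64,
    noise_std: f64,
    proposal_mean: f64,
    proposal_std: f64,
) -> f64 {
    let log_prior = normal_log_pdf(latent, 0.0, 1.0);
    let log_proposal = normal_log_pdf(latent, proposal_mean, proposal_std);
    let log_likelihood = normal_log_pdf(measured, latent, noise_std);
    log_prior - log_proposal + log_likelihood
}
```

Two limiting cases are useful checks. When the proposal equals the prior, the first two terms cancel and the weight is the observation likelihood alone. When the proposal equals the exact posterior, the log weight is the same for every proposed value, which corresponds to the best possible effective sample size.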

§Indexed Random Variables

Ferric supports one or more dimensions of indexed random variables. Each dimension is written as name of upper, where name is the local index variable and upper is a previously declared constant or variable. The index takes values from 0 through upper - 1.

const n : u64;
const t : u64;

let survival : f64 ~ Beta::new(99.0, 1.0);
let alive[person of n, time of t] : bool ~ if time == 0 {
    Bernoulli::new(1.0)
} else if alive[person, time - 1] {
    Bernoulli::new(survival)
} else {
    Bernoulli::new(0.0)
};
let age[person of n] : u64 = {
    let mut age = t;
    for time in 0..t {
        if !alive[person, time] {
            age = time;
            break;
        }
    }
    age
};
observe age;
query survival;

Indexed query values are nested Vecs. Indexed observations are nested Vec<Option<T>>; Some(value) observes that entry and None masks it as missing.
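
The masking rule can be sketched for a one-dimensional observation such as age. The function below is a hypothetical illustration, not the crate's MaskedEq trait, whose signature is not shown on this page: a Some entry must equal the simulated value, and a None entry matches anything.

```rust
/// Compare a masked observation against simulated values.
/// `Some(v)` entries must match exactly; `None` entries are treated as missing
/// and never cause a mismatch. Hypothetical helper, not part of Ferric.
fn masked_matches<T: PartialEq>(observed: &[Option<T>], simulated: &[T]) -> bool {
    observed.len() == simulated.len()
        && observed.iter().zip(simulated).all(|(obs, sim)| match obs {
            Some(value) => value == sim,
            None => true,
        })
}
```

For example, an observation of [Some(3), None, Some(10)] matches simulated ages [3, 7, 10] but not [4, 7, 10]. A two-dimensional observation would apply the same rule row by row.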

§Random Lengths And max

An indexed variable can be bounded by a stochastic integer-valued variable:

const max_n : u64;

let n : u64 ~ ferric::distributions::Poisson::new(4.0) max max_n;
let flips[flip of n] : bool ~ ferric::distributions::Bernoulli::new(0.5);
let heads : u64 = flips.iter().filter(|&&x| x).count() as u64;

observe heads;
query n;

The max annotation is a bounded-domain declaration. For example, n ~ Poisson::new(3.0) max 100 means the domain of n is 0..=100; values above 100 are outside the model. Ferric normalizes bounded likelihoods by subtracting distributions::Distribution::log_cum_prob, the log CDF at the bound. Generated worlds cache that value, so the normalization term is computed only once for each bounded variable value within a sampled world state.
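
The normalization can be checked numerically with a plain-Rust sketch of a bounded Poisson. The helper names are hypothetical and the log CDF is computed here by direct summation; Ferric's own log_cum_prob may be implemented differently.

```rust
/// Log probability mass of an unbounded Poisson with the given rate.
fn poisson_log_pmf(k: u64, rate: f64) -> f64 {
    // ln(k!) accumulated as a sum of logs; the empty sum for k = 0 is 0.
    let log_factorial: f64 = (1..=k).map(|i| (i as f64).ln()).sum();
    k as f64 * rate.ln() - rate - log_factorial
}

/// Log CDF at `bound`, i.e. ln P(X <= bound), by direct summation.
fn poisson_log_cdf(bound: u64, rate: f64) -> f64 {
    let total: f64 = (0..=bound).map(|k| poisson_log_pmf(k, rate).exp()).sum();
    total.ln()
}

/// Log probability under the bounded model `Poisson(rate) max max_value`:
/// the unbounded log pmf minus the log CDF at the bound, and negative
/// infinity for values outside the domain 0..=max_value.
fn bounded_poisson_log_prob(k: u64, rate: f64, max_value: u64) -> f64 {
    if k > max_value {
        return f64::NEG_INFINITY;
    }
    poisson_log_pmf(k, rate) - poisson_log_cdf(max_value, rate)
}
```

Subtracting the log CDF rescales the probabilities so that they sum to one over 0..=max_value. Each in-domain value therefore becomes slightly more probable than under the unbounded distribution.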

§Distributions

Built-in scalar, vector, and matrix distributions live in distributions. Common choices include Bernoulli, Binomial, Categorical, Poisson, DiscreteUniform, Normal, Gamma, Beta, Dirichlet, Multinomial, MultivariateNormal, MatrixNormal, and Wishart.

Each distribution page documents its parameters, support, sampling behavior, and log probability.

§Worked Examples

The repository examples show complete models:

See the README for release notes, publishing notes, and a shorter tour.

Re-exports§

pub use self::core::FeOption;
pub use FeOption::Known;
pub use FeOption::Null;
pub use FeOption::Unknown;

Modules§

core
distributions
Probability distributions available to Ferric models.

Macros§

make_model

Traits§

MaskedEq
Mask-aware equality for generated observation checks.

Functions§

effective_sample_size
Compute the effective sample size (ESS) of unnormalised log weights.
weighted_mean
Compute the self-normalised importance-weighted mean of values.
weighted_std
Compute the self-normalised importance-weighted standard deviation of values.