§fynch
Fenchel-Young losses and differentiable sorting/ranking.
The name comes from Fenchel-Young losses (Blondel et al. 2020, JMLR), a unifying framework connecting prediction functions and loss functions through convex duality.
§The Fenchel-Young Framework
Given a regularizer Ω, the Fenchel-Young loss is:
L_Ω(θ; y) = Ω*(θ) - ⟨θ, y⟩ + Ω(y)

where Ω* is the Fenchel conjugate. The prediction function is:

ŷ_Ω(θ) = ∇Ω*(θ) = argmax_p { ⟨θ, p⟩ - Ω(p) }

Different regularizers give different behaviors:
| Regularizer Ω | Prediction ŷ_Ω | Loss L_Ω | Sparsity |
|---|---|---|---|
| Shannon negentropy | softmax | cross-entropy | Dense |
| ½‖·‖² (squared L2) | sparsemax | sparsemax loss | Sparse |
| Tsallis α-entropy | α-entmax | entmax loss | Tunable |
See the fenchel module for the generic framework.
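As a concrete instance of the framework, the sketch below (plain Rust, independent of the crate's fenchel API) shows that with the Shannon negentropy regularizer the conjugate Ω*(θ) is log-sum-exp, so L_Ω reduces to ordinary cross-entropy for a one-hot target.

// Minimal sketch, not the crate's implementation: FY loss with Shannon negentropy.
// Ω(p) = Σ p_i ln p_i, so Ω*(θ) = log Σ exp(θ_i) and
// L_Ω(θ; y) = logsumexp(θ) - ⟨θ, y⟩ + Σ y_i ln y_i (cross-entropy when y is one-hot).
fn fy_loss_shannon(theta: &[f64], y: &[f64]) -> f64 {
    let max = theta.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let lse = max + theta.iter().map(|t| (t - max).exp()).sum::<f64>().ln(); // stable logsumexp
    let dot: f64 = theta.iter().zip(y).map(|(t, yi)| t * yi).sum();
    let neg_entropy: f64 = y.iter().filter(|&&v| v > 0.0).map(|&v| v * v.ln()).sum(); // Ω(y)
    lse - dot + neg_entropy
}

By the Fenchel-Young inequality this loss is always non-negative, and it is zero exactly when the prediction ŷ_Ω(θ) equals y.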
§Differentiable Sorting/Ranking
Sorting and ranking are discontinuous—small input changes can cause large output changes (rank swaps). This breaks gradient-based optimization.
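One common relaxation (an illustrative sketch in plain Rust, not necessarily how this crate's soft_rank is implemented) replaces each hard pairwise comparison with a sigmoid, so ranks become smooth functions of the scores:

// Soft rank of x_i = 1 + Σ_{j≠i} sigmoid((x_i - x_j) / temperature).
// As temperature → 0 this approaches the ordinary 1-based ascending rank;
// larger temperatures trade exactness for useful gradients. Cost is O(n²).
fn soft_rank_sigmoid(x: &[f64], temperature: f64) -> Vec<f64> {
    let sigmoid = |z: f64| 1.0 / (1.0 + (-z).exp());
    (0..x.len())
        .map(|i| {
            1.0 + (0..x.len())
                .filter(|&j| j != i)
                .map(|j| sigmoid((x[i] - x[j]) / temperature))
                .sum::<f64>()
        })
        .collect()
}

The table below summarizes the approaches available in this crate: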
| Approach | Module | Regularization | Complexity |
|---|---|---|---|
| PAVA + Sigmoid | Root | L2 | O(n) / O(n²) |
| Sinkhorn OT | sinkhorn | Entropy (Shannon Ω) | O(n² × iter) |
Sinkhorn sorting is exactly FY with Shannon regularization applied to the permutation polytope (Birkhoff polytope).
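To make the Sinkhorn connection concrete, here is a self-contained sketch of the core iteration (illustration only; the crate's sinkhorn module exposes sinkhorn_sort, sinkhorn_rank, and SinkhornConfig rather than this function):

// Sketch: build a Gibbs kernel from squared distances between the inputs and their
// sorted values, then alternate row/column normalizations (Sinkhorn-Knopp). The
// result approaches a doubly stochastic matrix, i.e. a point in the Birkhoff polytope.
fn soft_permutation(x: &[f64], eps: f64, iters: usize) -> Vec<Vec<f64>> {
    let n = x.len();
    let mut anchors = x.to_vec();
    anchors.sort_by(|a, b| a.partial_cmp(b).unwrap()); // sorted target positions
    let mut p: Vec<Vec<f64>> = (0..n)
        .map(|i| (0..n).map(|j| (-(x[i] - anchors[j]).powi(2) / eps).exp()).collect())
        .collect();
    for _ in 0..iters {
        for row in p.iter_mut() {
            let s: f64 = row.iter().sum();
            row.iter_mut().for_each(|v| *v /= s); // normalize rows
        }
        for j in 0..n {
            let s: f64 = (0..n).map(|i| p[i][j]).sum();
            (0..n).for_each(|i| p[i][j] /= s); // normalize columns
        }
    }
    p
}

Row i of the result is a soft assignment of element i to sorted positions: multiplying by the sorted values gives a soft sort, multiplying by position indices gives soft ranks, and smaller eps sharpens the matrix toward a hard permutation.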
§Key Functions
| Function | Purpose | Module |
|---|---|---|
| pava | Isotonic regression | Root |
| soft_rank | Continuous ranks | Root |
| soft_sort | Continuous sorting | Root |
| fenchel::softmax | Dense prediction | fenchel |
| fenchel::sparsemax | Sparse prediction | fenchel |
| fenchel::entmax | Tunable sparsity | fenchel |
§Quick Start
§Fenchel-Young Predictions
use fynch::fenchel::{softmax, sparsemax, entmax};
let theta = [2.0, 1.0, 0.1];
// Dense (softmax)
let p_soft = softmax(&theta);
assert!(p_soft.iter().all(|&x| x > 0.0));
// Sparse (sparsemax)
let p_sparse = sparsemax(&theta);
assert!(p_sparse.iter().any(|&x| x == 0.0));
// Tunable (1.5-entmax)
let p_ent = entmax(&theta, 1.5);
§Fenchel-Young Losses
use fynch::fenchel::{Regularizer, Shannon, SquaredL2, Tsallis};
let theta = [2.0, 1.0, 0.1];
let y = [1.0, 0.0, 0.0]; // one-hot target
// Cross-entropy (Shannon)
let loss_ce = Shannon.loss(&theta, &y);
// Sparsemax-style FY loss via squared L2 regularizer
let loss_sp = SquaredL2.loss(&theta, &y);
// 1.5-entmax loss
let loss_ent = Tsallis::entmax15().loss(&theta, &y);
§Differentiable Sorting
use fynch::{pava, soft_rank, soft_sort};
// PAVA: isotonic regression
let y = [3.0, 1.0, 2.0, 5.0, 4.0];
let monotonic = pava(&y); // [2.0, 2.0, 2.0, 4.5, 4.5]
// Soft ranking
let scores = [0.5, 0.2, 0.8, 0.1];
let ranks = soft_rank(&scores, 0.1).unwrap();
§Learning to Rank
use fynch::loss::{spearman_loss, listnet_loss};
let pred = [0.9, 0.1, 0.5];
let target = [3.0, 1.0, 2.0];
let loss = spearman_loss(&pred, &target, 0.1);
§Modules
| Module | Contents |
|---|---|
| fenchel | Generic FY framework: regularizers, predictions, losses |
| sinkhorn | Entropic OT for soft permutations |
| loss | Learning-to-rank losses |
| metrics | IR evaluation: MRR, NDCG, Hits@k |
§Connections
- wass: Sinkhorn OT is the same algorithm
- logp: Shannon entropy connects to information theory
- information retrieval ecosystem: More comprehensive IR evaluation
§What Can Go Wrong
- Temperature too low: Numerical instability, vanishing gradients
- Temperature too high: Predictions become uniform, lose signal (both effects are illustrated in the sketch after this list)
- Sinkhorn not converging: Increase max_iter or epsilon
- Wrong regularizer: Use sparsemax for top-k, softmax for soft attention
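A quick way to see the two temperature failure modes in practice (using soft_rank as in the Quick Start; exact values depend on the implementation):

use fynch::soft_rank;

let scores = [0.5, 0.2, 0.8, 0.1];
for temp in [0.01, 0.1, 1.0, 10.0] {
    // Small temperatures approach hard ranks (sharp but with vanishing gradients);
    // large temperatures flatten the output toward uniform, losing the ranking signal.
    let ranks = soft_rank(&scores, temp).unwrap();
    println!("temperature = {temp}: {ranks:?}");
}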
§References
- Blondel, Martins, Niculae (2020). “Learning with Fenchel-Young Losses” (JMLR)
- Martins & Astudillo (2016). “From Softmax to Sparsemax”
- Blondel et al. (2020). “Fast Differentiable Sorting and Ranking”
- Cuturi et al. (2019). “Differentiable Ranking and Sorting using Optimal Transport”
Re-exports§
pub use fenchel::entmax;
pub use fenchel::entropy_bits;
pub use fenchel::entropy_nats;
pub use fenchel::softmax;
pub use fenchel::softmax_with_temperature;
pub use fenchel::sparsemax;
pub use fenchel::Regularizer;
pub use fenchel::Shannon;
pub use fenchel::SquaredL2;
pub use fenchel::Tsallis;
pub use metrics::compute_rank;
pub use metrics::hits_at_k;
pub use metrics::mean_rank;
pub use metrics::mrr;
pub use metrics::ndcg;
pub use metrics::ndcg_at_k;
pub use metrics::RankingMetrics;
pub use sigmoid::sigmoid;
pub use sigmoid::sigmoid_derivative;
pub use sinkhorn::sinkhorn_rank;
pub use sinkhorn::sinkhorn_sort;
pub use sinkhorn::SinkhornConfig;
pub use topk::differentiable_bottomk;
pub use topk::differentiable_topk;
Modules§
- fenchel - Fenchel-Young losses: a unified framework for prediction functions and losses.
- loss - Differentiable ranking losses for learning to rank.
- metrics - Ranking evaluation metrics.
- sigmoid - Sigmoid function for smooth approximations.
- sinkhorn - Sinkhorn algorithm for entropic optimal transport.
- topk - Differentiable Top-K selection.
Functions§
- fast_soft_sort - Fast Differentiable Sorting ($O(n \log n)$).
- isotonic_l2 - Isotonic regression with L2 loss.
- pava - Pool Adjacent Violators Algorithm (PAVA) for isotonic regression.
- pava_weighted - Weighted PAVA with custom weights.
- reciprocal_rank_fusion - Reciprocal Rank Fusion (RRF).
- soft_rank - Soft ranking with temperature parameter.
- soft_sort - Soft sorting with temperature parameter.
- soft_topk_indicator - Compute soft top-k indicator.