Crate fynch


§fynch

Fenchel-Young losses and differentiable sorting/ranking.

The name comes from Fenchel-Young losses (Blondel et al. 2020, JMLR), a unifying framework connecting prediction functions and loss functions through convex duality.

§The Fenchel-Young Framework

Given a regularizer Ω, the Fenchel-Young loss is:

L_Ω(θ; y) = Ω*(θ) - ⟨θ, y⟩ + Ω(y)

where Ω* is the Fenchel conjugate. The prediction function is:

ŷ_Ω(θ) = ∇Ω*(θ) = argmax_p { ⟨θ, p⟩ - Ω(p) }

Different regularizers give different behaviors:

| Regularizer Ω | Prediction ŷ_Ω | Loss L_Ω | Sparsity |
|---|---|---|---|
| Shannon negentropy | softmax | cross-entropy | Dense |
| ½‖·‖² (squared L2) | sparsemax | sparsemax loss | Sparse |
| Tsallis α-entropy | α-entmax | entmax loss | Tunable |

See the fenchel module for the generic framework.
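
A useful consequence of the definition above (a property of the framework itself, not of any particular implementation) is that the loss gradient is the residual: ∇_θ L_Ω(θ; y) = ŷ_Ω(θ) - y. The sketch below checks this numerically for the Shannon case; it assumes Shannon.loss and softmax behave as in the Quick Start and that the loss is measured in nats.

use fynch::{softmax, Regularizer, Shannon};

let theta = [2.0, 1.0, 0.1];
let y = [1.0, 0.0, 0.0];

// Analytic gradient of the FY loss: ŷ_Ω(θ) - y, i.e. softmax(θ) - y for Shannon.
let p = softmax(&theta);
let grad: Vec<f64> = p.iter().zip(y.iter()).map(|(pi, yi)| pi - yi).collect();

// Central finite differences of the loss should agree component by component
// (assumes the loss is in nats, matching the cross-entropy identification above).
let eps = 1e-5;
for i in 0..theta.len() {
    let (mut plus, mut minus) = (theta, theta);
    plus[i] += eps;
    minus[i] -= eps;
    let fd = (Shannon.loss(&plus, &y) - Shannon.loss(&minus, &y)) / (2.0 * eps);
    assert!((fd - grad[i]).abs() < 1e-3);
}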

§Differentiable Sorting/Ranking

Hard sorting and ranking are not usefully differentiable: ranking is piecewise constant (its gradient is zero almost everywhere), and both can change abruptly when two scores swap order. This breaks gradient-based optimization.
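
As a quick illustration (assuming soft_rank(&scores, temperature) returns fractional ranks, as in the Quick Start below): swapping two nearly tied scores flips their hard ranks by a whole position, while the soft ranks barely move.

use fynch::soft_rank;

// The hard ranking of the first two entries flips between these two inputs,
// but at a moderate temperature the fractional ranks change only slightly.
let before = soft_rank(&[0.501, 0.499, 0.1], 0.1).unwrap();
let after = soft_rank(&[0.499, 0.501, 0.1], 0.1).unwrap();

for (a, b) in before.iter().zip(after.iter()) {
    println!("soft rank moved by {:.4}", (a - b).abs());
}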

| Approach | Module | Regularization | Complexity |
|---|---|---|---|
| PAVA + Sigmoid | Root | L2 | O(n) / O(n²) |
| Sinkhorn OT | sinkhorn | Entropy (Shannon Ω) | O(n² × iter) |

Sinkhorn sorting is exactly FY with Shannon regularization applied to the permutation polytope (Birkhoff polytope).
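
Spelled out in the notation above (a general restatement of the entropic OT formulation, not notation specific to this crate), the prediction optimizes over doubly stochastic matrices P in the Birkhoff polytope B_n:

ŷ_Ω(θ) = argmax_{P ∈ B_n} { ⟨θ, P⟩ + ε·H(P) }

where H(P) = -Σ_{ij} P_{ij} log P_{ij} is the Shannon entropy and ε (the entropic regularization strength, i.e. the temperature) controls how soft the resulting permutation is; the Sinkhorn iterations solve this matrix-scaling problem.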

§Key Functions

| Function | Purpose | Module |
|---|---|---|
| pava | Isotonic regression | Root |
| soft_rank | Continuous ranks | Root |
| soft_sort | Continuous sorting | Root |
| fenchel::softmax | Dense prediction | fenchel |
| fenchel::sparsemax | Sparse prediction | fenchel |
| fenchel::entmax | Tunable sparsity | fenchel |

§Quick Start

§Fenchel-Young Predictions

use fynch::fenchel::{softmax, sparsemax, entmax};

let theta = [2.0, 1.0, 0.1];

// Dense (softmax)
let p_soft = softmax(&theta);
assert!(p_soft.iter().all(|&x| x > 0.0));

// Sparse (sparsemax)
let p_sparse = sparsemax(&theta);
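// Under the standard Euclidean-projection definition of sparsemax, these inputs
// give [1.0, 0.0, 0.0], so the assertion below holds.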
assert!(p_sparse.iter().any(|&x| x == 0.0));

// Tunable (1.5-entmax)
let p_ent = entmax(&theta, 1.5);

§Fenchel-Young Losses

use fynch::fenchel::{Regularizer, Shannon, SquaredL2, Tsallis};

let theta = [2.0, 1.0, 0.1];
let y = [1.0, 0.0, 0.0];  // one-hot target

// Cross-entropy (Shannon)
let loss_ce = Shannon.loss(&theta, &y);

// Sparsemax-style FY loss via squared L2 regularizer
let loss_sp = SquaredL2.loss(&theta, &y);

// 1.5-entmax loss
let loss_ent = Tsallis::entmax15().loss(&theta, &y);
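
// Any Fenchel-Young loss is nonnegative and vanishes only when the prediction
// ŷ_Ω(θ) matches the target (a property of the framework, not of this crate).
assert!(loss_ce >= 0.0 && loss_sp >= 0.0 && loss_ent >= 0.0);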

§Differentiable Sorting

use fynch::{pava, soft_rank, soft_sort};

// PAVA: isotonic regression
let y = [3.0, 1.0, 2.0, 5.0, 4.0];
let monotonic = pava(&y);  // [2.0, 2.0, 2.0, 4.5, 4.5]

// Soft ranking
let scores = [0.5, 0.2, 0.8, 0.1];
let ranks = soft_rank(&scores, 0.1).unwrap();
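
The import above also brings in soft_sort; a minimal usage sketch, assuming its signature mirrors soft_rank's (values, temperature) and that it likewise returns a Result:

use fynch::soft_sort;

// Assumed to mirror soft_rank: a smooth, approximately sorted copy of the input.
let xs = [0.5, 0.2, 0.8, 0.1];
let nearly_sorted = soft_sort(&xs, 0.1).unwrap();
println!("{:?}", nearly_sorted);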

§Learning to Rank

use fynch::loss::{spearman_loss, listnet_loss};

let pred = [0.9, 0.1, 0.5];
let target = [3.0, 1.0, 2.0];
let loss = spearman_loss(&pred, &target, 0.1);

§Modules

| Module | Contents |
|---|---|
| fenchel | Generic FY framework: regularizers, predictions, losses |
| sinkhorn | Entropic OT for soft permutations |
| loss | Learning-to-rank losses |
| metrics | IR evaluation: MRR, NDCG, Hits@k |

§What Can Go Wrong

  1. Temperature too low: outputs become nearly hard, so gradients vanish and numerics can become unstable (see the sketch after this list)
  2. Temperature too high: predictions and soft ranks become nearly uniform, washing out the signal
  3. Sinkhorn not converging: increase max_iter, or raise epsilon (stronger entropic regularization converges faster, at the cost of blurrier permutations)
  4. Wrong regularizer: use sparsemax when you need exact zeros (top-k style selection), softmax for dense soft attention
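
A minimal sketch of the two temperature failure modes, using soft_rank (whose second argument is the temperature, as in the Quick Start); the same qualitative behavior is expected from softmax_with_temperature:

use fynch::soft_rank;

let scores = [0.5, 0.2, 0.8, 0.1];

// Very low temperature: essentially the hard ranking (sharp, but gradients vanish).
let sharp = soft_rank(&scores, 1e-3).unwrap();

// Very high temperature: soft ranks collapse toward one another (signal washed out).
let smooth = soft_rank(&scores, 100.0).unwrap();

println!("sharp:  {:?}", sharp);
println!("smooth: {:?}", smooth);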

§References

  • Blondel, Martins, Niculae (2020). “Learning with Fenchel-Young Losses” (JMLR)
  • Martins & Astudillo (2016). “From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification” (ICML)
  • Blondel et al. (2020). “Fast Differentiable Sorting and Ranking” (ICML)
  • Cuturi et al. (2019). “Differentiable Ranking and Sorting using Optimal Transport” (NeurIPS)

Re-exports§

pub use fenchel::entmax;
pub use fenchel::entropy_bits;
pub use fenchel::entropy_nats;
pub use fenchel::softmax;
pub use fenchel::softmax_with_temperature;
pub use fenchel::sparsemax;
pub use fenchel::Regularizer;
pub use fenchel::Shannon;
pub use fenchel::SquaredL2;
pub use fenchel::Tsallis;
pub use metrics::compute_rank;
pub use metrics::hits_at_k;
pub use metrics::mean_rank;
pub use metrics::mrr;
pub use metrics::ndcg;
pub use metrics::ndcg_at_k;
pub use metrics::RankingMetrics;
pub use sigmoid::sigmoid;
pub use sigmoid::sigmoid_derivative;
pub use sinkhorn::sinkhorn_rank;
pub use sinkhorn::sinkhorn_sort;
pub use sinkhorn::SinkhornConfig;
pub use topk::differentiable_bottomk;
pub use topk::differentiable_topk;

Modules§

fenchel
Fenchel-Young losses: a unified framework for prediction functions and losses.
loss
Differentiable ranking losses for learning to rank.
metrics
Ranking evaluation metrics.
sigmoid
Sigmoid function for smooth approximations.
sinkhorn
Sinkhorn algorithm for entropic optimal transport.
topk
Differentiable Top-K selection.

Enums§

Error

Functions§

fast_soft_sort
Fast Differentiable Sorting (O(n log n)).
isotonic_l2
Isotonic regression with L2 loss.
pava
Pool Adjacent Violators Algorithm (PAVA) for isotonic regression.
pava_weighted
Weighted PAVA with custom weights.
reciprocal_rank_fusion
Reciprocal Rank Fusion (RRF).
soft_rank
Soft ranking with temperature parameter.
soft_sort
Soft sorting with temperature parameter.
soft_topk_indicator
Compute soft top-k indicator.

Type Aliases§

Result