pub fn sparsemap_gradient<F>(
result: &SparsemapResult<F>,
upstream_grad: &[F],
) -> Vec<F>
Compute gradient of a loss through SparseMAP via the active-set theorem.
For SparseMAP, the Jacobian of the optimal solution μ*(θ) w.r.t. θ is:
dμ*/dθ = Π_S (projection onto the tangent space of the active support S)

Concretely, in the simplex case only the active coordinates (the support) receive gradient. The backward pass is:
dL/dθ = Π_S (upstream_grad)
      = upstream_grad[support] - mean(upstream_grad[support]) · 1_S

This is the projection of upstream_grad onto the tangent space of the simplex face defined by the active support.
§Arguments
result – the forward-pass SparsemapResult.
upstream_grad – gradient of the scalar loss w.r.t. solution (∂L/∂μ).
§Returns
Gradient ∂L/∂θ of the same length as the score input.
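§Example
The projection above can be sketched as a standalone function. This is a minimal illustration, not the crate's implementation: it assumes the active support is available as a slice of index positions (the actual field layout of SparsemapResult may differ).

```rust
/// Sketch of the backward pass: project `upstream_grad` onto the tangent
/// space of the simplex face given by `support` (hypothetical signature).
fn simplex_face_projection(upstream_grad: &[f64], support: &[usize]) -> Vec<f64> {
    // Mean of the upstream gradient over the active support.
    let mean: f64 = support.iter().map(|&i| upstream_grad[i]).sum::<f64>()
        / support.len() as f64;

    // Active coordinates receive (g_i - mean); inactive coordinates get zero.
    let mut grad = vec![0.0; upstream_grad.len()];
    for &i in support {
        grad[i] = upstream_grad[i] - mean;
    }
    grad
}

fn main() {
    // Support {0, 2} of a length-4 score vector.
    let g = simplex_face_projection(&[1.0, 5.0, 3.0, -2.0], &[0, 2]);
    // mean over support = (1 + 3) / 2 = 2, so the active entries become -1 and 1.
    println!("{:?}", g); // [-1.0, 0.0, 1.0, 0.0]
    // The projected gradient sums to zero, i.e. it stays tangent to the face.
    assert!(g.iter().sum::<f64>().abs() < 1e-12);
}
```

Note that the result sums to zero over the support: moving along it preserves the sum-to-one constraint of the simplex.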