Write-into variant of solve_spd_pcg. Returns the same shape as
solve_spd_pcg, but takes an apply closure that writes its result into a
caller-owned buffer. This lets the inner-Newton PCG hot path avoid
allocating a fresh Array1::<f64> for the matvec output on every iteration,
which is critical at biobank scale.
Write-into variant of solve_spd_pcg_with_info. Takes an apply closure of
the form Fn(&Array1<f64>, &mut Array1<f64>) so the matvec can write into a
caller-owned buffer. This eliminates the per-iteration Array1::<f64>
allocation for the matvec result that the legacy variant, whose closure
returns a new array, forces. See commit 83369abb for the analogous
penalty-vector elimination.
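
The write-into closure shape described above can be illustrated with a minimal, dependency-free sketch. It is not the actual solve_spd_pcg implementation: the names pcg_into_sketch, dot, and demo_solve are hypothetical, plain slices stand in for Array1<f64>, and the Jacobi (diagonal) preconditioner is an assumption made only to keep the example short. The point it shows is that the matvec buffer is allocated once, before the loop, and reused by every call to apply.

```rust
/// Dot product of two equally sized slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical sketch of a write-into PCG for an SPD system A x = b.
/// `apply(p, out)` must write A * p into `out` (no allocation needed).
/// `diag` holds the diagonal of A, used as an assumed Jacobi preconditioner.
/// Returns the solution and the number of iterations performed.
fn pcg_into_sketch<F>(
    apply: F,
    diag: &[f64],
    b: &[f64],
    tol: f64,
    max_iter: usize,
) -> (Vec<f64>, usize)
where
    F: Fn(&[f64], &mut [f64]),
{
    let n = b.len();
    let mut x = vec![0.0; n];
    let mut r = b.to_vec(); // residual for the initial guess x = 0
    let mut z: Vec<f64> = r.iter().zip(diag).map(|(ri, di)| ri / di).collect();
    let mut p = z.clone();
    let mut ap = vec![0.0; n]; // matvec buffer: allocated once, reused below
    let mut rz = dot(&r, &z);
    let b_norm = dot(b, b).sqrt().max(f64::MIN_POSITIVE);

    for iter in 0..max_iter {
        if dot(&r, &r).sqrt() <= tol * b_norm {
            return (x, iter);
        }
        apply(&p, &mut ap); // writes A * p into the caller-owned buffer
        let alpha = rz / dot(&p, &ap);
        for i in 0..n {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
            z[i] = r[i] / diag[i];
        }
        let rz_new = dot(&r, &z);
        let beta = rz_new / rz;
        for i in 0..n {
            p[i] = z[i] + beta * p[i];
        }
        rz = rz_new;
    }
    (x, max_iter)
}

/// Usage: solve [[4, 1], [1, 3]] x = [1, 2], whose exact solution is [1/11, 7/11].
fn demo_solve() -> (Vec<f64>, usize) {
    let apply = |p: &[f64], out: &mut [f64]| {
        out[0] = 4.0 * p[0] + p[1];
        out[1] = p[0] + 3.0 * p[1];
    };
    pcg_into_sketch(apply, &[4.0, 3.0], &[1.0, 2.0], 1e-12, 50)
}
```

A closure that returns its result would instead have to produce a new vector on each call, which is the per-iteration allocation the write-into form avoids.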