Module nl_hessian_program

Precompiled symbolic-Hessian program for one Tape.

Tape::hessian_accumulate runs forward-over-reverse AD at every call: for each tape variable j it (a) match-dispatches every op in a forward-tangent sweep, (b) zeros adj/adj_dot, (c) match-dispatches every op again in the reverse-over-tangent sweep, and (d) HashMap-looks-up every Var(k) slot to find its Hessian output position. On evaluator-bound problems (dirichlet, lane_emden, henon) that match-dispatch + symbolic-AD overhead is ~80% of total CPU.
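As an illustration of what one per-j forward-over-reverse pass computes, here is a hand-unrolled sketch for the toy function f(x0, x1) = x0 * x1 + sin(x0). The function `hessian_column` and the variable names are hypothetical and are not taken from `Tape`; the sketch only shows the three stages (forward values, forward tangents seeded with e_j, reverse sweep carrying adj and adj_dot) with every dispatch already resolved.

```rust
/// Column `j` of the Hessian of f(x0, x1) = x0 * x1 + sin(x0),
/// computed by a hand-unrolled forward-over-reverse pass.
/// Hypothetical helper for illustration; not part of the crate.
fn hessian_column(x: [f64; 2], j: usize) -> [f64; 2] {
    // Forward values of the two input slots.
    let (v0, v1) = (x[0], x[1]);

    // Forward tangents: seed the inputs with the unit vector e_j.
    let d0: f64 = if j == 0 { 1.0 } else { 0.0 };
    let d1: f64 = if j == 1 { 1.0 } else { 0.0 };

    // Reverse sweep. The output adjoint is seeded with 1 and its
    // tangent with 0.
    let adj4: f64 = 1.0;
    let adj_dot4: f64 = 0.0;

    // v4 = v2 + v3: both operands inherit adj and adj_dot unchanged.
    let (adj2, adj3) = (adj4, adj4);
    let (adj_dot2, adj_dot3) = (adj_dot4, adj_dot4);

    let mut adj_dot0: f64 = 0.0;
    let mut adj_dot1: f64 = 0.0;

    // v3 = sin(v0): adj0 += adj3 * cos(v0); differentiating that update
    // along the tangent gives the line below.
    adj_dot0 += adj_dot3 * v0.cos() - adj3 * v0.sin() * d0;

    // v2 = v0 * v1: adj0 += adj2 * v1 and adj1 += adj2 * v0,
    // differentiated along the tangent.
    adj_dot0 += adj_dot2 * v1 + adj2 * d1;
    adj_dot1 += adj_dot2 * v0 + adj2 * d0;

    // adj_dot on the input slots is column j of the Hessian.
    [adj_dot0, adj_dot1]
}
```

For this f the exact Hessian is [[-sin(x0), 1], [1, 0]], so calling the function with j = 0 and j = 1 should reproduce its two columns.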

This module compiles all of that ONCE at tape-build time into a flat Vec<HOp> of pre-resolved primitive ops:

  • Forward pass — one Fwd* op per TapeOp. Mirrors Tape::forward.
  • Per-j forward tangent — only the ops touching slots that statically depend on j are emitted (the rest stay zero from the per-j ZeroRange reset).
  • Per-j reverse-over-tangent — only ops on slots reachable backward from output, with all slot indices and Hessian output pointers pre-resolved.
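The idea of a flat list of pre-resolved ops can be sketched as below. This is a minimal illustration under stated assumptions: `SketchOp` and `run` are hypothetical names, and the variants shown are invented for the example, since the actual `HOp` variants are not listed in these docs. Only the u32-offsets-into-a-scratch-slice convention and the `ZeroRange` reset are taken from the text.

```rust
/// Hypothetical op set for illustration; the real `HOp` variants may differ.
/// Every field is an offset into the caller's scratch slice.
#[derive(Clone, Copy, Debug)]
enum SketchOp {
    /// scratch[dst] = scratch[a] + scratch[b]
    Add { dst: u32, a: u32, b: u32 },
    /// scratch[dst] = scratch[a] * scratch[b]
    Mul { dst: u32, a: u32, b: u32 },
    /// scratch[dst] = sin(scratch[a])
    Sin { dst: u32, a: u32 },
    /// Zero the half-open range scratch[start..end].
    ZeroRange { start: u32, end: u32 },
}

/// Execute a compiled program: one linear loop, no lookups, no recursion.
fn run(program: &[SketchOp], scratch: &mut [f64]) {
    for op in program {
        match *op {
            SketchOp::Add { dst, a, b } => {
                scratch[dst as usize] = scratch[a as usize] + scratch[b as usize];
            }
            SketchOp::Mul { dst, a, b } => {
                scratch[dst as usize] = scratch[a as usize] * scratch[b as usize];
            }
            SketchOp::Sin { dst, a } => {
                scratch[dst as usize] = scratch[a as usize].sin();
            }
            SketchOp::ZeroRange { start, end } => {
                scratch[start as usize..end as usize].fill(0.0);
            }
        }
    }
}
```

Because the slot indices are fixed at compile time, the work of deciding which ops to emit for each j happens once, and the execution loop only pays for the ops that can contribute.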

§Scratch layout

The program reads and writes a single &mut [f64] arena of n_slots cells, split into four contiguous regions of length n each (n = tape.ops.len()):

  • v[i] in slot i
  • dot[i] in slot n + i
  • adj[i] in slot 2n + i
  • adj_dot[i] in slot 3n + i

Per-j setup zeros the [n, 4n) range and seeds adj[n-1]. The allocation pattern is intentionally trivial: finer-grained slot recycling buys little once the dispatch loop is the bottleneck, and a contiguous layout makes the per-j ZeroRange reset a single memset-friendly loop.
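The layout and the per-j reset can be sketched as follows. `Layout` and its methods are hypothetical helpers written for this example, not items exported by the module. The seed value 1.0 is an assumption (the usual choice for a scalar output); the docs only say that adj[n-1] is seeded. Seeding the tangent of variable j is left out because the docs do not describe it.

```rust
/// Hypothetical view of the four-region scratch layout for a tape with `n` ops.
#[derive(Clone, Copy)]
struct Layout {
    n: usize,
}

impl Layout {
    /// Slot of v[i].
    fn v(self, i: usize) -> usize { i }
    /// Slot of dot[i].
    fn dot(self, i: usize) -> usize { self.n + i }
    /// Slot of adj[i].
    fn adj(self, i: usize) -> usize { 2 * self.n + i }
    /// Slot of adj_dot[i].
    fn adj_dot(self, i: usize) -> usize { 3 * self.n + i }
    /// Total number of cells covered by the four regions.
    fn len(self) -> usize { 4 * self.n }

    /// Per-j setup: zero dot, adj and adj_dot in one contiguous fill,
    /// then seed the adjoint of the last slot. Forward values are kept.
    fn reset_for_j(self, scratch: &mut [f64]) {
        debug_assert!(self.n >= 1 && scratch.len() >= self.len());
        scratch[self.n..4 * self.n].fill(0.0);
        // Assumed seed value; the module docs do not state it.
        scratch[self.adj(self.n - 1)] = 1.0;
    }
}
```

Keeping the three derivative regions adjacent is what lets the reset be a single `fill` over [n, 4n) rather than three separate loops.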

Structs§

HessianProgram
Precompiled Hessian-of-one-tape program. Built once via HessianProgram::compile; executed many times.

Enums§

HOp
One primitive operation in the compiled Hessian program. dst/a/b/etc. are u32 offsets into the caller’s scratch slice; see the module docs for the slot layout.