pub trait DifferentiablePolicy<S>: Policy<S> {
    fn grad_log(&self, input: &S, a: Self::Action) -> Matrix<f64>;
}

Required Methods

Computes the gradient of the log-probability of a single action `a` given `input`, returned as a `Matrix<f64>`.
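As an illustration of what an implementation of `grad_log` might look like, the sketch below uses a linear softmax policy over discrete actions. Everything in it other than the trait and method names is an assumption made for the example: `Matrix` is a plain nested `Vec` stand-in rather than the crate's own matrix type, the `Policy` trait is reduced to a single hypothetical `probability` method, and `SoftmaxPolicy` is a made-up type. The gradient is taken with respect to the weight matrix, which is the usual convention in policy-gradient methods but is not stated on this page.

```rust
// Stand-in for the crate's matrix type (assumption): row-major nested Vec.
type Matrix = Vec<Vec<f64>>;

// Reduced, hypothetical version of the supertrait so the sketch compiles alone.
trait Policy<S> {
    type Action;

    /// Probability of taking action `a` given `input`.
    fn probability(&self, input: &S, a: Self::Action) -> f64;
}

trait DifferentiablePolicy<S>: Policy<S> {
    fn grad_log(&self, input: &S, a: Self::Action) -> Matrix;
}

/// Hypothetical softmax policy with one weight row per action.
struct SoftmaxPolicy {
    weights: Matrix, // shape: n_actions x n_features
}

impl SoftmaxPolicy {
    /// Softmax over the linear preferences `weights[b] . input`.
    fn probabilities(&self, input: &[f64]) -> Vec<f64> {
        let prefs: Vec<f64> = self
            .weights
            .iter()
            .map(|row| row.iter().zip(input).map(|(w, x)| w * x).sum::<f64>())
            .collect();
        // Subtract the maximum preference for numerical stability.
        let max = prefs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        let exps: Vec<f64> = prefs.iter().map(|p| (p - max).exp()).collect();
        let total: f64 = exps.iter().sum();
        exps.iter().map(|e| e / total).collect()
    }
}

impl Policy<Vec<f64>> for SoftmaxPolicy {
    type Action = usize;

    fn probability(&self, input: &Vec<f64>, a: usize) -> f64 {
        self.probabilities(input)[a]
    }
}

impl DifferentiablePolicy<Vec<f64>> for SoftmaxPolicy {
    /// d/d weights[b][i] of log pi(a | input) = (1[a == b] - pi(b | input)) * input[i]
    fn grad_log(&self, input: &Vec<f64>, a: usize) -> Matrix {
        let probs = self.probabilities(input);
        probs
            .iter()
            .enumerate()
            .map(|(b, p)| {
                let indicator = if b == a { 1.0 } else { 0.0 };
                input.iter().map(|x| (indicator - p) * x).collect()
            })
            .collect()
    }
}
```

With all-zero weights for two actions and `input = vec![1.0, 2.0]`, both actions have probability 0.5, so `grad_log(&input, 0)` in this sketch evaluates to `[[0.5, 1.0], [-0.5, -1.0]]`. Each column sums to zero because the action probabilities always sum to one.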

Implementors