Struct PointerNetwork

pub struct PointerNetwork {
    pub hidden_dim: usize,
    pub attn_dim: usize,
    pub input_dim: usize,
    pub w1: Vec<f64>,
    pub w2: Vec<f64>,
    pub v: Vec<f64>,
    pub enc_wx: Vec<f64>,
    pub enc_wh: Vec<f64>,
    pub enc_b: Vec<f64>,
}

A Pointer Network attention head with an optional Elman encoder.

Parameter layout (all row-major, f64):

w1[a * hidden_dim + h] — encoder projection (attn_dim × hidden_dim)
w2[a * hidden_dim + h] — decoder-query projection (attn_dim × hidden_dim)
v[a] — attention combination vector (attn_dim)
enc_wx[h * input_dim + d] — encoder input→hidden (hidden_dim × input_dim)
enc_wh[h * hidden_dim + h2] — encoder hidden→hidden (hidden_dim × hidden_dim)
enc_b[h] — encoder hidden bias (hidden_dim)
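The row-major convention above can be illustrated with a small sketch. The helper names `at` and `matvec` are hypothetical and are not part of this type's API; they only show how a flat `Vec<f64>` is indexed as a matrix.

```rust
/// Read element (row, col) from a row-major matrix stored in a flat slice.
/// `cols` is the number of columns, e.g. `hidden_dim` for `w1` and `w2`.
fn at(m: &[f64], cols: usize, row: usize, col: usize) -> f64 {
    m[row * cols + col]
}

/// Multiply a row-major (rows x cols) matrix by a vector of length `cols`.
fn matvec(m: &[f64], rows: usize, cols: usize, x: &[f64]) -> Vec<f64> {
    assert_eq!(m.len(), rows * cols);
    assert_eq!(x.len(), cols);
    (0..rows)
        .map(|r| (0..cols).map(|c| m[r * cols + c] * x[c]).sum::<f64>())
        .collect()
}
```

Under this layout, `at(&w1, hidden_dim, a, h)` would read the same entry as `w1[a * hidden_dim + h]`.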

Fields§

§hidden_dim: usize

Dimensionality of encoder/decoder hidden states.

§attn_dim: usize

Attention (alignment) inner dimension.

§input_dim: usize

Input feature dimensionality for the optional Elman encoder.

§w1: Vec<f64>

Encoder-state projection W1 (attn_dim × hidden_dim).

§w2: Vec<f64>

Decoder-query projection W2 (attn_dim × hidden_dim).

§v: Vec<f64>

Attention combination vector v (attn_dim).

§enc_wx: Vec<f64>

Elman encoder input→hidden weight (hidden_dim × input_dim).

§enc_wh: Vec<f64>

Elman encoder hidden→hidden weight (hidden_dim × hidden_dim).

§enc_b: Vec<f64>

Elman encoder hidden bias (hidden_dim).

Implementations§

impl PointerNetwork

pub fn zeros( hidden_dim: usize, attn_dim: usize, input_dim: usize, ) -> SeqResult<Self>

Construct a zero-initialised pointer network.

All dimensions must be positive; otherwise SeqError::InvalidConfiguration is returned.

pub fn new( hidden_dim: usize, attn_dim: usize, input_dim: usize, scale: f64, rng: &mut LcgRng, ) -> SeqResult<Self>

Construct a pointer network with small random weights from a seeded LCG.

All weight matrices and v are sampled from U(-scale, scale); biases start at zero. scale must be finite and positive.
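As a rough illustration of seeded uniform initialisation, the sketch below draws weights from U(-scale, scale) with a toy LCG. `DemoLcg` and `uniform_weights` are hypothetical stand-ins: the real `LcgRng` constants and methods are not shown on this page, so nothing here should be read as its actual behaviour.

```rust
/// Hypothetical stand-in for a seeded LCG; NOT the crate's `LcgRng`.
struct DemoLcg {
    state: u64,
}

impl DemoLcg {
    fn new(seed: u64) -> Self {
        DemoLcg { state: seed }
    }

    /// Advance the generator and return a value in [0, 1).
    /// The multiplier and increment are assumed, commonly used 64-bit LCG constants.
    fn next_unit(&mut self) -> f64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits so the result fits an f64 mantissa exactly.
        (self.state >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draw `len` weights from U(-scale, scale).
fn uniform_weights(len: usize, scale: f64, rng: &mut DemoLcg) -> Vec<f64> {
    (0..len)
        .map(|_| (2.0 * rng.next_unit() - 1.0) * scale)
        .collect()
}
```

Because the generator is deterministic, two instances built from the same seed produce identical weight buffers, which is the property that makes a seeded constructor reproducible.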

pub fn encode(&self, inputs: &[f64]) -> SeqResult<Vec<f64>>

Encode an input feature sequence with a minimal Elman (tanh) RNN.

inputs is n × input_dim row-major; the returned buffer is n × hidden_dim row-major encoder states e_1 … e_n. Provided as a convenience; callers may instead pass their own embeddings to the attention methods directly.
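A minimal sketch of such an encoder follows, assuming the standard Elman recurrence h_t = tanh(Wx x_t + Wh h_{t-1} + b) and a zero initial hidden state. Both are assumptions, since the page does not state them, and `elman_encode_sketch` is a hypothetical free function rather than the method itself.

```rust
/// Sketch of an Elman (tanh) RNN over a row-major `n x input_dim` buffer.
/// Assumes a zero initial hidden state; returns `n x hidden_dim` row-major states.
fn elman_encode_sketch(
    inputs: &[f64],
    input_dim: usize,
    hidden_dim: usize,
    wx: &[f64], // hidden_dim x input_dim, row-major
    wh: &[f64], // hidden_dim x hidden_dim, row-major
    b: &[f64],  // hidden_dim
) -> Vec<f64> {
    let n = inputs.len() / input_dim;
    let mut states = vec![0.0_f64; n * hidden_dim];
    let mut prev = vec![0.0_f64; hidden_dim];
    for t in 0..n {
        let x = &inputs[t * input_dim..(t + 1) * input_dim];
        let mut cur = vec![0.0_f64; hidden_dim];
        for h in 0..hidden_dim {
            let mut acc: f64 = b[h];
            for d in 0..input_dim {
                acc += wx[h * input_dim + d] * x[d];
            }
            for h2 in 0..hidden_dim {
                acc += wh[h * hidden_dim + h2] * prev[h2];
            }
            cur[h] = acc.tanh();
        }
        // Store e_t as row t of the output, then carry it to the next step.
        states[t * hidden_dim..(t + 1) * hidden_dim].copy_from_slice(&cur);
        prev = cur;
    }
    states
}
```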

pub fn attention_logits( &self, encoder_states: &[f64], query: &[f64], ) -> SeqResult<Vec<f64>>

Raw attention logits u^i_j = vᵀ tanh(W1 e_j + W2 d_i) for one query.

encoder_states is n × hidden_dim row-major (rows e_1 … e_n); query is the decoder state d_i, of length hidden_dim.
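The logit formula can be sketched directly from the parameter layout. `attention_logits_sketch` is a hypothetical free function that takes the weights explicitly; it omits the shape validation and error handling the real method presumably performs.

```rust
/// Sketch of u_j = v^T tanh(W1 e_j + W2 d) for one query `d`.
fn attention_logits_sketch(
    encoder_states: &[f64], // n x hidden_dim, row-major
    query: &[f64],          // hidden_dim
    w1: &[f64],             // attn_dim x hidden_dim
    w2: &[f64],             // attn_dim x hidden_dim
    v: &[f64],              // attn_dim
    hidden_dim: usize,
    attn_dim: usize,
) -> Vec<f64> {
    let n = encoder_states.len() / hidden_dim;
    // W2 d does not depend on j, so it is computed once and reused.
    let q_proj: Vec<f64> = (0..attn_dim)
        .map(|a| {
            (0..hidden_dim)
                .map(|h| w2[a * hidden_dim + h] * query[h])
                .sum::<f64>()
        })
        .collect();
    (0..n)
        .map(|j| {
            let e = &encoder_states[j * hidden_dim..(j + 1) * hidden_dim];
            (0..attn_dim)
                .map(|a| {
                    let e_proj: f64 = (0..hidden_dim)
                        .map(|h| w1[a * hidden_dim + h] * e[h])
                        .sum();
                    v[a] * (e_proj + q_proj[a]).tanh()
                })
                .sum::<f64>()
        })
        .collect()
}
```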

pub fn pointer_distribution( &self, encoder_states: &[f64], query: &[f64], ) -> SeqResult<Vec<f64>>

Pointer distribution softmax(u^i) over the n input positions for one decoder query.
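A softmax over the logits might look like the sketch below. Subtracting the maximum logit before exponentiating is a common stabilisation trick and is an assumption here; the page does not say whether the real method does this.

```rust
/// Numerically stabilised softmax over a slice of logits.
fn softmax_sketch(logits: &[f64]) -> Vec<f64> {
    // Shifting by the maximum leaves the result unchanged but avoids overflow in exp.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&u| (u - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}
```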

pub fn forward( &self, encoder_states: &[f64], queries: &[f64], ) -> SeqResult<Vec<f64>>

Forward pass over a sequence of decoder queries.

queries is m × hidden_dim row-major. Returns an m × n row-major matrix of pointer distributions (row i is the distribution for query i).

pub fn decode( &self, encoder_states: &[f64], queries: &[f64], ) -> SeqResult<Vec<usize>>

Greedy decode: emit argmax_j p^i_j for each decoder query.

Returns a sequence of m pointed input indices, each in 0..n.
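Greedy decoding over an m × n row-major matrix of distributions reduces to a per-row argmax, as in this sketch. `greedy_decode_sketch` is hypothetical, assumes n > 0, and breaks ties in favour of the lowest index; the real tie-breaking rule is not documented here.

```rust
/// Per-row argmax over an `m x n` row-major matrix of pointer distributions.
/// Assumes `n > 0`; ties resolve to the lowest index.
fn greedy_decode_sketch(probs: &[f64], n: usize) -> Vec<usize> {
    probs
        .chunks(n)
        .map(|row| {
            let mut best: usize = 0;
            for (j, &p) in row.iter().enumerate() {
                if p > row[best] {
                    best = j;
                }
            }
            best
        })
        .collect()
}
```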

pub fn nll( &self, encoder_states: &[f64], queries: &[f64], targets: &[usize], ) -> SeqResult<f64>

Teacher-forced negative log-likelihood of a target index sequence.

targets[i] is the gold input position pointed to at decoder step i; NLL = − Σ_i log p^i_{targets[i]}. Targets must be in 0..n.
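Given an m × n row-major matrix of pointer distributions, the NLL sum can be sketched as below. `nll_sketch` is a hypothetical helper that starts from probabilities rather than encoder states and queries, and it panics on out-of-range targets instead of returning an error.

```rust
/// NLL = -sum_i ln p[i][targets[i]] over an `m x n` row-major matrix `probs`.
fn nll_sketch(probs: &[f64], n: usize, targets: &[usize]) -> f64 {
    probs
        .chunks(n)
        .zip(targets)
        .map(|(row, &t)| -row[t].ln())
        .sum()
}
```

A perfectly confident correct prediction contributes 0 to the sum, while a probability near 0 on the gold position contributes a large positive term.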

pub fn backward( &self, encoder_states: &[f64], queries: &[f64], targets: &[usize], ) -> SeqResult<(f64, PointerGrad)>

Gradient of the teacher-forced NLL w.r.t. the attention parameters (w1, w2, v).

Uses the softmax-cross-entropy identity ∂NLL/∂u^i_j = p^i_j − 1[j = targets[i]] and back-propagates through the activations s_{j,a} = tanh(W1 e_j + W2 d_i)_a. Returns the NLL together with a PointerGrad.
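The chain rule described above can be sketched for a single query. With δ_j = p_j − 1[j = target], the gradients are ∂NLL/∂v_a = Σ_j δ_j s_{j,a}, ∂NLL/∂W1[a,h] = Σ_j δ_j v_a (1 − s_{j,a}²) e_j[h], and ∂NLL/∂W2[a,h] = Σ_j δ_j v_a (1 − s_{j,a}²) d[h]. The function below is hypothetical: it returns a plain tuple because the fields of `PointerGrad` are not shown on this page, and it handles one query rather than a sequence.

```rust
/// Single-query sketch: returns (nll, grad_w1, grad_w2, grad_v).
/// `hd` = hidden_dim, `ad` = attn_dim; all matrices are row-major.
fn nll_and_grads_sketch(
    e: &[f64], d: &[f64], target: usize,
    w1: &[f64], w2: &[f64], v: &[f64],
    hd: usize, ad: usize,
) -> (f64, Vec<f64>, Vec<f64>, Vec<f64>) {
    let n = e.len() / hd;
    // Forward pass: cache activations s[j][a] and logits u[j].
    let mut u = vec![0.0_f64; n];
    let mut s = vec![0.0_f64; n * ad];
    for j in 0..n {
        for a in 0..ad {
            let mut pre: f64 = 0.0;
            for h in 0..hd {
                pre += w1[a * hd + h] * e[j * hd + h] + w2[a * hd + h] * d[h];
            }
            s[j * ad + a] = pre.tanh();
            u[j] += v[a] * s[j * ad + a];
        }
    }
    // Stabilised log-sum-exp for the softmax normaliser.
    let max = u.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let z: f64 = u.iter().map(|&x| (x - max).exp()).sum();
    let nll = -(u[target] - max - z.ln());

    let mut g1 = vec![0.0_f64; ad * hd];
    let mut g2 = vec![0.0_f64; ad * hd];
    let mut gv = vec![0.0_f64; ad];
    for j in 0..n {
        let p = (u[j] - max).exp() / z;
        // Softmax cross-entropy identity: dNLL/du_j = p_j - 1[j == target].
        let delta = p - if j == target { 1.0 } else { 0.0 };
        for a in 0..ad {
            let sja = s[j * ad + a];
            gv[a] += delta * sja;
            // tanh'(x) = 1 - tanh(x)^2
            let back = delta * v[a] * (1.0 - sja * sja);
            for h in 0..hd {
                g1[a * hd + h] += back * e[j * hd + h];
                g2[a * hd + h] += back * d[h];
            }
        }
    }
    (nll, g1, g2, gv)
}
```

A finite-difference comparison, perturbing one weight by a small epsilon and comparing the change in NLL with the analytic gradient entry, is a simple way to sanity-check a sketch like this.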
