Module wfa

Expand description

WFA — the Wavefront Alignment algorithm for gap-affine global alignment.

This is a faithful implementation of the algorithm of Marco-Sola, Moure, Moreto & Espinosa, “Fast gap-affine pairwise alignment using the wavefront algorithm”, Bioinformatics 37(4):456–463, 2021.

§Model

WFA minimizes an alignment penalty under the gap-affine model

match     → 0
mismatch  → x          (x > 0)
gap run   → o + k·e    (gap-open o paid once per run, plus e per gap symbol
                         INCLUDING the first; o ≥ 0, e > 0)

Instead of filling an (m+1)·(n+1) dynamic-programming matrix, WFA tracks, for each increasing penalty s, the furthest-reaching point on every diagonal k = i − j (the “wavefront”). On similar sequences only a narrow band of diagonals is ever touched, which is what makes the algorithm fast.

Three wavefront components are maintained per penalty s:

M — the match / substitution path (the alignment proper),
I — the insertion path (a gap in a, consuming a character of b),
D — the deletion path (a gap in b, consuming a character of a).

§Convention

We align a along the rows (index i, length m) and b along the columns (index j, length n). A diagonal is k = i − j. The offset stored on a diagonal is i, the number of characters of a consumed, so the matching column is j = i − k. The optimum is reached when the M wavefront on the final diagonal k_final = m − n attains offset m (equivalently i = m, j = n).

WfaOp::Ins is a gap in a: it consumes one character of b only.
WfaOp::Del is a gap in b: it consumes one character of a only.

§Cross-check with Gotoh

crate::alignment::gotoh::gotoh_align solves the same problem but maximizes a score. Given a GotohScoring (M, mis, go, ge) we derive the (×2-scaled, integral) WFA penalties

x = 2·(M − mis)        // mismatch penalty
o = 2·(ge − go)        // gap-open penalty
e = M − 2·ge           // gap-extend penalty

Run WFA to obtain the minimum penalty P; the equivalent Gotoh maximum score is then

gotoh_score = ((m + n)·M − P) / 2

which is exact (the division is always even). WfaAlignment reports both the raw penalty and the converted score.

Structs§

WfaAlignment: Result of a WFA gap-affine global alignment.

Enums§

WfaOp: A single edit operation of a WFA alignment, in left-to-right order.

Functions§

wfa_align: Run the WFA gap-affine global alignment of a against b.

Module wfa

Module wfa Copy item path

§Model

§Convention

§Cross-check with Gotoh

Structs§

Enums§

Functions§

Module wfa