Skip to main content

Module mmseqs2

Module mmseqs2 

Source
Expand description

MMseqs2 aligner (protein-protein).

Mirrors gapseq’s explicit 4-command pipeline from src/gapseq_find.sh:603–615:

mmseqs createdb TARGET targetDB
mmseqs createdb QUERY queryDB
mmseqs search queryDB targetDB resultDB tmp --threads N -c 0.C
mmseqs convertalis queryDB targetDB resultDB out.tsv
    --format-output "qheader,pident,evalue,bits,qcov,theader,tstart,tend"

Rationale: mmseqs easy-search would collapse the four steps into one, but it defaults to align --alignment-mode 3 (full-alignment identity) whereas search reports the k-mer prefilter identity. Identity cutoffs downstream (analyse_alignments.R) were calibrated against the latter — so we match that semantics exactly. Native coverage is a 0–1 fraction; we rescale to 0–100 via crate::tsv::parse_tsv.

Structs§

Mmseqs2Aligner