
Crate gapsmith_find


Pathway / reaction finder.

Mirrors the R reference implementation spread across src/prepare_batch_alignments.R, src/analyse_alignments.R, and src/gapseq_find.sh. The pipeline:

  1. pathways::select picks a subset of gapsmith_db::PathwayRow to evaluate, based on a user keyword or pattern.
  2. seqfile::resolve_for_reaction walks the dat/seq/ tree to find the reference FASTA(s) for each reaction.
  3. runner::run_find builds one concatenated query.faa, runs the alignment (via gapsmith_align::Aligner) against the input genome, classifies every hit, and aggregates per-pathway completeness.
  4. output emits *-Reactions.tbl and *-Pathways.tbl in gapseq’s column order.
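The aggregation in step 3 can be pictured with a minimal sketch. It assumes completeness is simply the share of a pathway's distinct reactions that have at least one accepted hit; the real rule in runner::run_find and the R reference may weight reactions differently. `ToyHit` and `completeness_by_pathway` are hypothetical names, not items of this crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a classified hit; NOT the crate's `ReactionHit`.
#[derive(Debug, Clone)]
struct ToyHit {
    pathway: String,
    reaction: String,
    found: bool,
}

/// Percentage of distinct reactions per pathway with at least one `found` hit.
/// Assumption: plain found/total ratio, which may not match gapseq's rule.
fn completeness_by_pathway(hits: &[ToyHit]) -> BTreeMap<String, f64> {
    // pathway -> (reaction -> was any hit for this reaction accepted?)
    let mut seen: BTreeMap<String, BTreeMap<String, bool>> = BTreeMap::new();
    for hit in hits {
        let found = seen
            .entry(hit.pathway.clone())
            .or_default()
            .entry(hit.reaction.clone())
            .or_insert(false);
        *found |= hit.found;
    }
    seen.into_iter()
        .map(|(pathway, reactions)| {
            // Every pathway in the map has at least one reaction, so total > 0.
            let total = reactions.len() as f64;
            let hit_count = reactions.values().filter(|f| **f).count() as f64;
            (pathway, 100.0 * hit_count / total)
        })
        .collect()
}
```

A `BTreeMap` keeps pathways in a stable order, which matters when the output has to be reproducible from run to run.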

Complex / subunit detection (src/complex_detection.R) lives in complex. It matches the R implementation case by case on 9 handcrafted inputs (Greek / Latin numerals, size-dict synonyms, coverage edge cases). See crates/gapsmith-find/tests/complex_parity.rs.
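To illustrate what numeral handling in subunit labels can involve, here is a small sketch that maps Greek letter names and Roman numerals onto one index. The function name `subunit_index` and the specific mapping table are assumptions for illustration; the actual rules live in complex and src/complex_detection.R and are not reproduced here.

```rust
/// Hypothetical helper: normalize a subunit label token to a 1-based index.
/// Covers only the first four subunits; anything else yields `None`.
fn subunit_index(token: &str) -> Option<u32> {
    // Lowercasing also folds Greek capitals (e.g. "Β" becomes "β").
    match token.trim().to_lowercase().as_str() {
        "alpha" | "α" | "i" | "1" => Some(1),
        "beta" | "β" | "ii" | "2" => Some(2),
        "gamma" | "γ" | "iii" | "3" => Some(3),
        "delta" | "δ" | "iv" | "4" => Some(4),
        _ => None,
    }
}
```

With a table like this, "subunit Beta", "subunit II", and "subunit β" would all resolve to the same index once the token is extracted.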

End-to-end parity

runner::run_find produces byte-identical *-Pathways.tbl output against real gapseq on two test cases (-p PWY-6587 and -p amino on toy/ecore.faa). See crates/gapsmith-find/tests/pipeline_parity.rs.
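A byte-identical check is stricter than a line-by-line or whitespace-insensitive diff. The sketch below shows one way to report where two outputs diverge; `first_mismatch` is a hypothetical helper and is not taken from pipeline_parity.rs.

```rust
/// Hypothetical helper: offset of the first differing byte, or `None` when
/// the two buffers are byte-identical (same length and same content).
fn first_mismatch(actual: &[u8], expected: &[u8]) -> Option<usize> {
    let common = actual.len().min(expected.len());
    actual
        .iter()
        .zip(expected)
        .position(|(a, b)| a != b)
        // Equal on the shared prefix: differ only if one buffer is longer.
        .or_else(|| (actual.len() != expected.len()).then(|| common))
}
```

In a test, the two buffers would typically come from `std::fs::read` on the generated and the reference *-Pathways.tbl, followed by an assertion that the result is `None`.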

Re-exports

pub use classify::classify_hits;
pub use classify::ClassifyOptions;
pub use output::write_pathways_tbl;
pub use output::write_reactions_tbl;
pub use pathways::select;
pub use pathways::ExpandedReaction;
pub use pathways::PathwaySelectOptions;
pub use runner::run_find;
pub use runner::FindError;
pub use runner::FindOptions;
pub use runner::FindReport;
pub use seqfile::resolve_for_reaction;
pub use seqfile::ResolvedSeq;
pub use seqfile::SeqfileOptions;
pub use types::HitStatus;
pub use types::PathwayResult;
pub use types::PwyStatus;
pub use types::ReactionHit;

Modules

classify
Hit classification — port of src/analyse_alignments.R:108–143.
complex
Port of src/complex_detection.R.
dbhit
Reaction → SEED reaction ID lookup (“dbhit”).
output
Emit *-Reactions.tbl and *-Pathways.tbl in gapseq’s column order.
pathways
Pathway selection and per-reaction expansion.
runner
End-to-end driver for gapsmith find.
seqfile
Reference-FASTA resolver.
taxonomy
Minimal loader for dat/taxonomy.tbl — used to filter pathways by their taxrange column.
types
Result types produced by the find pipeline.