Expand description
HMMER - Biological sequence analysis using profile hidden Markov models.
This is a Rust port of HMMER 3.4, a C tool for searching sequence databases for homologous sequences using profile HMMs.
§Quick Start
use hmmer_pure_rs::{Alphabet, Bg, Hmm, Profile, Pipeline, TopHits, OProfile};
use hmmer_pure_rs::hmmfile;
use hmmer_pure_rs::profile::{profile_config, reconfig_length, P7_LOCAL};
use hmmer_pure_rs::sequence::Sequence;
use std::path::Path;
// Load an HMM
let hmms = hmmfile::read_hmm_file(Path::new("query.hmm")).unwrap();
let hmm = &hmms[0];
// Set up alphabet, background model, and scoring profile
let abc = Alphabet::new(hmm.abc_type);
let bg = Bg::new(&abc);
let mut gm = Profile::new(hmm.m, &abc);
profile_config(hmm, &bg, &mut gm, 400, P7_LOCAL);
let mut om = OProfile::convert(&gm);
// Create pipeline and hits collector
let mut pli = Pipeline::new();
pli.new_model(&gm);
let mut th = TopHits::new();
// Search a sequence programmatically
let dsq = abc.digitize(b"ACDEFGHIKLMNPQRSTVWY");
let sq = Sequence { name: "query".into(), acc: String::new(), desc: String::new(), dsq, n: 20, l: 20 };
pli.run(&mut gm, &mut om, &bg, hmm, &sq, &mut th);
// Access results
th.sort_by_sortkey();
for hit in &th.hits {
println!("{}: score={:.1} bits", hit.name, hit.score);
}Re-exports§
pub use alphabet::Alphabet;pub use bg::Bg;pub use hmm::Hmm;pub use pipeline::Pipeline;pub use profile::Profile;pub use sequence::Sequence;pub use simd::oprofile::OProfile;pub use tophits::TopHits;
Modules§
- alphabet
- Biological sequence alphabets (DNA, RNA, amino acid). Direct port of Easel’s esl_alphabet.
- bg
- P7_BG - Null/background model for statistical significance. Direct port of p7_bg.c.
- builder
- HMM builder — construct profile HMMs from multiple sequence alignments. Simplified port of p7_builder.c and build.c.
- calibrate
- E-value calibration by simulation. Port of evalues.c p7_Calibrate().
- domaindef
- Domain definition using posterior probabilities. Port of p7_domaindef.c p7_domaindef_ByPosteriorHeuristics().
- dp
- dsqdata
- Binary sequence database format for fast reading. Simplified port of esl_dsqdata — stores digital sequences in a compact binary format.
- errors
- HMMER error types, mirroring Easel’s eslOK/eslFAIL/eslERANGE system.
- eweight
- Effective sequence number estimation. Port of eweight.c — entropy-based Neff computation.
- fm_
index - FM-index for fast pattern searching in DNA sequences. Used by nhmmer for long-target scanning.
- hmm
- P7_HMM - Core profile HMM model. Direct port of HMMER’s P7_HMM data structure.
- hmmfile
- HMM file I/O: reading and writing HMMER3 format HMM files. Direct port of p7_hmmfile.c.
- hmmfile_
binary - Binary HMM file I/O — reading C HMMER’s .h3m format. Enables interoperability with C hmmpress output.
- logsum
- Fast table-driven log-sum approximation for the Forward algorithm.
- msa
- Multiple Sequence Alignment (MSA) I/O — Stockholm format.
- output
- Output formatting utilities to match C printf behavior.
- pipeline
- Search pipeline using SIMD MSV + Viterbi + Forward filters.
- pressed
- Read/write C HMMER pressed database format (.h3f, .h3p, .h3i). Enables reading databases created by C hmmpress.
- prior
- Dirichlet mixture priors for HMM parameterization. Simplified port of p7_prior.c.
- profile
- P7_PROFILE - Scoring profile for sequence comparison. Direct port of p7_profile.c and modelconfig.c.
- seqmodel
- Build an HMM from a single sequence using a substitution matrix. Simplified port of seqmodel.c p7_Seqmodel().
- sequence
- Biological sequence type and FASTA I/O. Simplified port of esl_sq and esl_sqio_ascii for FASTA format.
- simd
- spensemble
- Segment pair ensemble clustering for multidomain regions. Simplified port of p7_spensemble.c.
- ssi
- Simple sequence/HMM index for fast random access by name. Simplified alternative to Easel’s esl_ssi.c.
- stats
- tophits
- P7_TOPHITS - Ranked hit list for search results. Simplified port focused on what hmmsearch needs.
- trace
- P7_TRACE — Viterbi traceback path through the HMM. Port of p7_trace.c and generic_vtrace.c.
- util