Seqkmer
Seqkmer is a Rust library for high-throughput sequence IO and k-mer-based analyses. It provides fast readers for FASTA/FASTQ (including gzipped streams), minimizer scanning over k-mers, and utilities for parallelising bulk sequence processing.
Highlights
- Universal FASTX readers: Seamlessly handle FASTA, FASTQ, interleaved paired-end, and dual-file paired-end datasets through a unified API. Automatic format detection and transparent gzip support are included.
- Quality-aware FASTQ parsing: Optional quality-score thresholds to soft-mask low-quality bases while preserving original sequence layout.
- Buffered & streaming modes: Choose between streaming (`FastaReader`, `FastqReader`) or buffered variants (`BufferFastaReader`) depending on your throughput/memory trade-offs.
- Minimizer-based k-mer scanning: The `mmscanner` module exposes `scan_sequence` and `MinimizerIterator` for fast k-mer/minimizer enumeration with configurable windows.
- Parallel orchestration: Utilities in `parallel` coordinate multi-threaded reading and processing pipelines using scoped thread pools.
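To make the quality-aware soft-masking idea above concrete, here is a minimal standalone sketch using only the standard library. It is not Seqkmer's implementation: the function name `soft_mask`, the Phred+33 encoding, and the choice to mask by lowercasing are all assumptions made for illustration.

```rust
/// Hypothetical helper (not part of Seqkmer): soft-mask bases whose Phred
/// quality falls below `min_phred` by lowercasing them.
/// Assumes Phred+33 quality encoding. The sequence length is preserved.
fn soft_mask(seq: &[u8], qual: &[u8], min_phred: u8) -> Vec<u8> {
    assert_eq!(seq.len(), qual.len(), "sequence and quality lengths differ");
    seq.iter()
        .zip(qual)
        .map(|(&base, &q)| {
            // Convert the ASCII quality character to a Phred score.
            let phred = q.saturating_sub(33);
            if phred < min_phred {
                base.to_ascii_lowercase()
            } else {
                base
            }
        })
        .collect()
}

fn main() {
    // 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    let masked = soft_mask(b"ACGTAC", b"II#I#I", 20);
    println!("{}", String::from_utf8_lossy(&masked));
}
```

Because masked bases are only lowercased, positions and lengths stay identical to the input, which is what "preserving original sequence layout" requires of any masking scheme.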
Getting Started
Add Seqkmer to your project:
Reading FASTA or FASTQ
use ;
use Path;
For paired-end data, provide a pair of paths. Interleaved FASTQ is detected automatically; separate R1/R2 files are also supported:
let paths = Pair;
let mut reader = from_paths?;
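For readers curious how automatic format detection and gzip detection can work in principle, the following standalone sketch shows one common approach. It does not reflect Seqkmer's internals: `SeqFormat`, `is_gzip`, and `sniff_format` are hypothetical names. The sketch relies only on two well-known facts: gzip streams begin with the bytes `0x1f 0x8b`, and FASTA and FASTQ records begin with `>` and `@` respectively.

```rust
/// Hypothetical format tag used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum SeqFormat {
    Fasta,
    Fastq,
    Unknown,
}

/// gzip streams start with the magic bytes 0x1f 0x8b.
fn is_gzip(head: &[u8]) -> bool {
    head.starts_with(&[0x1f, 0x8b])
}

/// Guess the format from the first non-whitespace byte of the
/// (already decompressed) data.
fn sniff_format(head: &[u8]) -> SeqFormat {
    match head.iter().copied().find(|b| !b.is_ascii_whitespace()) {
        Some(b'>') => SeqFormat::Fasta,
        Some(b'@') => SeqFormat::Fastq,
        _ => SeqFormat::Unknown,
    }
}

fn main() {
    let head = b"@read1\nACGT\n+\nIIII\n";
    println!("gzip: {}, format: {:?}", is_gzip(head), sniff_format(head));
}
```

In practice the first few bytes would be peeked from a buffered reader so that no input is consumed before the real parser starts.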
K-mer Minimizer Scanning
use ;
use Reader;
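As background on what minimizer scanning computes, here is a deliberately naive standalone sketch. It is not the `mmscanner` API or algorithm: the function `minimizers` is hypothetical, it orders k-mers lexicographically (real implementations often use a hash or 2-bit encoding), and it rescans every window rather than using a faster sliding-window scheme.

```rust
/// Hypothetical sketch (not Seqkmer's `scan_sequence`): for every window of
/// `w` consecutive k-mers, select the lexicographically smallest k-mer
/// (leftmost on ties) and report each selected position once.
fn minimizers(seq: &[u8], k: usize, w: usize) -> Vec<(usize, &[u8])> {
    let mut out: Vec<(usize, &[u8])> = Vec::new();
    if k == 0 || w == 0 || seq.len() < k + w - 1 {
        return out; // not enough sequence for a single full window
    }
    let n_kmers = seq.len() - k + 1;
    for win_start in 0..=(n_kmers - w) {
        // Find the smallest k-mer in this window; strict `<` keeps the
        // leftmost position when two k-mers are equal.
        let mut best = win_start;
        for i in (win_start + 1)..(win_start + w) {
            if seq[i..i + k] < seq[best..best + k] {
                best = i;
            }
        }
        // Consecutive windows often share a minimizer; emit it only once.
        if out.last().map(|&(pos, _)| pos) != Some(best) {
            out.push((best, &seq[best..best + k]));
        }
    }
    out
}

fn main() {
    for (pos, kmer) in minimizers(b"TTACGTT", 2, 3) {
        println!("{}\t{}", pos, String::from_utf8_lossy(kmer));
    }
}
```

The sketch costs O(n·w) comparisons; it is meant to show the definition of a window minimizer, not to be fast.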
Parallel Pipelines
Use `read_parallel` when you need to map a function across batches using multiple threads:
use ;
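The general pattern of mapping a function across batches with scoped threads can be sketched with the standard library alone. This is not the `read_parallel` signature, which is not shown here: `map_batches` is a hypothetical helper, and for brevity it spawns one thread per batch instead of reusing a bounded pool.

```rust
use std::thread;

/// Hypothetical helper (not Seqkmer's API): apply `f` to fixed-size batches
/// of `records` on scoped threads and return the results in input order.
fn map_batches<T, R, F>(records: &[T], batch_size: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&[T]) -> R + Sync,
{
    assert!(batch_size > 0, "batch_size must be positive");
    let f = &f; // share the closure by reference across workers
    thread::scope(|scope| {
        // Scoped threads may borrow `records` and `f` because the scope
        // guarantees every worker is joined before it returns.
        let handles: Vec<_> = records
            .chunks(batch_size)
            .map(|batch| scope.spawn(move || f(batch)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let reads = vec!["ACGT", "GGCC", "ATAT", "CGCG", "TTAA"];
    // Count G/C bases per batch of two reads.
    let gc_counts = map_batches(&reads, 2, |batch| {
        batch
            .iter()
            .map(|r| r.bytes().filter(|&b| b == b'G' || b == b'C').count())
            .sum::<usize>()
    });
    println!("{:?}", gc_counts);
}
```

Joining the handles in spawn order keeps results aligned with the input batches even though the workers may finish in any order.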
Feature Overview
| Module | Purpose |
|---|---|
| `fasta` | FASTA readers (streaming + buffered) |
| `fastq` | FASTQ reader with automatic interleaved detection and quality masking |
| `fastx` | Format-agnostic wrapper over FASTA/FASTQ readers |
| `reader` | Misc IO utilities (gzip detection, trim helpers, file format detection) |
| `parallel` | Threaded reader orchestration using scoped thread pools |
| `mmscanner` | Minimizer scanning over DNA sequences |
| `feat` | K-mer feature helper types (Meros, constants) |
| `utils::OptionPair` | Helper enum for representing single vs paired resources |
Testing
All functionality is covered by unit and doc tests. Run the full suite with `cargo test`.
License
Seqkmer is distributed under the terms of the MIT License.