§jellyfish-reader
A pure Rust library for reading Jellyfish k-mer counting output files.
Jellyfish is a fast, memory-efficient tool for counting k-mers in DNA sequences, widely used in bioinformatics. This crate provides native Rust readers for Jellyfish’s binary and text output formats, with no C/C++ dependencies.
§Features
- Sequential reading of binary/sorted and text/sorted Jellyfish files
- Random-access queries via memory-mapped I/O with binary search
- K-mer representation (MerDna) with canonical form, reverse complement, and all standard operations
- String k-mer extraction matching Jellyfish’s StringMers interface
- Auto-format detection from file headers
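To make the k-mer terms in the list above concrete, here is a minimal, std-only sketch of reverse complement, canonical form, and overlapping k-mer extraction on plain strings. It does not use this crate's types: the function names `complement`, `reverse_complement`, `canonical`, and `kmers` are hypothetical helpers written for illustration, and the sketch assumes uppercase or lowercase ASCII `A`/`C`/`G`/`T` input. How MerDna and StringMers implement these operations internally is not shown here.

```rust
/// Complement of a single nucleotide; `None` for anything outside A/C/G/T.
/// (Hypothetical helper, not part of jellyfish-reader.)
fn complement(base: char) -> Option<char> {
    match base.to_ascii_uppercase() {
        'A' => Some('T'),
        'C' => Some('G'),
        'G' => Some('C'),
        'T' => Some('A'),
        _ => None,
    }
}

/// Reverse complement: walk the sequence backwards and complement each base.
/// Returns `None` if any character is not a valid nucleotide.
fn reverse_complement(seq: &str) -> Option<String> {
    seq.chars().rev().map(complement).collect()
}

/// Canonical form: the lexicographically smaller of a k-mer and its
/// reverse complement, so both strands map to the same key.
fn canonical(seq: &str) -> Option<String> {
    let fwd = seq.to_ascii_uppercase();
    let rc = reverse_complement(&fwd)?;
    Some(if rc < fwd { rc } else { fwd })
}

/// All overlapping k-mers of `seq`, left to right.
/// Returns an empty vector when `k` is zero or longer than the sequence.
fn kmers(seq: &str, k: usize) -> Vec<&str> {
    if k == 0 {
        return Vec::new();
    }
    seq.as_bytes()
        .windows(k)
        // Windows over ASCII input are always valid UTF-8; others are skipped.
        .filter_map(|w| std::str::from_utf8(w).ok())
        .collect()
}
```

For example, `canonical("GTTT")` yields `Some("AAAC".to_string())`, because AAAC is the reverse complement of GTTT and sorts first, while `kmers("ACGTACGT", 4)` yields the five windows ACGT, CGTA, GTAC, TACG, ACGT.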
§Quick Start
use jellyfish_reader::{ReadMerFile, MerDna, QueryMerFile};
// Sequential reading
let reader = ReadMerFile::open("output.jf").unwrap();
for result in reader {
    let (mer, count) = result.unwrap();
    println!("{}: {}", mer, count);
}
// Random access
let qf = QueryMerFile::open("output.jf").unwrap();
let mer: MerDna = "ACGTACGTACGTACGTACGTACGTA".parse().unwrap();
if let Some(count) = qf.get(&mer) {
    println!("Count: {}", count);
}
§K-mer Operations
use jellyfish_reader::MerDna;
let mer: MerDna = "ACGT".parse().unwrap();
// Reverse complement
let rc = mer.get_reverse_complement();
assert_eq!(rc.to_string(), "ACGT"); // ACGT is a palindrome
// Canonical form (lexicographically smaller of self and RC)
let canonical = mer.get_canonical();
// Extract k-mers from a sequence
use jellyfish_reader::StringMers;
let kmers: Vec<_> = StringMers::new("ACGTACGT", 4)
    .map(|m| m.to_string())
    .collect();
assert_eq!(kmers, vec!["ACGT", "CGTA", "GTAC", "TACG", "ACGT"]);
Re-exports§
pub use binary::BinaryReader;
pub use error::Error;
pub use error::Result;
pub use header::FileHeader;
pub use matrix::RectangularBinaryMatrix;
pub use mer::MerDna;
pub use query::QueryMerFile;
pub use string_mers::StringMers;
pub use string_mers::string_canonicals;
pub use string_mers::string_mers;
pub use text::TextReader;
Modules§
Enums§
- ReadMerFile - Unified sequential reader for Jellyfish output files.