Crate jellyfish_reader

§jellyfish-reader

A pure Rust library for reading Jellyfish k-mer counting output files.

Jellyfish is a fast, memory-efficient tool for counting k-mers in DNA sequences, widely used in bioinformatics. This crate provides native Rust readers for Jellyfish’s binary and text output formats, with no C/C++ dependencies.

§Features

  • Sequential reading of binary/sorted and text/sorted Jellyfish files
  • Random-access queries via memory-mapped I/O with binary search
  • K-mer representation (MerDna) with canonical form, reverse complement, and other standard k-mer operations
  • String k-mer extraction matching Jellyfish’s StringMers interface
  • Automatic format detection from file headers
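
To illustrate the idea behind random-access queries on a sorted file, here is a minimal standalone sketch of binary search over fixed-width records in a byte slice. It is not this crate's implementation. The record layout (an 8-byte big-endian key followed by a 4-byte big-endian count) and the names `encode`, `key_at`, `value_at`, and `lookup` are hypothetical and chosen for illustration only; the real Jellyfish layout differs. In practice the byte slice would come from a memory-mapped file rather than a `Vec<u8>`.

```rust
use std::cmp::Ordering;

// Hypothetical record layout, for illustration only (NOT Jellyfish's format):
// an 8-byte big-endian key followed by a 4-byte big-endian count.
const KEY_BYTES: usize = 8;
const VAL_BYTES: usize = 4;
const RECORD_BYTES: usize = KEY_BYTES + VAL_BYTES;

/// Serialize (key, count) pairs into the hypothetical layout.
/// The caller must pass pairs sorted by ascending key.
fn encode(records: &[(u64, u32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(records.len() * RECORD_BYTES);
    for &(key, count) in records {
        out.extend_from_slice(&key.to_be_bytes());
        out.extend_from_slice(&count.to_be_bytes());
    }
    out
}

/// Read the key of record `i`.
fn key_at(data: &[u8], i: usize) -> u64 {
    let start = i * RECORD_BYTES;
    let mut buf = [0u8; KEY_BYTES];
    buf.copy_from_slice(&data[start..start + KEY_BYTES]);
    u64::from_be_bytes(buf)
}

/// Read the count of record `i`.
fn value_at(data: &[u8], i: usize) -> u32 {
    let start = i * RECORD_BYTES + KEY_BYTES;
    let mut buf = [0u8; VAL_BYTES];
    buf.copy_from_slice(&data[start..start + VAL_BYTES]);
    u32::from_be_bytes(buf)
}

/// Binary search for `key` among records sorted by ascending key.
/// Any trailing partial record is ignored.
fn lookup(data: &[u8], key: u64) -> Option<u32> {
    let (mut lo, mut hi) = (0usize, data.len() / RECORD_BYTES);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        match key_at(data, mid).cmp(&key) {
            Ordering::Equal => return Some(value_at(data, mid)),
            Ordering::Less => lo = mid + 1,
            Ordering::Greater => hi = mid,
        }
    }
    None
}
```

Because every record has the same width, the position of record `i` is a simple multiplication, so each probe touches only one small region of the file. That is what makes this pattern a natural fit for memory-mapped I/O.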

§Quick Start

use jellyfish_reader::{ReadMerFile, MerDna, QueryMerFile};

// Sequential reading
let reader = ReadMerFile::open("output.jf").unwrap();
for result in reader {
    let (mer, count) = result.unwrap();
    println!("{}: {}", mer, count);
}

// Random access
let qf = QueryMerFile::open("output.jf").unwrap();
let mer: MerDna = "ACGTACGTACGTACGTACGTACGTA".parse().unwrap();
if let Some(count) = qf.get(&mer) {
    println!("Count: {}", count);
}

§K-mer Operations

use jellyfish_reader::{MerDna, StringMers};

let mer: MerDna = "ACGT".parse().unwrap();

// Reverse complement
let rc = mer.get_reverse_complement();
assert_eq!(rc.to_string(), "ACGT"); // ACGT is its own reverse complement

// Canonical form: the lexicographically smaller of the k-mer and its reverse complement
let canonical = mer.get_canonical();

// Extract every overlapping k-mer (k = 4) from a sequence
let kmers: Vec<_> = StringMers::new("ACGTACGT", 4)
    .map(|m| m.to_string())
    .collect();
assert_eq!(kmers, vec!["ACGT", "CGTA", "GTAC", "TACG", "ACGT"]);
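
For readers new to these operations, the following standalone sketch shows what reverse complement, canonical form, and k-mer extraction compute, using plain strings instead of MerDna. It does not use this crate, and the function names `complement`, `reverse_complement`, `canonical`, and `kmers` are hypothetical. It assumes uppercase A/C/G/T input; how the crate itself treats other characters is not covered here.

```rust
/// Complement of a single uppercase DNA base, or None for anything else.
fn complement(base: u8) -> Option<u8> {
    match base {
        b'A' => Some(b'T'),
        b'C' => Some(b'G'),
        b'G' => Some(b'C'),
        b'T' => Some(b'A'),
        _ => None,
    }
}

/// Reverse the sequence and complement each base.
/// Returns None if any character is not A, C, G, or T.
fn reverse_complement(kmer: &str) -> Option<String> {
    kmer.bytes()
        .rev()
        .map(|b| complement(b).map(char::from))
        .collect()
}

/// The lexicographically smaller of a k-mer and its reverse complement.
fn canonical(kmer: &str) -> Option<String> {
    let rc = reverse_complement(kmer)?;
    Some(if rc.as_str() < kmer { rc } else { kmer.to_string() })
}

/// All overlapping substrings of length `k`, left to right.
/// Returns an empty vector for k == 0, k longer than the sequence,
/// or non-ASCII input (byte slicing is only safe on ASCII here).
fn kmers(seq: &str, k: usize) -> Vec<&str> {
    if k == 0 || k > seq.len() || !seq.is_ascii() {
        return Vec::new();
    }
    (0..=seq.len() - k).map(|i| &seq[i..i + k]).collect()
}
```

A sequence of length n yields n - k + 1 k-mers, which is why the 8-base example above produces five 4-mers. Counting canonical k-mers rather than raw ones makes a k-mer and its reverse complement share a single count, since both strands of DNA describe the same molecule.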

Re-exports§

pub use binary::BinaryReader;
pub use error::Error;
pub use error::Result;
pub use header::FileHeader;
pub use matrix::RectangularBinaryMatrix;
pub use mer::MerDna;
pub use query::QueryMerFile;
pub use string_mers::StringMers;
pub use string_mers::string_canonicals;
pub use string_mers::string_mers;
pub use text::TextReader;

Modules§

binary
error
header
matrix
mer
query
string_mers
text

Enums§

ReadMerFile
Unified sequential reader for Jellyfish output files.