bio-jtools-rs

A suite of bioinformatics tools for interacting with high-throughput sequencing (HTS) data, written entirely in Rust.

Suite

info

Extract and print metadata about an HTS file. For FASTQs, this includes number of bases, number of records, and all the instruments the records come from.
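As an illustration of the kind of summary described above, here is a minimal sketch in Rust. It is not the crate's own implementation: `FastqSummary` and `summarize_fastq` are hypothetical names, and the sketch assumes uncompressed input, four-line FASTQ records, and Illumina-style headers of the form `@<instrument>:<run>:<flowcell>:...`, so that the instrument ID is the text between `@` and the first `:`.

```rust
use std::collections::BTreeSet;
use std::io::{self, BufRead};

/// Summary of a FASTQ stream (hypothetical type, not part of bio-jtools).
#[derive(Debug, Default, PartialEq)]
pub struct FastqSummary {
    pub records: u64,
    pub bases: u64,
    pub instruments: BTreeSet<String>,
}

/// Count records and bases, and collect instrument IDs.
/// Assumes four-line records and Illumina-style headers.
pub fn summarize_fastq<R: BufRead>(reader: R) -> io::Result<FastqSummary> {
    let mut summary = FastqSummary::default();
    for (i, line) in reader.lines().enumerate() {
        let line = line?;
        match i % 4 {
            // Header line: one per record.
            0 => {
                summary.records += 1;
                // Instrument ID: text between '@' and the first ':'.
                if let Some(id) = line.strip_prefix('@').and_then(|h| h.split(':').next()) {
                    if !id.is_empty() {
                        summary.instruments.insert(id.to_string());
                    }
                }
            }
            // Sequence line: its length is the number of bases.
            1 => summary.bases += line.trim_end().len() as u64,
            // Separator and quality lines are ignored.
            _ => {}
        }
    }
    Ok(summary)
}
```

Any `BufRead` works as input, for example `summarize_fastq(std::io::BufReader::new(file))`. Headers that do not follow the Illumina convention would be recorded whole up to the first `:`, so a real tool would need stricter header parsing.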

filter

Filter an HTS file by its query names. Currently only implemented for SAM/BAM files.
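The idea of filtering by query name can be sketched on plain-text SAM as follows. This is an assumption-laden illustration, not the crate's code: `filter_sam_by_qname` and its `keep` flag are hypothetical, and BAM decoding is left out. It relies only on two facts from the SAM format: header lines start with `@`, and the query name (QNAME) is the first tab-separated field of an alignment line.

```rust
use std::collections::HashSet;

/// Return the SAM lines that pass the filter (hypothetical helper).
/// Header lines are always kept. Alignment lines are kept when their
/// QNAME is in `names` (keep = true) or not in `names` (keep = false).
pub fn filter_sam_by_qname<'a>(
    sam: &'a str,
    names: &HashSet<&str>,
    keep: bool,
) -> Vec<&'a str> {
    sam.lines()
        .filter(|line| {
            // Header lines start with '@'; a QNAME never does.
            if line.starts_with('@') {
                return true;
            }
            // QNAME is the first tab-separated field.
            let qname = line.split('\t').next().unwrap_or("");
            names.contains(qname) == keep
        })
        .collect()
}
```

Holding the names in a `HashSet` keeps each lookup constant-time on average, which matters when the list of query names is long.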

jaccard

Calculate the Jaccard index for each pair in a set of BED files. Can save the results in a comma-separated file, if specified.
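For reference, one common definition of the Jaccard index for two interval sets is the number of base pairs in their intersection divided by the number in their union. The sketch below computes that for half-open BED-style intervals on a single chromosome. Whether bio-jtools uses exactly this definition is an assumption, and `Interval`, `merge`, `total_len`, and `jaccard` are hypothetical names.

```rust
/// Half-open interval [start, end), as in BED (hypothetical alias).
pub type Interval = (u64, u64);

/// Sort intervals and merge any that overlap or touch.
fn merge(mut ivs: Vec<Interval>) -> Vec<Interval> {
    ivs.sort_unstable();
    let mut out: Vec<Interval> = Vec::new();
    for (s, e) in ivs {
        if let Some(last) = out.last_mut() {
            if s <= last.1 {
                last.1 = last.1.max(e);
                continue;
            }
        }
        out.push((s, e));
    }
    out
}

/// Total number of bases covered by already-merged intervals.
fn total_len(ivs: &[Interval]) -> u64 {
    ivs.iter().map(|(s, e)| e - s).sum()
}

/// Jaccard index: |A ∩ B| / |A ∪ B|, measured in bases.
/// Assumes all intervals lie on one chromosome and start <= end.
pub fn jaccard(a: &[Interval], b: &[Interval]) -> f64 {
    let a = merge(a.to_vec());
    let b = merge(b.to_vec());
    let (mut i, mut j, mut inter) = (0usize, 0usize, 0u64);
    // Sweep both sorted lists, summing the overlap of each pair.
    while i < a.len() && j < b.len() {
        let lo = a[i].0.max(b[j].0);
        let hi = a[i].1.min(b[j].1);
        if lo < hi {
            inter += hi - lo;
        }
        // Advance whichever interval ends first.
        if a[i].1 < b[j].1 {
            i += 1;
        } else {
            j += 1;
        }
    }
    let union = total_len(&a) + total_len(&b) - inter;
    if union == 0 {
        0.0
    } else {
        inter as f64 / union as f64
    }
}
```

For a set of files, the same function would be called once per pair. Real BED data would also need the intervals grouped by chromosome before comparison.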

org

Organize a batch of raw sequencing data.

This takes a folder of FASTQ files directly from an Illumina sequencer and organizes it as follows, ready for alignment and quality control:

YYMMDD_INSTID_RUN_FCID/
├── FASTQs/                     # home for your raw data
│   ├── Sample1_R1.fastq.gz
│   ├── Sample1_R2.fastq.gz
│   └── ...
├── Aligned/                    # home for your aligned data
├── Reports/                    # QC reports and related files
├── config.tsv                  # a table of samples (rows) x features (cols)
├── cluster.yaml                # a yaml file of cluster parameters for jobs in the Snakefile
├── README.md                   # description of the folder, data contents
├── setup.log                   # a log of what operations were performed with `bjt org`
└── Snakefile                   # Snakemake workflow file
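The top-level folder name above encodes four fields separated by underscores: date, instrument ID, run number, and flowcell ID. A minimal sketch of splitting such a name is shown below. `RunFolder` and `parse_run_folder` are hypothetical and not part of `bjt org`; the sketch assumes a six-digit date and that only the flowcell field may itself contain underscores.

```rust
/// Fields of a YYMMDD_INSTID_RUN_FCID folder name (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct RunFolder<'a> {
    pub date: &'a str,
    pub instrument: &'a str,
    pub run: &'a str,
    pub flowcell: &'a str,
}

/// Split a run folder name into its four fields.
/// Returns None if a field is missing or the date is not six digits.
pub fn parse_run_folder<'a>(name: &'a str) -> Option<RunFolder<'a>> {
    // Ignore a trailing slash, then split into at most four parts.
    let mut parts = name.trim_end_matches('/').splitn(4, '_');
    let date = parts.next()?;
    let instrument = parts.next()?;
    let run = parts.next()?;
    let flowcell = parts.next()?;
    let date_ok = date.len() == 6 && date.bytes().all(|b| b.is_ascii_digit());
    if date_ok && !instrument.is_empty() && !run.is_empty() && !flowcell.is_empty() {
        Some(RunFolder { date, instrument, run, flowcell })
    } else {
        None
    }
}
```

Returning `Option` lets a caller skip folders that do not look like sequencer output instead of failing on them.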

Benchmarking

Benchmarks were run on a Windows 10 computer with an Intel i7 960 @ 3.2 GHz processor and 12 GB of DDR3 RAM. Times are listed +/- standard deviation, measured using hyperfine --warmup 3.

info

File	# Reads	t (gzip)	t (plain)
`examples/SRR0000001.fastq`	2 500	26.2 ms +/- 1.8 ms	22.8 ms +/- 3.2 ms
`examples/SRR0000002.fastq`	25 000	73.0 ms +/- 7.0 ms	27.5 ms +/- 3.2 ms
`M_abscessus_HiSeq.fq`	5 682 010	8.701 s +/- 0.291 s	1.026 s +/- 0.011 s

Roadmap

filter

Filtering FASTX files.
