grepq
quickly filter fastq files by matching sequences to a set of regex patterns
Performance
grepq is fast.
On a Mac Studio with 32GB RAM and an Apple M1 Max chip, grepq processed a 104GB fastq file in 88 seconds, about 1.2GB of fastq data per second.
- output of time command
- notes
  - the regex file regex.txt contained 30 regex patterns, and SRX22685872.fastq is 104GB in size.
- test hardware
  - Model Name: Mac Studio
  - Model Identifier: Mac13,1
  - Model Number: MJMV3X/A (2022)
  - Chip: Apple M1 Max
  - Total Number of Cores: 10 (8 performance and 2 efficiency)
  - Memory: 32 GB
  - APPLE SSD AP0512R
  - OS: macOS 15.0.1 (24A348)
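The throughput figure quoted above can be checked with a line of arithmetic. The short Python sketch below uses only the two numbers given in this section (104GB and 88 seconds); the variable names are illustrative.

```python
# Figures quoted in the Performance section.
file_size_gb = 104     # size of the fastq file, in GB
elapsed_seconds = 88   # wall-clock time reported for the run

# Average throughput over the whole run.
throughput = file_size_gb / elapsed_seconds
print(f"{throughput:.2f} GB/s")  # prints "1.18 GB/s", i.e. about 1.2GB per second
```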
Usage
<PATTERNS> Path to the regex patterns file
<FILE> Path to the fastq file
- tips
  - order your regex patterns from those that are most likely to match to those that are least likely to match. This will speed up the filtering process.
  - ensure you have enough storage space for the output file.
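To see why pattern order can matter, here is a minimal Python sketch of regex-based fastq filtering. It is not grepq's actual implementation (grepq is written in Rust and its internals are not described here). The sketch assumes a record is kept as soon as one pattern matches its sequence line, and that each record spans exactly four lines. The function names `read_fastq` and `filter_records`, the sample reads, and the sample patterns are all hypothetical.

```python
import re
from typing import Iterable, Iterator, List, Pattern, Tuple

# One fastq record: header, sequence, separator, quality.
Record = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Group lines into 4-line fastq records (assumes unwrapped sequences)."""
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping one iterator with itself four times yields consecutive 4-tuples.
    yield from zip(stripped, stripped, stripped, stripped)


def filter_records(records: Iterable[Record],
                   patterns: List[Pattern[str]]) -> Iterator[Record]:
    """Yield records whose sequence line matches at least one pattern."""
    for record in records:
        sequence = record[1]
        # any() stops at the first matching pattern, so patterns listed
        # earlier are tried first and later ones are skipped on a hit.
        if any(p.search(sequence) for p in patterns):
            yield record


# Hypothetical patterns, ordered from most to least likely to match.
patterns = [re.compile(p) for p in ["ACGT{2,}", "GG[AC]GG"]]

# Hypothetical in-memory fastq data (three records).
fastq_lines = [
    "@read1", "TTACGTTTA", "+", "IIIIIIIII",
    "@read2", "CCCCCCCC", "+", "IIIIIIII",
    "@read3", "AGGAGGT", "+", "IIIIIII",
]

matched = list(filter_records(read_fastq(fastq_lines), patterns))
for header, sequence, _, _ in matched:
    print(header, sequence)
```

In this sketch, a read that matches the first pattern never reaches the second, so placing frequently matching patterns first reduces the number of regex evaluations per record.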
Requirements
grepq has been tested on Linux and macOS. It might work on Windows, but it has not been tested.
- ensure that Rust is installed on your system (https://www.rust-lang.org/tools/install).
Installation
- from source
  - clone the repository and cd into the grepq directory
  - run cargo build --release
  - the executable will be located in ./target/release
Checksums to verify grepq is working correctly, using the regex file regex.txt and the small fastq file small.fastq, both located in the test directory:
License
MIT License