bamrescue 0.0.1

Utility to check Binary Sequence Alignment / Map (BAM) files for corruption and repair them

bamrescue

bamrescue is a small command line utility to check Binary Sequence Alignment / Map (BAM) files for corruption and repair them.

How it works

A BAM file is a BGZF file (see the SAM/BAM format specification), and as such is composed of a series of concatenated RFC 1952-compliant gzip blocks (see the gzip file format specification).
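To illustrate the block structure, here is a minimal sketch (not bamrescue's actual code) that reads the total size of a BGZF block from its header. It assumes the header layout from the SAM/BAM specification: a gzip header with the FEXTRA flag set and a `BC` extra subfield holding the block size minus one. The function name `bgzf_block_size` is hypothetical.

```rust
/// Total size in bytes of the BGZF block starting at `buf[0]`, or `None` if
/// `buf` does not start with a well-formed BGZF header.
/// Illustrative helper only; the name is an assumption, not bamrescue's API.
fn bgzf_block_size(buf: &[u8]) -> Option<usize> {
    // Fixed gzip header: ID1, ID2, CM (8 = deflate), FLG (bit 2 = FEXTRA).
    if buf.len() < 12 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8 || (buf[3] & 4) == 0 {
        return None;
    }
    // XLEN: length of the extra field, little-endian, at offset 10.
    let xlen = u16::from_le_bytes([buf[10], buf[11]]) as usize;
    let extra = buf.get(12..12 + xlen)?;
    // Walk the extra subfields looking for the BGZF one: 'B', 'C', SLEN = 2.
    let mut i = 0;
    while i + 4 <= extra.len() {
        let slen = u16::from_le_bytes([extra[i + 2], extra[i + 3]]) as usize;
        if extra[i] == b'B' && extra[i + 1] == b'C' && slen == 2 {
            let bsize = extra.get(i + 4..i + 6)?;
            // BSIZE stores the total block size minus one.
            return Some(u16::from_le_bytes([bsize[0], bsize[1]]) as usize + 1);
        }
        i += 4 + slen;
    }
    None
}
```

Applied to the 28-byte empty block that terminates a well-formed BAM file, this returns `Some(28)`. Knowing the size of each block is what lets a reader hop from one block to the next without decompressing anything.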

Each gzip block contains at most 64 KiB of data and ends with a CRC32 checksum of the uncompressed data, which is used to check data integrity. The gzip format also defines an optional CRC16 checksum of the gzip header.
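As a rough sketch of the integrity check (not bamrescue's actual implementation), the CRC32 used by gzip can be computed bit by bit and compared with the value stored in the last 8 bytes of a block, which hold the CRC32 and the uncompressed length, both little-endian. The names `crc32` and `trailer_matches` are hypothetical, and the uncompressed data is assumed to have been inflated elsewhere.

```rust
/// CRC-32 as used by gzip (reflected polynomial 0xEDB88320), computed bit by
/// bit. Slow but dependency-free; a table-driven version would be faster.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xffff_ffffu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones if the low bit is set, all zeros otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xedb8_8320 & mask);
        }
    }
    !crc
}

/// Compares the trailer of a gzip block (CRC32 then ISIZE, both little-endian)
/// with the data obtained by inflating that block.
/// Hypothetical helper: inflating the block is assumed to happen elsewhere.
fn trailer_matches(block: &[u8], uncompressed: &[u8]) -> bool {
    if block.len() < 8 {
        return false;
    }
    let t = &block[block.len() - 8..];
    let stored_crc = u32::from_le_bytes([t[0], t[1], t[2], t[3]]);
    let stored_len = u32::from_le_bytes([t[4], t[5], t[6], t[7]]);
    stored_crc == crc32(uncompressed) && stored_len as usize == uncompressed.len()
}
```

A block whose trailer does not match its inflated data would be treated as corrupted.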

Additionally, since gzip blocks start with a gzip identifier (i.e. the two bytes 0x1f 0x8b), it is possible to skip over a corrupted block (at most 64 KiB) to the next non-corrupted block with limited complexity and acceptable reliability.
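The skipping step could look like the sketch below, which is an illustration rather than bamrescue's actual code. It searches for the gzip identifier followed by the compression method and flags bytes that BGZF blocks normally use (0x08 0x04); matching those two extra bytes is an assumption made here to reduce false positives. The names `BGZF_MAGIC` and `next_block_candidate` are hypothetical.

```rust
/// First four bytes of a typical BGZF block: gzip identifier (0x1f 0x8b),
/// deflate compression method (0x08) and the FEXTRA flag alone (0x04).
const BGZF_MAGIC: [u8; 4] = [0x1f, 0x8b, 0x08, 0x04];

/// Offset of the first position at or after `from` where `data` looks like
/// the start of a BGZF block, or `None` if there is no such position.
fn next_block_candidate(data: &[u8], from: usize) -> Option<usize> {
    data.get(from..)?
        .windows(BGZF_MAGIC.len())
        .position(|w| w == &BGZF_MAGIC[..])
        .map(|offset| from + offset)
}
```

These four bytes can also occur by chance inside compressed data, so a candidate is only a hint: it still has to be validated, for instance by checking its header and checksum, before being kept.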

This property is used to repair corrupted BAM files by keeping only their non-corrupted blocks, hopefully rescuing most reads.

Compilation

Run cargo build --release in your working copy.

Installation

Copy the bamrescue binary wherever you want.

Usage

Run bamrescue <bamfile_to_check_or_repair> <output_bamfile>.

Contributing and reporting bugs

Contributions are welcome through GitHub pull requests.

Please report bugs and feature requests on GitHub issues.

License

bamrescue is copyright (C) 2017 Jérémie Roquet <jroquet@arkanosis.net> and licensed under the ISC license.