bamnado 0.5.1

Tools and utilities for manipulating BAM files in unusual use cases, e.g. single-cell and MCC.

BamNado

High-performance tools and utilities for working with BAM and BigWig files in modern genomics workflows. BamNado is written in Rust for speed and low memory use and provides both a command-line interface and Python bindings.


Overview

BamNado is designed for efficient, streaming manipulation of BAM files and signal tracks. It focuses on fast coverage generation, flexible filtering, and lightweight post-processing of bedGraph and BigWig data.

Common use cases include:

  • Rapid generation of coverage tracks from large BAM files
  • Filtering reads by tags or barcodes to produce targeted BigWigs
  • Fragment-aware coverage for ATAC-seq and related assays
  • BigWig comparison and aggregation across samples
  • Post-processing of binned signal tracks for visualization

BamNado is useful in a range of workflows, including single-cell and Micro-Capture-C (MCC), but is not limited to those applications.


Features

  • High-performance, streaming implementations in Rust
  • Cross-platform support (Linux, macOS, Windows)
  • BAM → bedGraph / BigWig coverage generation
  • Fragment-aware and strand-specific pileups
  • Read filtering by mapping quality, length, strand, fragment size, tags, and barcodes
  • BigWig comparison (subtraction, ratio, log-ratio)
  • BigWig aggregation (sum, mean, median, min, max)
  • collapse-bedgraph utility to merge adjacent bins with identical scores
  • Python bindings for selected functionality

Installation

Pre-built binaries (recommended)

Download the appropriate binary from the releases page.

After downloading:

chmod +x bamnado
./bamnado --version

(Optional) install system-wide:

sudo cp bamnado /usr/local/bin/

Docker

docker pull ghcr.io/alsmith151/bamnado:latest
docker run --rm ghcr.io/alsmith151/bamnado:latest --help

Images are available for linux/amd64 and linux/arm64.


Cargo

If you have Rust installed:

cargo install bamnado

Build from source

git clone https://github.com/alsmith151/BamNado.git
cd BamNado
cargo build --release

Optional dependency: samtools

samtools is not required but is strongly recommended if your BAM files have non-standard or incomplete headers (e.g. files produced by CellRanger). BamNado automatically falls back to samtools view -H to parse the header when the built-in parser fails. Without samtools on your PATH, BamNado will error on such files.

Install via conda or your system package manager:

conda install -c bioconda samtools
# or
brew install samtools
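
Before processing files with non-standard headers, it can help to confirm that samtools is actually discoverable on your PATH. The following is a minimal pre-flight sketch using only the Python standard library; the helper name `samtools_available` is hypothetical and is not part of BamNado.

```python
import shutil


def samtools_available(executable: str = "samtools") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if samtools_available():
        print("samtools found: the header fallback can be used")
    else:
        print("samtools not found: BAMs with non-standard headers may fail")
```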

Python Interface

BamNado provides Python bindings for selected high-performance operations and is available directly from PyPI.

Python Installation

pip install bamnado
# or
uv pip install bamnado

ReadFilter

All read filtering options are controlled through the ReadFilter class:

Parameter             Type              Default  Description
min_mapq              int               0        Minimum mapping quality score
proper_pair           bool              True     Keep only properly paired reads
min_length            int               0        Minimum read length (bp)
max_length            int               1000     Maximum read length (bp)
strand                str               "both"   Strand to keep: "forward" / "fwd" / "+", "reverse" / "rev" / "-", or "both"
min_fragment_length   int | None        None     Minimum insert size / TLEN (bp); requires paired-end data
max_fragment_length   int | None        None     Maximum insert size / TLEN (bp); requires paired-end data
blacklist_bed         str | None        None     Path to a BED file of regions to exclude
whitelisted_barcodes  list[str] | None  None     Cell barcodes (CB tag) to include
read_group            str | None        None     Read group (RG tag) to keep
filter_tag            str | None        None     Two-character SAM tag to filter on (e.g. "VP")
filter_tag_value      str | None        None     Required string value for filter_tag

A ValueError is raised if min_fragment_length or max_fragment_length is set on a single-end BAM file.
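
To make the meaning of the main parameters concrete, here is a pure-Python sketch of how such filters might combine for a single read. It is an illustration only, not BamNado's implementation: the `Read` record and the `passes` function are hypothetical, the bounds are assumed to be inclusive, and the pairing, blacklist, barcode, read-group and tag filters are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Read:
    """Hypothetical, simplified read record (not a BamNado type)."""
    mapq: int
    length: int
    is_reverse: bool
    tlen: int  # signed template length, as in the SAM TLEN field


def passes(
    read: Read,
    min_mapq: int = 0,
    min_length: int = 0,
    max_length: int = 1000,
    strand: str = "both",
    min_fragment_length: Optional[int] = None,
    max_fragment_length: Optional[int] = None,
) -> bool:
    """Return True if the read survives every filter (inclusive bounds assumed)."""
    if read.mapq < min_mapq:
        return False
    if not (min_length <= read.length <= max_length):
        return False
    if strand in ("forward", "fwd", "+") and read.is_reverse:
        return False
    if strand in ("reverse", "rev", "-") and not read.is_reverse:
        return False
    fragment = abs(read.tlen)  # TLEN is negative for the rightmost mate
    if min_fragment_length is not None and fragment < min_fragment_length:
        return False
    if max_fragment_length is not None and fragment > max_fragment_length:
        return False
    return True


if __name__ == "__main__":
    read = Read(mapq=30, length=50, is_reverse=False, tlen=150)
    print(passes(read, min_mapq=20, strand="forward",
                 min_fragment_length=100, max_fragment_length=200))
```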

Example

import bamnado
import numpy as np

# Basic coverage — default filter settings
signal = bamnado.get_signal_for_chromosome(
    bam_path="input.bam",
    chromosome_name="chr1",
    bin_size=50,
    scale_factor=1.0,
    use_fragment=False,
    ignore_scaffold_chromosomes=True,
)
print(f"Mean coverage: {np.mean(signal):.3f}")

# Forward-strand nucleosome-free region coverage (100–200 bp fragments)
nfr_filter = bamnado.ReadFilter(
    strand="forward",
    min_fragment_length=100,
    max_fragment_length=200,
    min_mapq=20,
)
nfr_signal = bamnado.get_signal_for_chromosome(
    bam_path="input.bam",
    chromosome_name="chr1",
    bin_size=10,
    scale_factor=1.0,
    use_fragment=True,
    ignore_scaffold_chromosomes=True,
    read_filter=nfr_filter,
)

# Tag-filtered coverage (e.g. MCC viewpoint)
vp_filter = bamnado.ReadFilter(
    filter_tag="VP",
    filter_tag_value="BCL2",
    min_mapq=30,
)
vp_signal = bamnado.get_signal_for_chromosome(
    bam_path="input.bam",
    chromosome_name="chr1",
    bin_size=50,
    scale_factor=1.0,
    use_fragment=True,
    ignore_scaffold_chromosomes=True,
    read_filter=vp_filter,
)
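
Once a binned signal is in hand, a common next step is turning it into bedGraph-style intervals. The sketch below assumes the returned signal behaves like a 1-D sequence with one value per bin, and that bin `i` covers the half-open interval `[i * bin_size, (i + 1) * bin_size)`; both are assumptions about the layout rather than documented behaviour. The helper name `signal_to_bedgraph_rows` is hypothetical.

```python
def signal_to_bedgraph_rows(signal, chromosome, bin_size):
    """Yield (chrom, start, end, value) tuples, one per bin.

    Assumes signal[i] covers [i * bin_size, (i + 1) * bin_size).
    The final bin may extend past the chromosome end; clip it if needed.
    """
    for i, value in enumerate(signal):
        start = i * bin_size
        yield (chromosome, start, start + bin_size, float(value))


if __name__ == "__main__":
    # Stand-in values; in practice this would be the array returned above.
    example_signal = [0.0, 1.5, 1.5]
    for chrom, start, end, value in signal_to_bedgraph_rows(example_signal, "chr1", 50):
        print(f"{chrom}\t{start}\t{end}\t{value}")
```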

Command-line usage

List available commands:

bamnado --help

Get help for a specific command:

bamnado <command> --help

Available commands

  • bam-coverage – generate coverage from a BAM file
  • multi-bam-coverage – coverage from multiple BAMs
  • split – split BAMs based on filters (e.g. barcodes)
  • split-exogenous – split endogenous vs exogenous reads
  • modify – apply transformations and filters to BAMs
  • bigwig-compare – compare two BigWigs
  • bigwig-aggregate – aggregate multiple BigWigs
  • collapse-bedgraph – merge adjacent bedGraph bins with identical scores

Read filtering

All coverage commands share a common set of read filter flags:

Flag                               Default  Description
--strand                           both     Include only forward, reverse, or both strands
--proper-pair                      off      Keep only properly-paired reads
--min-mapq                         20       Minimum mapping quality
--min-length                       20       Minimum read sequence length (bp)
--max-length                       1000     Maximum read sequence length (bp)
--min-fragment-length                       Minimum insert size / TLEN (bp); requires paired-end data
--max-fragment-length                       Maximum insert size / TLEN (bp); requires paired-end data
--blacklisted-locations                     BED file of regions to exclude
--whitelisted-barcodes                      Text file of cell barcodes to keep (one per line)
--read-group                                Keep only reads belonging to this read group
--filter-tag / --filter-tag-value           Keep reads where SAM tag equals the given value
Fragment length filtering operates on the SAM TLEN field and is only meaningful for paired-end BAMs. BamNado will return an error if these flags are used with a single-end file.
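
The --whitelisted-barcodes flag expects a plain text file with one barcode per line. If your barcodes live in a Python list (for example, exported from a clustering result), a small helper along these lines could write that file. The function name `write_barcode_whitelist` is hypothetical, and de-duplicating the list is a convenience choice rather than a BamNado requirement.

```python
from pathlib import Path


def write_barcode_whitelist(barcodes, path):
    """Write unique, non-empty barcodes one per line; return how many were written."""
    cleaned = (b.strip() for b in barcodes)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(b for b in cleaned if b))
    Path(path).write_text("".join(f"{b}\n" for b in unique))
    return len(unique)
```

The resulting file can then be passed to a coverage command as `--whitelisted-barcodes barcodes.txt`.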


Example: BAM coverage

bamnado bam-coverage \
  --bam input.bam \
  --output output.bedgraph \
  --bin-size 100 \
  --norm-method rpkm \
  --scale-factor 1.5 \
  --use-fragment \
  --proper-pair \
  --min-mapq 30
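
For readers unfamiliar with RPKM, the sketch below shows the conventional reads-per-kilobase-per-million formula applied to a single bin. Whether BamNado computes its rpkm normalization exactly this way, and how it combines with --scale-factor, is not stated in this README, so treat the function as an assumption-laden illustration. The name `rpkm` is hypothetical.

```python
def rpkm(reads_in_bin, total_mapped_reads, bin_size_bp):
    """Conventional RPKM: reads / (millions of mapped reads * bin length in kb)."""
    if total_mapped_reads <= 0 or bin_size_bp <= 0:
        raise ValueError("total_mapped_reads and bin_size_bp must be positive")
    millions = total_mapped_reads / 1e6
    kilobases = bin_size_bp / 1e3
    return reads_in_bin / (millions * kilobases)


if __name__ == "__main__":
    # 10 reads in a 100 bp bin, from a library of 1,000,000 mapped reads.
    print(rpkm(10, 1_000_000, 100))
```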

Example: strand- and fragment-length-filtered coverage

Useful for isolating nucleosome-free regions in ATAC-seq data:

bamnado bam-coverage \
  --bam atac.bam \
  --output nfr_forward.bw \
  --bin-size 10 \
  --use-fragment \
  --strand forward \
  --min-fragment-length 100 \
  --max-fragment-length 200 \
  --min-mapq 20
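
To illustrate what --use-fragment changes conceptually, the following sketch counts how many intervals overlap each bin, first for individual reads and then for the whole fragment spanned by a read pair. This is a simplified model under stated assumptions (0-based half-open intervals, each overlapping interval adds 1 to a bin), not BamNado's pileup code, and `binned_coverage` is a hypothetical helper.

```python
def binned_coverage(intervals, chrom_length, bin_size):
    """Count the intervals overlapping each bin (0-based, half-open coordinates)."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    bins = [0] * n_bins
    for start, end in intervals:
        if end <= start or start >= chrom_length:
            continue  # skip empty or out-of-range intervals
        first = max(start, 0) // bin_size
        last = min((end - 1) // bin_size, n_bins - 1)
        for i in range(first, last + 1):
            bins[i] += 1
    return bins


if __name__ == "__main__":
    # One read pair: mates at 100-150 and 250-300, so the fragment spans 100-300.
    reads = [(100, 150), (250, 300)]
    fragment = [(100, 300)]
    print(binned_coverage(reads, chrom_length=400, bin_size=50))
    print(binned_coverage(fragment, chrom_length=400, bin_size=50))
```

In the read-level view only the bins under each mate are counted; in the fragment-level view the unsequenced gap between mates is counted as well.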

Example: tag-filtered BigWig generation

bamnado bam-coverage \
  --bam input.bam \
  --output BCL2.bw \
  --bin-size 50 \
  --filter-tag "VP" \
  --filter-tag-value "BCL2" \
  --use-fragment \
  --min-mapq 30

BigWig comparison

bamnado bigwig-compare \
  --bw1 sample1.bw \
  --bw2 sample2.bw \
  --comparison log-ratio \
  --pseudocount 1e-3 \
  -o output.bw
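
The role of the pseudocount is easiest to see on a few bins. The sketch below assumes a base-2 logarithm and a pseudocount added to both numerator and denominator; neither detail is confirmed by this README, so the exact values BamNado writes may differ. The function `compare_bins` and its mode strings are hypothetical and operate on plain lists rather than BigWig files.

```python
import math


def compare_bins(a, b, comparison="log-ratio", pseudocount=1e-3):
    """Compare two equally binned tracks element by element."""
    if len(a) != len(b):
        raise ValueError("tracks must have the same number of bins")
    if comparison == "subtraction":
        return [x - y for x, y in zip(a, b)]
    if comparison == "ratio":
        return [(x + pseudocount) / (y + pseudocount) for x, y in zip(a, b)]
    if comparison == "log-ratio":
        # Assumption: log2 of the pseudocount-stabilised ratio.
        return [math.log2((x + pseudocount) / (y + pseudocount)) for x, y in zip(a, b)]
    raise ValueError(f"unknown comparison: {comparison}")


if __name__ == "__main__":
    sample1 = [4.0, 0.0, 2.0]
    sample2 = [1.0, 0.0, 2.0]
    print(compare_bins(sample1, sample2))
```

Without a pseudocount, bins where the second track is zero would divide by zero; a small positive value keeps such bins finite.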

BigWig aggregation

bamnado bigwig-aggregate \
  --bigwigs sample1.bw sample2.bw sample3.bw \
  --method mean \
  -o aggregated.bw
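
Conceptually, aggregation reduces the values of each bin across samples to a single number. The following sketch mirrors the five listed methods on plain Python lists; it assumes all tracks share the same binning and is not how BamNado reads or writes BigWig data. `aggregate_bins` is a hypothetical name.

```python
import statistics

AGGREGATORS = {
    "sum": sum,
    "mean": statistics.fmean,
    "median": statistics.median,
    "min": min,
    "max": max,
}


def aggregate_bins(tracks, method="mean"):
    """Aggregate equally binned tracks bin by bin using the named method."""
    if method not in AGGREGATORS:
        raise ValueError(f"unknown method: {method}")
    if len({len(track) for track in tracks}) != 1:
        raise ValueError("need at least one track, all with the same number of bins")
    reducer = AGGREGATORS[method]
    # zip(*tracks) groups the i-th bin of every track together.
    return [reducer(values) for values in zip(*tracks)]


if __name__ == "__main__":
    tracks = [[1, 2, 3], [3, 2, 1], [2, 5, 2]]
    print(aggregate_bins(tracks, "mean"))
    print(aggregate_bins(tracks, "median"))
```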

collapse-bedgraph

bamnado collapse-bedgraph \
  --input signal.bedgraph \
  --output signal.collapsed.bedgraph
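
The merge that collapse-bedgraph performs can be pictured with a short sketch: consecutive intervals on the same chromosome that touch and carry the same score become one interval. This version assumes the rows are already sorted by chromosome and start and compares scores for exact equality; it illustrates the idea and is not BamNado's implementation. `collapse_rows` is a hypothetical name.

```python
def collapse_rows(rows):
    """Merge adjacent (chrom, start, end, score) rows that share a score.

    Rows are assumed to be sorted by chromosome and start position.
    """
    collapsed = []
    for chrom, start, end, score in rows:
        if collapsed:
            p_chrom, p_start, p_end, p_score = collapsed[-1]
            if p_chrom == chrom and p_end == start and p_score == score:
                # Extend the previous interval instead of adding a new one.
                collapsed[-1] = (p_chrom, p_start, end, p_score)
                continue
        collapsed.append((chrom, start, end, score))
    return collapsed


if __name__ == "__main__":
    rows = [
        ("chr1", 0, 50, 1.0),
        ("chr1", 50, 100, 1.0),
        ("chr1", 100, 150, 2.0),
        ("chr2", 0, 50, 2.0),
    ]
    for row in collapse_rows(rows):
        print(*row, sep="\t")
```

Because runs of identical bins are common in sparse signal tracks, this kind of merge can shrink a bedGraph considerably without changing the signal it encodes.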

Development

cargo build --release
cargo test

License

Apache-2.0 OR MIT