bamnado 0.6.0

Tools and utilities for manipulation of BAM files for unusual use cases. e.g. single cell, MCC
Documentation

BamNado

High-performance tools and utilities for working with BAM and BigWig files in modern genomics workflows. BamNado is written in Rust for speed and low memory use and provides both a command-line interface and Python bindings.

Overview

BamNado is designed for efficient, streaming manipulation of BAM files and signal tracks. It focuses on fast coverage generation, flexible filtering, and lightweight post-processing of bedGraph and BigWig data.

Common use cases:

  • Rapid generation of coverage tracks from large BAM files
  • Filtering reads by tags or barcodes to produce targeted BigWigs
  • Fragment-aware coverage for ATAC-seq and related assays
  • BigWig comparison and aggregation across samples
  • Post-processing of binned signal tracks for visualization

Useful in workflows including single-cell and Micro-Capture-C (MCC), and many others.

Documentation: API docs on docs.rs

Features

  • High-performance, streaming implementations in Rust
  • Cross-platform support (Linux, macOS, Windows)
  • BAM → bedGraph / BigWig coverage generation
  • Fragment-aware and strand-specific pileups
  • Read filtering by mapping quality, length, strand, fragment size, tags, and barcodes
  • BigWig comparison (subtraction, ratio, log-ratio)
  • BigWig aggregation (sum, mean, median, min, max)
  • bedGraph post-processing with collapse-bedgraph
  • Python bindings for selected functionality

Installation

Pre-built binaries (recommended)

Download the appropriate binary from the releases page.

After downloading:

chmod +x bamnado
./bamnado --version

(Optional) install system-wide:

sudo cp bamnado /usr/local/bin/

Docker

docker pull ghcr.io/alsmith151/bamnado:latest
docker run --rm ghcr.io/alsmith151/bamnado:latest --help

Images are available for linux/amd64 and linux/arm64.

Cargo

If you have Rust installed:

cargo install bamnado

Build from source

git clone https://github.com/alsmith151/BamNado.git
cd BamNado
cargo build --release

Optional dependency: samtools

samtools is not required but is strongly recommended if your BAM files have non-standard or incomplete headers (e.g. files produced by CellRanger). BamNado automatically falls back to samtools view -H to parse the header when the built-in parser fails. Without samtools on your PATH, BamNado will error on such files.

Install via conda or your system package manager:

conda install -c bioconda samtools
# or
brew install samtools

Python API

BamNado provides Python bindings for high-performance BAM signal generation with flexible read filtering. Install from PyPI:

pip install bamnado
# or
uv pip install bamnado

Quick start

import bamnado
import numpy as np

# Generate coverage from a BAM file
signal = bamnado.get_signal_for_chromosome(
    bam_path="input.bam",
    chromosome_name="chr1",
    bin_size=50,
)
print(f"Mean coverage: {np.mean(signal):.2f}")

ReadFilter class

Customize read filtering with the ReadFilter class. All parameters are optional:

import bamnado

# Create a filter for high-quality, properly-paired reads
strict_filter = bamnado.ReadFilter(
    min_mapq=30,
    proper_pair=True,
    min_length=50,
)

# Apply the filter
signal = bamnado.get_signal_for_chromosome(
    bam_path="input.bam",
    chromosome_name="chr1",
    bin_size=100,
    read_filter=strict_filter,
)

Filter parameters

Parameter Type Default Description
min_mapq int 0 Minimum mapping quality
proper_pair bool False Require properly paired reads
min_length int 0 Minimum read length (bp)
max_length int 1000 Maximum read length (bp)
strand str "both" "forward", "reverse", or "both"
min_fragment_length int | None None Minimum insert size (bp); paired-end only
max_fragment_length int | None None Maximum insert size (bp); paired-end only
blacklist_bed str | None None BED file of excluded regions
whitelisted_barcodes list[str] | None None Cell barcodes to include (CB tag)
read_group str | None None Read group to keep (RG tag)
filter_tag str | None None SAM tag to filter on (e.g. "VP")
filter_tag_value str | None None Required value for filter_tag

Usage examples

Basic coverage

import bamnado
import numpy as np

signal = bamnado.get_signal_for_chromosome(
    bam_path="sample.bam",
    chromosome_name="chr1",
    bin_size=50,
    scale_factor=1.0,
    use_fragment=False,
)
print(f"Signal shape: {signal.shape}, type: {signal.dtype}")
print(f"Mean: {np.mean(signal):.2f}, Max: {np.max(signal):.2f}")

Nucleosome-free regions (ATAC-seq)

import bamnado

# Forward-strand reads, 100–200 bp fragments
nfr_filter = bamnado.ReadFilter(
    strand="forward",
    min_fragment_length=100,
    max_fragment_length=200,
    min_mapq=20,
)

nfr_signal = bamnado.get_signal_for_chromosome(
    bam_path="atac.bam",
    chromosome_name="chr1",
    bin_size=10,
    use_fragment=True,
    read_filter=nfr_filter,
)

Barcode-filtered coverage (single-cell)

import bamnado

# Get coverage for specific cell barcodes
cell_barcodes = ["ACGTACGT-1", "TGCATGCA-1", "AAATAAAA-1"]

barcode_filter = bamnado.ReadFilter(
    whitelisted_barcodes=cell_barcodes,
    min_mapq=25,
)

signal = bamnado.get_signal_for_chromosome(
    bam_path="possorted_genome_bam.bam",
    chromosome_name="chr1",
    bin_size=100,
    use_fragment=True,
    read_filter=barcode_filter,
)

Tag-filtered coverage (e.g. MCC viewpoint)

import bamnado

# Get coverage for specific viewpoint
vp_filter = bamnado.ReadFilter(
    filter_tag="VP",
    filter_tag_value="BCL2",
    min_mapq=30,
)

vp_signal = bamnado.get_signal_for_chromosome(
    bam_path="mcc.bam",
    chromosome_name="chr1",
    bin_size=50,
    use_fragment=True,
    read_filter=vp_filter,
)

High-confidence coverage

import bamnado

# High-quality, properly-paired reads
confident_filter = bamnado.ReadFilter(
    min_mapq=30,
    proper_pair=True,
    min_length=50,
    min_fragment_length=100,
    max_fragment_length=500,
)

signal = bamnado.get_signal_for_chromosome(
    bam_path="sample.bam",
    chromosome_name="chr1",
    bin_size=50,
    scale_factor=1e6,  # CPM normalization
    use_fragment=True,
    read_filter=confident_filter,
)

get_signal_for_chromosome parameters

Parameter Type Default Description
bam_path str Path to indexed BAM file
chromosome_name str Chromosome to process (e.g. "chr1")
bin_size int 50 Bin width in base pairs
scale_factor float 1.0 Linear scaling factor
use_fragment bool False Use fragment (pair) coordinates instead of read
ignore_scaffold_chromosomes bool True Skip non-standard chromosomes
read_filter ReadFilter | None None Optional read filter

Returns: numpy.ndarray of dtype float32 with length = chromosome size / bin_size

Note: Fragment length filtering (min_fragment_length, max_fragment_length) requires paired-end BAM files and will raise ValueError on single-end data.

Command-line interface

Getting help

bamnado --help                      # List all commands
bamnado <command> --help            # Help for a specific command

Available commands

  • Coverage generation: bam-coverage (coverage), multi-bam-coverage (multi-coverage)
  • BAM manipulation: split, split-exogenous, modify
  • BigWig tools: bigwig-compare (compare-bigwigs), bigwig-aggregate (aggregate-bigwigs)
  • bedGraph tools: collapse-bedgraph (collapse)

Common naming cleanups

The CLI now exposes shorter, more descriptive option names. Older names still work as compatibility aliases.

Read filtering

All coverage commands share common read filter flags:

Flag Default Description
--strand both forward, reverse, or both
--proper-pairs off Keep only properly-paired reads
--min-mapq 20 Minimum mapping quality
--min-length 20 Minimum read length (bp)
--max-length 1000 Maximum read length (bp)
--min-fragment-len Minimum insert size (bp); paired-end only
--max-fragment-len Maximum insert size (bp); paired-end only
--blacklist BED file of regions to exclude
--barcode-allowlist Text file of cell barcodes (one per line)
--read-group Keep only this read group
--tag / --tag-value Keep reads where TAG == VALUE

Coverage-specific flags

Flag Default Description
--normalize raw Signal normalization method: raw, rpkm, or cpm
--fragment-counts off Count fragments instead of individual read alignments
--ignore-scaffolds off Skip scaffold or unplaced chromosomes
--threads 6 Threads used when writing BigWig output

Examples

Generate basic coverage

bamnado bam-coverage \
  --bam sample.bam \
  --output coverage.bedgraph \
  --bin-size 100

Generate high-quality, normalized coverage

bamnado bam-coverage \
  --bam sample.bam \
  --output coverage_hq.bw \
  --bin-size 100 \
  --normalize rpkm \
  --min-mapq 30 \
  --proper-pairs

Extract nucleosome-free regions from ATAC-seq

bamnado bam-coverage \
  --bam atac.bam \
  --output nfr_forward.bw \
  --bin-size 10 \
  --strand forward \
  --min-fragment-len 100 \
  --max-fragment-len 200 \
  --fragment-counts \
  --min-mapq 20

Filter by cell barcode (single-cell)

bamnado bam-coverage \
  --bam possorted_genome_bam.bam \
  --output cell_coverage.bw \
  --bin-size 100 \
  --barcode-allowlist barcodes.txt \
  --fragment-counts

Filter by SAM tag (MCC viewpoint)

bamnado bam-coverage \
  --bam mcc.bam \
  --output BCL2_viewpoint.bw \
  --bin-size 50 \
  --tag VP \
  --tag-value BCL2 \
  --fragment-counts \
  --min-mapq 30

Compare two BigWig files

bamnado bigwig-compare \
  --bw1 sample_treated.bw \
  --bw2 sample_control.bw \
  --comparison log-ratio \
  --pseudocount 1e-3 \
  -o treated_vs_control.bw

Aggregate multiple BigWig files

bamnado bigwig-aggregate \
  --bigwigs sample1.bw sample2.bw sample3.bw \
  --method mean \
  -o mean_coverage.bw

Simplify bedGraph file

bamnado collapse-bedgraph \
  --input signal.bedgraph \
  --output signal.collapsed.bedgraph

Development

cargo build --release
cargo test

License

Apache-2.0 OR MIT