bamnado 0.3.11

Tools and utilities for manipulating BAM files for unusual use cases, e.g. single cell and MCC.
# BamNado

High-performance tools and utilities for manipulation of BAM files for specialized use cases, including single cell and MCC (Multi-modal cellular characterization) workflows.

## Overview

BamNado is a Rust-based toolkit designed to handle complex BAM file operations that are common in modern genomics workflows, particularly in single-cell and multi-modal cellular characterization experiments. It provides efficient, cross-platform tools for coverage calculation, read filtering, file splitting, and various BAM file transformations.

## Installation

BamNado can be installed in several ways. Choose the method that best fits your needs:

### Method 1: Pre-built Binaries (Recommended)

The easiest way to get started is to download a pre-compiled binary from our [releases page](https://github.com/alsmith151/BamNado/releases).

#### Available Platforms

| Platform | Architecture | File Name |
|----------|-------------|-----------|
| Linux | x86_64 | `bamnado-x86_64-unknown-linux-gnu.tar.gz` |
| macOS | Intel (x86_64) | `bamnado-x86_64-apple-darwin.tar.gz` |
| macOS | Apple Silicon (ARM64) | `bamnado-aarch64-apple-darwin.tar.gz` |
| Windows | x86_64 | `bamnado-x86_64-pc-windows-msvc.zip` |
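
If you are unsure which archive matches your machine, the table can be turned into a small lookup. The following is a minimal sketch: `bamnado_archive` is a hypothetical helper (not part of BamNado), and it assumes `uname -m` reports `x86_64` on Intel systems and `arm64` on Apple Silicon. Windows is left out because `uname` is not generally available there.

```bash
#!/usr/bin/env bash
# Map an OS name and CPU architecture (as reported by `uname -s` and
# `uname -m`) to the release archive name from the table above.
# `bamnado_archive` is a hypothetical helper, not part of BamNado.
bamnado_archive() {
  local os="$1" arch="$2"
  case "${os}:${arch}" in
    Linux:x86_64)   echo "bamnado-x86_64-unknown-linux-gnu.tar.gz" ;;
    Darwin:x86_64)  echo "bamnado-x86_64-apple-darwin.tar.gz" ;;
    Darwin:arm64)   echo "bamnado-aarch64-apple-darwin.tar.gz" ;;
    *)              echo "no pre-built archive for ${os} ${arch}" >&2; return 1 ;;
  esac
}

# Print the archive name for the current machine. On a platform that is
# not in the table this prints a message to stderr instead.
bamnado_archive "$(uname -s)" "$(uname -m)" || true
```

If the helper reports that no archive exists for your platform, installing via Cargo or building from source is the fallback.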

#### Installation Steps

1. **Download the binary**

   Go to the [releases page](https://github.com/alsmith151/BamNado/releases) and download the appropriate file for your system.

2. **Extract the archive**

   **Linux/macOS:**

   ```bash
   tar -xzf bamnado-*.tar.gz
   ```

   **Windows:**
   - Right-click the zip file and select "Extract All"
   - Or use your preferred extraction tool (7-Zip, WinRAR, etc.)

3. **Make executable** (Linux/macOS only)

   ```bash
   chmod +x bamnado
   ```

4. **Test the installation**

   ```bash
   ./bamnado --version
   ```

   You should see output like: `bamnado 0.3.1`

5. **Install system-wide** (optional but recommended)

   **Option A: System-wide installation (requires admin privileges)**

   ```bash
   # Linux/macOS
   sudo cp bamnado /usr/local/bin/

   # Windows (as Administrator)
   # Copy bamnado.exe to C:\Windows\System32\ or add to PATH
   ```

   **Option B: User-local installation (no admin required)**

   ```bash
   # Linux/macOS
   mkdir -p ~/.local/bin
   cp bamnado ~/.local/bin/

   # Add to your shell profile if not already in PATH
   echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
   # or for zsh users:
   echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc

   # Reload your shell or run:
   source ~/.bashrc  # or ~/.zshrc
   ```

6. **Verify system installation**

   Open a new terminal and run:

   ```bash
   bamnado --version
   ```
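
If `bamnado --version` fails in a new terminal, the install directory is probably not on your `PATH`. The sketch below checks this for the user-local option; `dir_on_path` is a hypothetical helper name used only for illustration.

```bash
# Report whether a directory is listed in a PATH-style string.
# `dir_on_path` is a hypothetical helper name used for illustration.
dir_on_path() {
  local dir="$1" path_value="${2:-$PATH}"
  case ":${path_value}:" in
    *":${dir}:"*) return 0 ;;
    *)            return 1 ;;
  esac
}

if dir_on_path "$HOME/.local/bin"; then
  echo "$HOME/.local/bin is on PATH"
else
  echo "$HOME/.local/bin is not on PATH; add it to your shell profile"
fi

# Show which bamnado binary the shell resolves, if any.
command -v bamnado || echo "bamnado not found on PATH"
```

Wrapping the value in colons before matching avoids false positives from directories that merely share a prefix, such as `~/.local/bin-old`.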

#### Troubleshooting Pre-built Binaries

##### Linux: "No such file or directory" error

- Your system might be missing required libraries. Try:

  ```bash
  ldd bamnado  # Check dependencies
  ```

- For older Linux distributions, you may need to build from source.
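
To narrow the `ldd` output down to the libraries that are actually missing, you can filter for its `not found` marker. This is a minimal sketch; `missing_libs` is a hypothetical helper, and it reads from standard input so that it can be tried without the binary present.

```bash
# Print the names of shared libraries that `ldd` reports as missing.
# Reads `ldd` output on stdin so it can be tested without a binary.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage (Linux only): list unresolved libraries for the downloaded binary.
if command -v ldd >/dev/null 2>&1 && [ -f ./bamnado ]; then
  ldd ./bamnado | missing_libs
fi
```

An empty result means every shared library was resolved.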

##### macOS: "Cannot be opened because the developer cannot be verified"

- Run: `xattr -d com.apple.quarantine bamnado`
- Or go to System Preferences → Security & Privacy and allow the app

##### Windows: "Windows protected your PC"

- Click "More info" → "Run anyway"
- Or add an exception in Windows Defender

### Method 2: Install via Cargo

If you have Rust and Cargo installed, you can install BamNado directly from crates.io:

```bash
cargo install bamnado
```

**Prerequisites:**

- Rust 1.70+ (install from [rustup.rs](https://rustup.rs/))
- Cargo (comes with Rust)

**Advantages:**

- Always gets the latest published version
- Automatically handles dependencies
- Works on any platform supported by Rust

### Method 3: Build from Source

For the latest development version or if pre-built binaries don't work on your system:

#### Prerequisites

- A Rust toolchain that supports the 2024 edition (Rust 1.85 or later)
- Git
- C compiler (for some dependencies)

**Install Rust if you haven't already:**

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
```

#### Build Steps

1. **Clone the repository**

   ```bash
   git clone https://github.com/alsmith151/BamNado.git
   cd BamNado
   ```

2. **Build the project**

   ```bash
   # Debug build (faster compilation, slower execution)
   cargo build

   # Release build (slower compilation, faster execution - recommended)
   cargo build --release
   ```

3. **Test the build**

   ```bash
   # For debug build
   ./target/debug/bamnado --version

   # For release build
   ./target/release/bamnado --version
   ```

4. **Install system-wide** (optional)

   ```bash
   # Install from source
   cargo install --path .

   # Or manually copy the binary
   sudo cp target/release/bamnado /usr/local/bin/
   ```

#### Build Troubleshooting

##### Common Issues

##### Error: "linker 'cc' not found"

- **Ubuntu/Debian:** `sudo apt install build-essential`
- **CentOS/RHEL:** `sudo yum groupinstall "Development Tools"`
- **macOS:** Install Xcode Command Line Tools: `xcode-select --install`
- **Windows:** Install Visual Studio Build Tools or use WSL

##### Error: "failed to run custom build command for 'openssl-sys'"

- **Ubuntu/Debian:** `sudo apt install libssl-dev pkg-config`
- **CentOS/RHEL:** `sudo yum install openssl-devel pkgconf-pkg-config`
- **macOS:** Usually works out of the box with Homebrew
- **Windows:** Consider using the pre-built binaries instead

### Quick Start Verification

After installation, verify everything works:

```bash
# Check version
bamnado --version

# See available commands
bamnado --help

# Test with a simple command (replace with your BAM file)
bamnado bam-coverage --bam /path/to/your/file.bam --output test.bedgraph
```

## Usage

### Available Commands

BamNado provides several commands for different BAM file operations:

- `bam-coverage` - Calculate coverage from a BAM file and write to a bedGraph or bigWig file
- `multi-bam-coverage` - Calculate coverage from multiple BAM files and write to a bedGraph or bigWig file
- `split-exogenous` - Split a BAM file into endogenous and exogenous reads
- `split` - Split a BAM file based on a set of defined filters
- `modify` - Modify BAM files with various transformations

For detailed help on any command, use:

```bash
bamnado <command> --help
```

### Example: Calculating Coverage from a BAM File

#### Command

```bash
bamnado bam-coverage \
  --bam input.bam \
  --output output.bedgraph \
  --bin-size 100 \
  --norm-method rpkm \
  --scale-factor 1.5 \
  --use-fragment \
  --proper-pair \
  --min-mapq 30 \
  --min-length 50 \
  --max-length 500 \
  --blacklisted-locations blacklist.bed \
  --whitelisted-barcodes barcodes.txt
```

#### Explanation of Options

- `--bam`: Path to the input BAM file.
- `--output`: Path to the output file (e.g., `bedGraph` or `BigWig`).
- `--bin-size`: Size of genomic bins for coverage calculation.
- `--norm-method`: Normalization method (`raw`, `rpkm`, or `cpm`).
- `--scale-factor`: Scaling factor for normalization.
- `--use-fragment`: Use fragments instead of individual reads for counting.
- `--proper-pair`: Include only properly paired reads.
- `--min-mapq`: Minimum mapping quality for reads to be included (default: 20).
- `--min-length`: Minimum read length (default: 20).
- `--max-length`: Maximum read length (default: 1000).
- `--blacklisted-locations`: Path to a BED file specifying regions to exclude.
- `--whitelisted-barcodes`: Path to a file with barcodes to include.
- `--strand`: Filter reads based on strand (both, forward, reverse).
- `--shift`: Shift options for the pileup (default: 0,0,0,0).
- `--truncate`: Truncate options for the pileup.
- `--ignore-scaffold`: Ignore scaffold chromosomes.
- `--read-group`: Selected read group.
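
To give a feel for what the `--norm-method` values mean, the sketch below applies the conventional definitions of CPM and RPKM to a single bin. These formulas are the standard ones used in coverage tools and are an assumption here, not taken from the BamNado source; `normalize_bin` and its arguments are hypothetical.

```bash
# Conventional per-bin normalization, NOT taken from the BamNado source:
#   CPM  = count * 1e6 / total_reads
#   RPKM = count * 1e9 / (total_reads * bin_size_bp)
# `normalize_bin` is a hypothetical helper for illustration.
normalize_bin() {
  local method="$1" count="$2" total="$3" bin_size="$4"
  awk -v m="$method" -v c="$count" -v t="$total" -v b="$bin_size" 'BEGIN {
    if (m == "cpm")       printf "%.2f\n", c * 1e6 / t
    else if (m == "rpkm") printf "%.2f\n", c * 1e9 / (t * b)
    else                  printf "%.2f\n", c   # raw: leave the count unchanged
  }'
}

# A 100 bp bin holding 50 reads, in a library of 10 million reads.
normalize_bin cpm 50 10000000 100    # prints 5.00
normalize_bin rpkm 50 10000000 100   # prints 50.00
```

Under these definitions RPKM additionally divides by the bin length in kilobases, so the two methods differ by a constant factor when every bin has the same size.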

#### Output

The output file (`output.bedgraph`) contains the normalized coverage for the BAM file, restricted to reads that pass the specified filters. To write a BigWig file instead, give `--output` a path with a `.bw` extension.
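
When several BAM files need the same treatment, the command can be wrapped in a loop. The sketch below uses only flags documented above and writes one BigWig file per input; `bigwig_name` is a hypothetical helper, and the bin size and normalization method are arbitrary example values.

```bash
# Derive a BigWig output path from a BAM path: sample.bam -> sample.bw
bigwig_name() {
  local bam="$1"
  echo "${bam%.bam}.bw"
}

# Run bam-coverage for every BAM file in the current directory.
# The flags are the ones documented above; the loop is illustrative.
for bam in ./*.bam; do
  [ -e "$bam" ] || continue   # skip when the glob matches nothing
  bamnado bam-coverage \
    --bam "$bam" \
    --output "$(bigwig_name "$bam")" \
    --bin-size 100 \
    --norm-method cpm
done
```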

### Additional Commands

#### Multi-BAM Coverage

To calculate coverage from multiple BAM files:

```bash
bamnado multi-bam-coverage \
  --bams file1.bam file2.bam \
  --output output.bedgraph \
  --bin-size 100 \
  --norm-method rpkm \
  --scale-factor 1.5 \
  --use-fragment \
  --proper-pair \
  --min-mapq 30 \
  --min-length 50 \
  --max-length 500
```

#### Split BAM File into Endogenous and Exogenous Reads

To split a BAM file into endogenous and exogenous reads:

```bash
bamnado split-exogenous \
  --input input.bam \
  --output output_prefix \
  --exogenous-prefix "exo_" \
  --stats stats.json \
  --allow-unknown-mapq \
  --proper-pair \
  --min-mapq 30 \
  --min-length 50 \
  --max-length 500
```

#### Split BAM File by Cell Barcodes

To split a BAM file based on cell barcodes:

```bash
bamnado split \
  --input input.bam \
  --output output_prefix \
  --whitelisted-barcodes barcodes.txt \
  --proper-pair \
  --min-mapq 30 \
  --min-length 50 \
  --max-length 500
```
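
Barcode lists exported from other tools often carry Windows line endings, stray whitespace, or duplicates. The sketch below tidies such a list before it is passed to `--whitelisted-barcodes`. It assumes the whitelist is a plain-text file with one barcode per line, which this README does not state explicitly; `clean_barcodes` is a hypothetical helper.

```bash
# Normalize a barcode list: strip carriage returns and surrounding
# whitespace, drop blank lines, and remove duplicates (first one wins).
# ASSUMPTION: the whitelist is plain text with one barcode per line.
clean_barcodes() {
  tr -d '\r' | awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); if ($0 != "" && !seen[$0]++) print }'
}

# Example with inline data; with real files you would run:
#   clean_barcodes < raw_barcodes.txt > barcodes.txt
printf 'AAACCTGA\r\n  TTTGGTCA \n\nAAACCTGA\n' | clean_barcodes
```

The example prints `AAACCTGA` and `TTTGGTCA`, each on its own line.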

#### Modify BAM Files

To modify BAM files with various transformations:

```bash
bamnado modify \
  --input input.bam \
  --output output_prefix \
  --proper-pair \
  --min-mapq 30 \
  --min-length 50 \
  --max-length 500 \
  --tn5-shift
```

The `modify` command supports various filtering options and transformations like Tn5 shifting for ATAC-seq data processing.
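
For readers unfamiliar with Tn5 shifting, the sketch below applies the offsets conventionally used for ATAC-seq (+4 bp on the forward strand, -5 bp on the reverse strand) to BED6 records. It illustrates the idea only: whether `--tn5-shift` uses exactly these offsets is an assumption, and `tn5_shift_bed` is a hypothetical helper that does not touch BAM files.

```bash
# Apply the conventional ATAC-seq Tn5 offsets to BED6 records on stdin:
# forward-strand reads move their start by +4 bp, reverse-strand reads
# move their end by -5 bp. ASSUMPTION: this mirrors the common convention,
# not necessarily the exact behaviour of `bamnado modify --tn5-shift`.
tn5_shift_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       $6 == "+" { $2 += 4 }
       $6 == "-" { $3 -= 5 }
       { print }'
}

printf 'chr1\t100\t150\tread1\t0\t+\nchr1\t200\t250\tread2\t0\t-\n' | tn5_shift_bed
```

The forward-strand record becomes `chr1 104 150` and the reverse-strand record becomes `chr1 200 245`, so each interval edge moves towards the presumed centre of the transposase binding site.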

## Help

For more details on available commands and options, run:

```bash
bamnado --help
```

Or for specific command help:

```bash
bamnado <command> --help
```

## Features

- **High Performance**: Built in Rust for maximum speed and memory efficiency
- **Cross-platform**: Available for Linux, macOS, and Windows
- **Multiple Output Formats**: Support for bedGraph and BigWig output formats
- **Flexible Filtering**: Comprehensive read filtering options including mapping quality, read length, proper pairs, and more
- **Single Cell Support**: Built-in support for cell barcode-based operations
- **MCC Workflows**: Specialized tools for Multi-modal Cellular Characterization
- **Strand-specific Analysis**: Support for strand-specific coverage calculations
- **Blacklist/Whitelist Support**: Region and barcode filtering capabilities

## Development

### Requirements

- A Rust toolchain that supports the 2024 edition (Rust 1.85 or later)
- Cargo package manager

### Building from Source

```bash
git clone https://github.com/alsmith151/BamNado.git
cd BamNado
cargo build --release
```

### Running Tests

```bash
cargo test
```

### Pre-commit Hooks

This project uses pre-commit hooks to ensure code quality and consistency. The hooks run the same checks as the CI workflow:

- Code formatting (`cargo fmt`)
- Linting (`cargo clippy`)
- Basic checks (`cargo check`)
- Tests (`cargo test` on push)

#### Quick Setup

Run the setup script to install and configure pre-commit hooks:

```bash
./setup-precommit.sh
```

#### Manual Setup

If you prefer to set up pre-commit manually:

```bash
# Install pre-commit (choose one method)
pip install pre-commit
# or: brew install pre-commit
# or: conda install -c conda-forge pre-commit

# Install the hooks
pre-commit install
pre-commit install --hook-type pre-push

# Test the setup
pre-commit run --all-files
```

#### Configuration Options

Two pre-commit configurations are available:

- `.pre-commit-config.yaml` - Full checks including `cargo check` on every commit
- `.pre-commit-config-fast.yaml` - Faster setup with formatting/linting only, tests on push

To use the fast configuration:

```bash
mv .pre-commit-config.yaml .pre-commit-config-full.yaml
mv .pre-commit-config-fast.yaml .pre-commit-config.yaml
pre-commit install
```

#### Useful Commands

```bash
pre-commit run --all-files       # Run all hooks on all files
pre-commit run cargo-fmt         # Run specific hook
pre-commit autoupdate            # Update hook versions
pre-commit uninstall             # Remove hooks
```

## Release Information

### Version 0.3.1 (2025-07-09)

- Initial public release with comprehensive BAM file manipulation tools
- Support for single cell and MCC (Multi-modal Cellular Characterization) use cases
- Cross-platform binary builds available for Linux, macOS, and Windows
- High-performance Rust implementation
- Complete CI/CD pipeline with automated testing and releases

For detailed changelog information, see [CHANGELOG.md](CHANGELOG.md).

## License

This project is licensed under either of:

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or <http://www.apache.org/licenses/LICENSE-2.0>)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or <http://opensource.org/licenses/MIT>)

at your option.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.