SAM/BAM file utilities and header manipulation.
This module provides utilities for working with SAM/BAM files, including:
- Checking and validating SAM header sort orders
- Reversing and reverse-complementing per-base tag values
- Template-coordinate sorting validation
- Test utilities for building SAM/BAM records
- Record-level utilities (position mapping, FR pair detection, CIGAR parsing)
- Read-pair clipping utilities
Sort Orders
The module supports several important sort orders:
- queryname - Reads sorted by query name (required for grouping by UMI)
- template-coordinate - Sort order defined by fgbio in which the reads of each template (same query name) are kept adjacent and templates are ordered by genomic position
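
As a rough illustration of how the two sort orders can be told apart from a header, the sketch below classifies a raw `@HD` line by its `SO` and `SS` fields. The `SortOrder` enum and `classify_hd_line` function are hypothetical names used only here and are not this module's API. The sketch assumes that template-coordinate is recorded as a sub-sort in `SS`, either bare or prefixed with the primary sort order.

```rust
/// Sort orders distinguished by this sketch (hypothetical, not the module's API).
#[derive(Debug, PartialEq, Eq)]
enum SortOrder {
    Queryname,
    TemplateCoordinate,
    Other,
}

/// Classify a raw, tab-separated `@HD` header line by its `SO` and `SS` fields.
fn classify_hd_line(line: &str) -> SortOrder {
    let mut so: Option<&str> = None;
    let mut ss: Option<&str> = None;

    // Skip the leading "@HD" record type, then read KEY:VALUE fields.
    for field in line.split('\t').skip(1) {
        if let Some((key, value)) = field.split_once(':') {
            match key {
                "SO" => so = Some(value),
                "SS" => ss = Some(value),
                _ => {}
            }
        }
    }

    // Assumption: SS is either "template-coordinate" or
    // "<sort-order>:template-coordinate"; keep only the last component.
    let sub_sort = ss.map(|s| s.rsplit(':').next().unwrap_or(s));

    if sub_sort == Some("template-coordinate") {
        SortOrder::TemplateCoordinate
    } else if so == Some("queryname") {
        SortOrder::Queryname
    } else {
        SortOrder::Other
    }
}
```

A stricter check might also require `GO:query` alongside the sub-sort; the sketch omits that to stay short.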
Tag Manipulation
Functions are provided to reverse or reverse-complement per-base tag values when reads are mapped to the negative strand, ensuring tag values match the orientation of the read sequence.
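
The idea can be sketched as follows. The function names `reverse_values` and `reverse_complement` are hypothetical and chosen for illustration; the module's own functions may differ in name and signature. Numeric per-base values only need reversing, while per-base values that are themselves bases need reversing and complementing.

```rust
/// Reverse per-base values (for example per-base depths or qualities) in place.
fn reverse_values<T>(values: &mut [T]) {
    values.reverse();
}

/// Reverse-complement a per-base tag whose values are nucleotide bases.
/// Symbols other than A, C, G, T (in either case) are left unchanged.
fn reverse_complement(bases: &[u8]) -> Vec<u8> {
    bases
        .iter()
        .rev()
        .map(|&b| match b {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            b'a' => b't',
            b'c' => b'g',
            b'g' => b'c',
            b't' => b'a',
            other => other, // e.g. N stays N
        })
        .collect()
}
```

For a read mapped to the negative strand, a per-base depth tag of `[1, 2, 3]` would become `[3, 2, 1]`, and a per-base bases tag of `ACGTN` would become `NACGT`.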
Record Utilities
The [record_utils] submodule provides utilities for working with individual records:
- [record_utils::read_pos_at_ref_pos] - Map reference position to read position
- [record_utils::is_fr_pair_from_tags] - Check if read is part of FR pair using tags
- [record_utils::mate_unclipped_start] / [record_utils::mate_unclipped_end] - Get mate boundaries from MC tag
- [record_utils::num_bases_extending_past_mate] - Calculate overlap with mate
- [record_utils::parse_cigar_string] - Parse CIGAR string to operations
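
To make the CIGAR-based utilities more concrete, here is a minimal, self-contained sketch of CIGAR parsing and of mapping a reference position to a read position. The `CigarOp` type and the `sketch_*` functions are hypothetical stand-ins, not the signatures of the `record_utils` functions. The sketch assumes 1-based positions and returns `None` when the reference position falls in a deletion, in a skipped region, or outside the alignment; the real functions may handle these cases differently.

```rust
/// One CIGAR operation: a length and an operator character (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CigarOp {
    len: usize,
    kind: char,
}

/// Parse a CIGAR string such as "2S3M1D2M". Returns `None` on malformed input;
/// "*" (no CIGAR available) yields an empty list.
fn sketch_parse_cigar(cigar: &str) -> Option<Vec<CigarOp>> {
    if cigar == "*" {
        return Some(Vec::new());
    }
    let mut ops = Vec::new();
    let mut len: Option<usize> = None;
    for c in cigar.chars() {
        if let Some(d) = c.to_digit(10) {
            // Accumulate the decimal length, failing on overflow.
            len = Some(len.unwrap_or(0).checked_mul(10)?.checked_add(d as usize)?);
        } else if "MIDNSHP=X".contains(c) {
            // An operator must be preceded by a length.
            ops.push(CigarOp { len: len.take()?, kind: c });
        } else {
            return None;
        }
    }
    // Trailing digits without an operator are malformed.
    if len.is_some() { None } else { Some(ops) }
}

/// Map a 1-based reference position to the 1-based read position aligned to it,
/// given the 1-based alignment start of the read.
fn sketch_read_pos_at_ref_pos(ops: &[CigarOp], start: usize, ref_pos: usize) -> Option<usize> {
    let mut r = start; // next reference position to be consumed
    let mut q = 1; // next read position to be consumed
    for op in ops {
        match op.kind {
            // Consume both read and reference.
            'M' | '=' | 'X' => {
                if ref_pos >= r && ref_pos < r + op.len {
                    return Some(q + (ref_pos - r));
                }
                r += op.len;
                q += op.len;
            }
            // Consume read only.
            'I' | 'S' => q += op.len,
            // Consume reference only: no read base is aligned here.
            'D' | 'N' => {
                if ref_pos >= r && ref_pos < r + op.len {
                    return None;
                }
                r += op.len;
            }
            // H and P consume neither.
            _ => {}
        }
    }
    None
}
```

With the CIGAR `2S3M1D2M` and an alignment start of 100, reference positions 100 to 102 map to read positions 3 to 5, position 103 lies in the deletion and maps to nothing, and positions 104 and 105 map to read positions 6 and 7.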