fgumi-sam 0.1.2

SAM/BAM record utilities, CIGAR parsing, and read-pair clipping for fgumi

SAM/BAM file utilities and header manipulation.

This module provides utilities for working with SAM/BAM files, including:

  • Checking and validating SAM header sort orders
  • Reversing and reverse-complementing per-base tag values
  • Template-coordinate sorting validation
  • Test utilities for building SAM/BAM records
  • Record-level utilities (position mapping, FR pair detection, CIGAR parsing)
  • Read-pair clipping utilities

Sort Orders

The module works with the following sort orders:

  • queryname - Reads sorted by query name (required for grouping by UMI)
  • template-coordinate - Sort order from fgbio in which the reads of each template are kept together and templates are ordered by genomic position
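As a rough illustration of what checking a header's sort order involves, the sketch below reads the `SO`, `GO`, and `SS` fields from the `@HD` line of a SAM header given as text. The type and function names (`HeaderSort`, `parse_header_sort`, `is_queryname_sorted`, `is_template_coordinate_sorted`) are hypothetical and are not this crate's API. It also assumes the sub-sort may be written either as `template-coordinate` or as `unsorted:template-coordinate`.

```rust
/// Sort-related fields of a SAM `@HD` header line (hypothetical helper type).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct HeaderSort {
    pub sort_order: Option<String>,  // SO
    pub group_order: Option<String>, // GO
    pub sub_sort: Option<String>,    // SS
}

/// Extract SO, GO and SS from the first `@HD` line of a SAM header text.
/// Returns an empty `HeaderSort` when there is no `@HD` line.
pub fn parse_header_sort(header: &str) -> HeaderSort {
    let mut out = HeaderSort::default();
    let Some(hd) = header.lines().find(|line| line.starts_with("@HD")) else {
        return out;
    };
    // Fields are tab-separated `KEY:value` pairs; the value itself may contain ':'.
    for field in hd.split('\t').skip(1) {
        if let Some((key, value)) = field.split_once(':') {
            match key {
                "SO" => out.sort_order = Some(value.to_string()),
                "GO" => out.group_order = Some(value.to_string()),
                "SS" => out.sub_sort = Some(value.to_string()),
                _ => {}
            }
        }
    }
    out
}

/// True when the header declares `SO:queryname`.
pub fn is_queryname_sorted(h: &HeaderSort) -> bool {
    h.sort_order.as_deref() == Some("queryname")
}

/// True when the sub-sort names template-coordinate.
/// Assumption: both `SS:template-coordinate` and
/// `SS:unsorted:template-coordinate` are accepted.
pub fn is_template_coordinate_sorted(h: &HeaderSort) -> bool {
    h.sub_sort
        .as_deref()
        .map_or(false, |ss| ss.rsplit(':').next() == Some("template-coordinate"))
}
```

A header check like this only confirms what the file declares. Confirming that the records really are in that order requires walking the records themselves.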

Tag Manipulation

For reads mapped to the negative strand, functions are provided to reverse or reverse-complement per-base tag values so that they match the orientation of the read sequence as stored in the record.
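The idea can be sketched in a few lines. The helpers below are illustrative stand-ins, not this crate's functions. Numeric per-base values (for example per-base depths) only need reversing, while tags that hold bases need reversing and complementing. Leaving non-ACGT bytes unchanged is an assumption of this sketch.

```rust
/// Reverse per-base values, e.g. an array tag with one number per base.
pub fn reverse_values<T: Clone>(values: &[T]) -> Vec<T> {
    values.iter().rev().cloned().collect()
}

/// Reverse-complement a per-base tag that stores bases as ASCII bytes.
pub fn reverse_complement(bases: &[u8]) -> Vec<u8> {
    bases.iter().rev().map(|&b| complement(b)).collect()
}

/// Complement a single base, preserving case.
/// Assumption: anything other than A/C/G/T (such as N) is passed through.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        b'a' => b't',
        b'c' => b'g',
        b'g' => b'c',
        b't' => b'a',
        other => other,
    }
}
```

In use, a caller would apply these only when the record's reverse-strand flag is set, so that index `i` of the tag still describes base `i` of the stored sequence.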

Record Utilities

The [record_utils] submodule provides utilities for working with individual records:

  • [record_utils::read_pos_at_ref_pos] - Map a reference position to the corresponding read position
  • [record_utils::is_fr_pair_from_tags] - Check whether a read is part of an FR (forward-reverse) pair using its tags
  • [record_utils::mate_unclipped_start] / [record_utils::mate_unclipped_end] - Get the mate's unclipped boundaries from the MC (mate CIGAR) tag
  • [record_utils::num_bases_extending_past_mate] - Count the bases of a read that extend past its mate
  • [record_utils::parse_cigar_string] - Parse a CIGAR string into a list of operations
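To make the CIGAR-based utilities above more concrete, here is a minimal, dependency-free sketch of the underlying arithmetic. The names and signatures (`CigarOp`, `parse_cigar`, `unclipped_start`, `unclipped_end`, `ref_to_read_pos`) are hypothetical and may differ from the functions in [record_utils]. Positions are assumed to be 1-based, and a reference position that falls inside a deletion or skip is assumed to map to no read position.

```rust
/// One CIGAR operation: a length and an operation character.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CigarOp {
    pub len: u32,
    pub kind: char,
}

/// Parse a CIGAR string such as "5S10M2D5M3S". Returns None on malformed input.
pub fn parse_cigar(cigar: &str) -> Option<Vec<CigarOp>> {
    if cigar.is_empty() {
        return None;
    }
    if cigar == "*" {
        return Some(Vec::new()); // CIGAR unavailable
    }
    let mut ops = Vec::new();
    let mut len: Option<u32> = None;
    for c in cigar.chars() {
        if let Some(d) = c.to_digit(10) {
            len = Some(len.unwrap_or(0).checked_mul(10)?.checked_add(d)?);
        } else if "MIDNSHP=X".contains(c) {
            // An operation must be preceded by a length.
            ops.push(CigarOp { len: len.take()?, kind: c });
        } else {
            return None;
        }
    }
    // Trailing digits without an operation are invalid.
    if len.is_some() { None } else { Some(ops) }
}

/// Alignment start minus any leading soft/hard clipping.
pub fn unclipped_start(start: i64, ops: &[CigarOp]) -> i64 {
    let leading: i64 = ops
        .iter()
        .take_while(|op| matches!(op.kind, 'S' | 'H'))
        .map(|op| i64::from(op.len))
        .sum();
    start - leading
}

/// Alignment end plus any trailing soft/hard clipping.
pub fn unclipped_end(start: i64, ops: &[CigarOp]) -> i64 {
    let ref_len: i64 = ops
        .iter()
        .filter(|op| matches!(op.kind, 'M' | 'D' | 'N' | '=' | 'X'))
        .map(|op| i64::from(op.len))
        .sum();
    let trailing: i64 = ops
        .iter()
        .rev()
        .take_while(|op| matches!(op.kind, 'S' | 'H'))
        .map(|op| i64::from(op.len))
        .sum();
    start + ref_len - 1 + trailing
}

/// 1-based read position aligned to `ref_pos`, counting soft-clipped bases.
/// Returns None if `ref_pos` is outside the alignment or inside a deletion/skip.
pub fn ref_to_read_pos(start: i64, ops: &[CigarOp], ref_pos: i64) -> Option<i64> {
    let mut r = start; // next reference position to be consumed
    let mut q = 1i64; // next read position to be consumed
    for op in ops {
        let n = i64::from(op.len);
        match op.kind {
            'M' | '=' | 'X' => {
                if ref_pos >= r && ref_pos < r + n {
                    return Some(q + (ref_pos - r));
                }
                r += n;
                q += n;
            }
            'I' | 'S' => q += n,
            'D' | 'N' => {
                if ref_pos >= r && ref_pos < r + n {
                    return None;
                }
                r += n;
            }
            _ => {} // H and P consume neither read nor reference
        }
    }
    None
}
```

The mate's unclipped boundaries follow the same arithmetic: parse the MC tag value as a CIGAR string and apply the two unclipped helpers to the mate's alignment start.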