
This module handles the equivalence class (EC) to gene mapping.

An EC represents a set of transcripts which the read is consistent with. This is then mapped to genes:

  • the EC resolves to a single gene, clear case for a count
  • the EC resolves to multiple genes (the read aligned to a part of the transcriptome which is ambiguous)
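
The two cases above can be illustrated with a minimal sketch. It assumes the EC-to-transcript and transcript-to-gene tables are plain maps; the names `Ec`, `Gene` and `ec_to_genes` are hypothetical and are not this module's API.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins for the module's EC and gene identifier types.
type Ec = u32;
type Gene = String;

/// Resolve an EC into the set of genes it is consistent with by mapping
/// each transcript of the EC to its gene. Unknown ECs or transcripts
/// contribute nothing, so the result may be empty.
fn ec_to_genes(
    ec: Ec,
    ec_to_transcripts: &HashMap<Ec, Vec<String>>,
    transcript_to_gene: &HashMap<String, Gene>,
) -> BTreeSet<Gene> {
    ec_to_transcripts
        .get(&ec)
        .into_iter()
        .flatten()
        .filter_map(|t| transcript_to_gene.get(t).cloned())
        .collect()
}
```

A result with exactly one gene is the clear case for a count; more than one gene means the read is ambiguous at the gene level.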

It gets more complicated if we have multiple ECs for an mRNA, i.e. multiple busrecords with the same CB/UMI but different ECs. This happens because mRNAs get fragmented after amplification, and different fragments can map to different parts of the transcriptome, hence yielding different ECs.
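
One way to reconcile several ECs observed for the same CB/UMI is to intersect their gene sets. The sketch below assumes exactly that strategy; the `Outcome` enum and the `unify` and `gene_set` functions are hypothetical illustrations and may differ from what `MappingResult` actually does.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of unifying the records of one CB/UMI.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// All records agree on exactly one gene.
    SingleGene(String),
    /// All records agree, but on more than one gene.
    Multimapped(BTreeSet<String>),
    /// No gene is shared by all records.
    Inconsistent,
}

/// Small helper to build a gene set from string literals.
fn gene_set(names: &[&str]) -> BTreeSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// Intersect the gene sets of all records sharing a CB/UMI.
/// An empty input has nothing to unify and is reported as Inconsistent here.
fn unify(gene_sets: &[BTreeSet<String>]) -> Outcome {
    let mut iter = gene_sets.iter();
    let mut shared = match iter.next() {
        Some(first) => first.clone(),
        None => return Outcome::Inconsistent,
    };
    for s in iter {
        // Keep only the genes that this record is also consistent with.
        shared.retain(|g| s.contains(g));
    }
    match shared.len() {
        0 => Outcome::Inconsistent,
        1 => Outcome::SingleGene(shared.into_iter().next().unwrap()),
        _ => Outcome::Multimapped(shared),
    }
}
```

For example, records consistent with {A, B} and {A} would unify to gene A under this scheme, while {A} and {B} share no gene and would be flagged as inconsistent.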

See Ec2GeneMapper and MappingResult


  • Thin wrapper around u64, a cell-barcode
  • Represents a busrecord (actually an observed CB/UMI combination) with a set of consistent genes (genes that this CB/UMI could map to) and the number of times this combination was seen in the busfile
  • Thin wrapper around u32, representing an Equivalence Class
  • Deals with the EC (equivalence class) to gene mapping. Resolves a given EC into a set of genes consistent with that EC
  • Gene identifier
  • Name of a particular gene


  • How to handle a CB/UMI that has an inconsistent mapping, i.e. one that maps to 2 different genes
  • MappingResult represents an attempt to unify several BusRecords with matching CB/UMI into an mRNA expressed from a single gene.


  • For a set of busrecords (coming from the same molecule), this function tries to map those records to genes consistently.
  • Group a set of busrecords (same CB/UMI) by genes they are consistent with
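
As a rough illustration of grouping by genes, the sketch below collapses records whose gene sets are identical and sums their counts. Treating "consistent with" as "identical gene set" is an assumption made for this example, and `Rec` and `group_by_genes` are hypothetical names rather than this module's items.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical record for one CB/UMI: the genes it is consistent with
/// and how often that combination was seen.
struct Rec {
    genes: BTreeSet<String>,
    count: u32,
}

/// Group records by their gene set, summing the counts within each group.
fn group_by_genes(records: &[Rec]) -> BTreeMap<BTreeSet<String>, u32> {
    let mut groups: BTreeMap<BTreeSet<String>, u32> = BTreeMap::new();
    for r in records {
        *groups.entry(r.genes.clone()).or_insert(0) += r.count;
    }
    groups
}
```

Using a `BTreeSet` as the key keeps the grouping independent of the order in which genes were collected.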