This module handles the equivalence class (EC) to gene mapping.
An EC represents a set of transcripts which the read is consistent with. This is then mapped to genes:
- the EC resolves to a single gene, clear case for a count
- the EC resolves to multiple genes (the read aligned to a part of the transcriptome which is ambiguous)
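The two cases above can be sketched as a lookup through two tables. This is a minimal illustration, not this module's implementation: the type aliases, the two lookup tables and the function name `resolve_ec` are all assumptions made for the example.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins for illustration; not the crate's actual types.
type EcId = u32;
type TranscriptId = u32;
type Gene = String;

/// Resolve an EC to the set of genes that its transcripts belong to.
/// An unknown EC, or transcripts without a gene entry, yield an empty set.
fn resolve_ec(
    ec: EcId,
    ec_to_transcripts: &HashMap<EcId, Vec<TranscriptId>>,
    transcript_to_gene: &HashMap<TranscriptId, Gene>,
) -> BTreeSet<Gene> {
    ec_to_transcripts
        .get(&ec)
        .into_iter()
        .flatten()
        .filter_map(|t| transcript_to_gene.get(t).cloned())
        .collect()
}
```

A result of size one corresponds to the clear case for a count; a larger result corresponds to the ambiguous case.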
It gets more complicated if we have multiple ECs for an mRNA,
i.e. multiple busrecords with the same CB/UMI but different ECs.
This happens because mRNAs get fragmented after amplification, and different fragments can map to different parts
of the transcriptome, hence yielding different ECs.
See Ec2GeneMapper and MappingResult
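One way to reconcile several ECs observed for the same CB/UMI is to intersect their gene sets. The sketch below assumes that approach; the enum `Unified` and the function `unify` are hypothetical names for this example and are not claimed to match MappingResult or any function in this module.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome type for this sketch.
#[derive(Debug, PartialEq)]
enum Unified {
    /// All records agree on exactly one gene.
    SingleGene(String),
    /// All records agree, but on more than one gene.
    Multimapped(BTreeSet<String>),
    /// No gene is shared by all records.
    Inconsistent,
}

/// Intersect the gene sets of all records that share a CB/UMI.
fn unify(gene_sets: &[BTreeSet<String>]) -> Unified {
    let mut iter = gene_sets.iter();
    let first = match iter.next() {
        Some(set) => set.clone(),
        // Assumption of this sketch: no records means nothing to agree on.
        None => return Unified::Inconsistent,
    };
    let shared: BTreeSet<String> =
        iter.fold(first, |acc, s| acc.intersection(s).cloned().collect());
    match shared.len() {
        0 => Unified::Inconsistent,
        1 => Unified::SingleGene(shared.into_iter().next().unwrap()),
        _ => Unified::Multimapped(shared),
    }
}
```

For example, fragments consistent with {A, B} and {B, C} would unify to gene B, while {A} and {B} would be inconsistent.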
Structs§
- CB
- Thin wrapper around u64, a cell-barcode
- CUGset
- Represents a busrecord (actually an observed CB/UMI combination) with a set of consistent genes (genes that this CB/UMI could map to) and the number of times this combination was seen in the busfile
- EC
- Thin wrapper around u32, representing an Equivalence Class
- Ec2GeneMapper
- Deals with the EC (equivalence class) to gene mapping. Resolves a given EC into a set of genes consistent with that EC
- GeneId
- Gene identifier
- Genename
- Name of a particular gene
Enums§
- InconsistentResolution
- How to handle a CB/UMI that has an inconsistent mapping, i.e. one that maps to 2 different genes
- MappingMode
- MappingResult
- MappingResult represents an attempt to unify several BusRecords with matching CB/UMI into an mRNA expressed from a single gene.
Functions§
- find_consistent
- For a set of busrecords (coming from the same molecule), this function tries to map those records to genes consistently.
- groubygene
- Group a set of busrecords (same CB/UMI) by genes they are consistent with
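As a rough illustration of grouping by consistent genes, the sketch below merges entries that share exactly the same gene set and sums their counts. It is one interpretation of the description above, not the signature or behavior of groubygene: `Observation` and `group_by_gene_set` are hypothetical names introduced for this example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical record: the genes one observed entry is consistent with,
/// plus how often that entry was seen.
struct Observation {
    genes: BTreeSet<String>,
    count: u32,
}

/// Merge observations with identical gene sets, summing their counts.
fn group_by_gene_set(observations: &[Observation]) -> BTreeMap<BTreeSet<String>, u32> {
    let mut grouped = BTreeMap::new();
    for obs in observations {
        *grouped.entry(obs.genes.clone()).or_insert(0) += obs.count;
    }
    grouped
}
```

Using the gene set itself as the map key keeps entries for {A} and {A, B} separate, so a later step can still decide how to treat the ambiguous ones.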