Codon translation, DNA complement, and VEP-style display formatting.
All functions operate on uppercase ASCII bytes (b'A', b'C', b'G',
b'T'). The codon lookup tables use an internal 6-bit encoding but callers
never see it — pass raw ASCII in, get ASCII out.
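The 6-bit encoding is internal and its layout is not documented here. One plausible reading is two bits per base, packed across the three codon positions. The sketch below illustrates that idea only; the names base_code and codon_index and the T, C, A, G ordering are assumptions for illustration, not this module's API.

```rust
/// Map one uppercase base to a 2-bit code.
/// The T, C, A, G order is an assumption borrowed from NCBI's table layout.
fn base_code(base: u8) -> Option<u8> {
    match base {
        b'T' => Some(0),
        b'C' => Some(1),
        b'A' => Some(2),
        b'G' => Some(3),
        _ => None, // N, lowercase, and any other byte has no 2-bit code
    }
}

/// Pack a codon into a 6-bit index in 0..64.
/// Returns None if any of the three bases is not A, C, G, or T.
fn codon_index(codon: [u8; 3]) -> Option<usize> {
    let first = base_code(codon[0])? as usize;
    let second = base_code(codon[1])? as usize;
    let third = base_code(codon[2])? as usize;
    Some((first << 4) | (second << 2) | third)
}
```

With this layout every ACGT codon maps to a unique slot in a 64-entry table, so a translation becomes a single array lookup.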
§Genetic codes
Two translation tables are provided:
- Standard (NCBI table 1): used for all autosomal and sex-chromosome transcripts. translate_codon uses this table.
- Vertebrate mitochondrial (NCBI table 2): used for chrM transcripts. Four codons differ: TGA→W, AGA→*, AGG→*, ATA→M.
translate_codon_for_transcript dispatches to the correct table based on
an is_mitochondrial flag (derived from transcript.chrom == "chrM").
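The relationship between the two tables can be sketched as a standard lookup plus four overrides. This is a minimal illustration, not the module's implementation: the function names ending in _sketch, the &[u8; 3] parameter type, and the use of b'X' for an untranslatable codon are all assumptions.

```rust
/// NCBI table 1 amino acids, with codons ordered T, C, A, G at each position
/// (TTT, TTC, TTA, TTG, TCT, ...).
const STANDARD: &[u8; 64] =
    b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

/// Translate with the standard code.
/// Returns b'X' for codons containing a non-ACGT byte (assumed convention).
fn standard_sketch(codon: &[u8; 3]) -> u8 {
    let mut index = 0usize;
    for &base in codon {
        match b"TCAG".iter().position(|&b| b == base) {
            Some(code) => index = index * 4 + code,
            None => return b'X',
        }
    }
    STANDARD[index]
}

/// Translate with the vertebrate mitochondrial code: the four codons that
/// differ are handled first, everything else falls back to the standard code.
fn mito_sketch(codon: &[u8; 3]) -> u8 {
    match codon {
        b"TGA" => b'W',
        b"AGA" | b"AGG" => b'*',
        b"ATA" => b'M',
        _ => standard_sketch(codon),
    }
}

/// Pick the table from a flag, as the dispatching function is described to do.
fn for_transcript_sketch(codon: &[u8; 3], is_mitochondrial: bool) -> u8 {
    if is_mitochondrial {
        mito_sketch(codon)
    } else {
        standard_sketch(codon)
    }
}
```

Under these assumptions, for_transcript_sketch(b"TGA", false) yields b'*' while for_transcript_sketch(b"TGA", true) yields b'W'.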
Functions§
- aa_three_letter: Convert a one-letter amino acid code to its three-letter abbreviation.
- complement: Complement a single DNA base. A↔T, C↔G. Non-ACGT bytes (e.g., N) pass through unchanged.
- complement_in_place: Complement a DNA sequence in place. Each base is replaced with its Watson-Crick complement; non-ACGT bytes are left unchanged.
- format_amino_acids: Format amino acid change for VEP display.
- format_amino_acids_indel: Format amino acid change for indels.
- format_codons: Format ref/alt codons with VEP’s capitalization convention.
- format_codons_indel: Format ref/alt codon sequences for an indel with VEP’s capitalisation convention.
- reverse_complement: Return the reverse complement of a DNA sequence.
- translate_codon: Translate a 3-base codon to a single amino acid character using the standard genetic code (NCBI table 1).
- translate_codon_for_transcript: Translate a codon using the appropriate genetic code for a transcript.
- translate_codon_mito: Translate a 3-base codon using the vertebrate mitochondrial genetic code (NCBI table 2).
- translate_sequence: Translate a DNA sequence to amino acids, codon by codon.
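The complement and codon-formatting behaviour described in the list can be sketched as follows. The signatures of the real functions are not shown on this page, so the names complement_base, reverse_complement_sketch, and format_codons_sketch, along with their parameter and return types, are hypothetical. The capitalization rule used here (changed bases upper case, unchanged bases lower case, ref and alt separated by a slash) follows VEP's codon display for substitutions as commonly documented; the sketch does not cover indels.

```rust
/// Complement one base; non-ACGT bytes such as N pass through unchanged.
fn complement_base(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'T' => b'A',
        b'C' => b'G',
        b'G' => b'C',
        other => other,
    }
}

/// Reverse the sequence and complement each base.
fn reverse_complement_sketch(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&b| complement_base(b)).collect()
}

/// VEP-style codon display for a substitution: positions where ref and alt
/// differ are upper case, positions where they agree are lower case.
fn format_codons_sketch(ref_codon: &[u8; 3], alt_codon: &[u8; 3]) -> String {
    let mut ref_out = String::with_capacity(3);
    let mut alt_out = String::with_capacity(3);
    for (&r, &a) in ref_codon.iter().zip(alt_codon.iter()) {
        if r == a {
            ref_out.push(r.to_ascii_lowercase() as char);
            alt_out.push(a.to_ascii_lowercase() as char);
        } else {
            ref_out.push(r.to_ascii_uppercase() as char);
            alt_out.push(a.to_ascii_uppercase() as char);
        }
    }
    format!("{}/{}", ref_out, alt_out)
}
```

For example, reverse_complement_sketch(b"AACG") produces the bytes of "CGTT", and format_codons_sketch(b"ACG", b"ATG") produces "aCg/aTg".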