Module codon

Codon translation, DNA complement, and VEP-style display formatting.

All functions operate on uppercase ASCII bytes (b'A', b'C', b'G', b'T'). The codon lookup tables use an internal 6-bit encoding, but callers never see it: pass raw ASCII in, get ASCII out.

Genetic codes

Two translation tables are provided:

  • Standard (NCBI table 1): used for all autosomal and sex-chromosome transcripts. translate_codon uses this table.
  • Vertebrate mitochondrial (NCBI table 2): used for chrM transcripts. translate_codon_mito uses this table. Four codons differ from the standard table: TGA→W (instead of stop), AGA→* and AGG→* (instead of R), and ATA→M (instead of I).

translate_codon_for_transcript dispatches to the correct table based on an is_mitochondrial flag (derived from transcript.chrom == "chrM").
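
As a rough illustration of the table selection described above, the following standalone sketch packs a codon into a 6-bit index and looks it up in one of two 64-entry tables. It is not this module's implementation: the names `base_rank`, `codon_index`, and `translate_sketch` are hypothetical, the T, C, A, G bit ordering is an assumption borrowed from the NCBI table layout, and returning `b'X'` for codons containing non-ACGT bytes is likewise assumed.

```rust
// Illustrative sketch only; names and bit layout are assumptions,
// not this module's real internals.

// Amino acids in NCBI order: bases ranked T, C, A, G, first base varying slowest.
const STANDARD: &[u8; 64] =
    b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
const VERT_MITO: &[u8; 64] =
    b"FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG";

/// Map an uppercase base to its 2-bit rank, or None for any other byte.
fn base_rank(base: u8) -> Option<usize> {
    match base {
        b'T' => Some(0),
        b'C' => Some(1),
        b'A' => Some(2),
        b'G' => Some(3),
        _ => None,
    }
}

/// Pack three bases into a 6-bit table index in the range 0..64.
fn codon_index(codon: [u8; 3]) -> Option<usize> {
    let first = base_rank(codon[0])?;
    let second = base_rank(codon[1])?;
    let third = base_rank(codon[2])?;
    Some((first << 4) | (second << 2) | third)
}

/// Pick the table from the flag, then look the codon up.
/// Returns b'X' when the codon contains a non-ACGT byte (assumed behavior).
fn translate_sketch(codon: [u8; 3], is_mitochondrial: bool) -> u8 {
    let table = if is_mitochondrial { VERT_MITO } else { STANDARD };
    codon_index(codon).map_or(b'X', |i| table[i])
}
```

Under these assumptions, `translate_sketch(*b"TGA", false)` yields `b'*'` while `translate_sketch(*b"TGA", true)` yields `b'W'`, matching the first of the four differences listed above.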

Functions

aa_three_letter
Convert a one-letter amino acid code to its three-letter abbreviation.
complement
Complement a single DNA base. A↔T, C↔G. Non-ACGT bytes (e.g., N) pass through unchanged.
complement_in_place
Complement a DNA sequence in place. Each base is replaced with its Watson-Crick complement; non-ACGT bytes are left unchanged.
format_amino_acids
Format amino acid change for VEP display.
format_amino_acids_indel
Format amino acid change for indels.
format_codons
Format ref/alt codons with VEP’s capitalization convention.
format_codons_indel
Format ref/alt codon sequences for an indel with VEP’s capitalization convention.
reverse_complement
Return the reverse complement of a DNA sequence.
translate_codon
Translate a 3-base codon to a single amino acid character using the standard genetic code (NCBI table 1).
translate_codon_for_transcript
Translate a codon using the appropriate genetic code for a transcript.
translate_codon_mito
Translate a 3-base codon using the vertebrate mitochondrial genetic code (NCBI table 2).
translate_sequence
Translate a DNA sequence to amino acids, codon by codon.
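
The complement rules stated for complement, complement_in_place, and reverse_complement (A↔T, C↔G, other bytes unchanged) can be sketched as follows. This is a minimal standalone illustration, not the module's code: the `_sketch` function names and their signatures are hypothetical, and the real functions may take or return different types.

```rust
// Illustrative sketch only; signatures are assumptions, not this module's API.

/// Complement one base. Bytes other than A, C, G, T (e.g. N) pass through.
fn complement_sketch(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'T' => b'A',
        b'C' => b'G',
        b'G' => b'C',
        other => other,
    }
}

/// Replace every base in the slice with its complement, keeping the order.
fn complement_in_place_sketch(seq: &mut [u8]) {
    for base in seq.iter_mut() {
        *base = complement_sketch(*base);
    }
}

/// Walk the sequence backwards, complementing each base into a new vector.
fn reverse_complement_sketch(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&base| complement_sketch(base)).collect()
}
```

With these definitions, `reverse_complement_sketch(b"ATGCN")` produces the bytes of `NGCAT`: the sequence is reversed, each of A, C, G, T is swapped for its partner, and the N is carried through untouched.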