Crate tinyhgvs

Lightweight HGVS variant parser.

tinyhgvs parses an HGVS variant string into explicit Rust structs and enums that describe:

  • the reference sequence context such as NM_004006.2 or NP_003997.1
  • the coordinate type such as coding DNA (c.), genomic DNA (g.), RNA (r.), or protein (p.)
  • the biological description itself, represented as either a nucleotide variant or a protein consequence

The crate is intentionally small. It aims to represent common, high-value HGVS syntax clearly, while returning structured errors for syntax families tracked in the unsupported inventory.

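The main entry points are:

  • parse_hgvs: parses an HGVS string into the HgvsVariant model.
  • HgvsVariant: the parsed variant returned on success.
  • ParseHgvsError: the structured error returned when a string cannot be parsed.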

§Reading the Parsed Model

HgvsVariant separates an HGVS expression into three top-level parts:

  • reference: the reference source for a variant.
  • coordinate_system: the one-letter HGVS coordinate type.
  • description: the nucleotide or protein variant description, including location and base edits or effects.
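To make the three-part split concrete, the following is a minimal, self-contained sketch that divides an HGVS string at the `:` and at the `.` that follows the coordinate letter, using only the standard library. It is not how tinyhgvs is implemented: `RawParts` and `split_hgvs` are hypothetical names introduced for illustration, and the sketch does not validate the description itself.

```rust
/// Hypothetical illustration type; not part of the tinyhgvs API.
#[derive(Debug, PartialEq)]
struct RawParts<'a> {
    /// Text before the `:`, such as `NM_004006.2`.
    reference: &'a str,
    /// One-letter coordinate type, such as `c` or `p`.
    coordinate_system: char,
    /// Everything after the coordinate prefix, such as `357+1G>A`.
    description: &'a str,
}

/// Splits `reference:x.description` into its three raw parts.
/// Returns `None` when the overall shape does not match.
fn split_hgvs(input: &str) -> Option<RawParts<'_>> {
    let (reference, rest) = input.split_once(':')?;
    let (kind, description) = rest.split_once('.')?;

    // The coordinate type must be exactly one lowercase ASCII letter.
    let mut letters = kind.chars();
    let coordinate_system = letters.next()?;
    if letters.next().is_some() || !coordinate_system.is_ascii_lowercase() {
        return None;
    }
    if reference.is_empty() || description.is_empty() {
        return None;
    }

    Some(RawParts { reference, coordinate_system, description })
}
```

Under these assumptions, `split_hgvs("NM_004006.2:c.357+1G>A")` yields the reference `NM_004006.2`, the coordinate letter `c`, and the description `357+1G>A`. The crate itself goes further and parses each part into typed structs and enums.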

§Examples

An intronic substitution at the first base after an exon/intron border:

use tinyhgvs::{NucleotideAnchor, NucleotideEdit, VariantDescription, parse_hgvs};

let variant = parse_hgvs("NM_004006.2:c.357+1G>A").unwrap();
let description = variant.description;

match description {
    VariantDescription::Nucleotide(nucleotide) => {
        assert_eq!(nucleotide.location.start.anchor, NucleotideAnchor::Absolute);
        assert_eq!(nucleotide.location.start.coordinate, 357);
        assert_eq!(nucleotide.location.start.offset, 1);
        assert!(matches!(
            nucleotide.edit,
            NucleotideEdit::Substitution { ref reference, ref alternate }
                if reference == "G" && alternate == "A"
        ));
    }
    VariantDescription::Protein(_) => unreachable!("expected nucleotide variant"),
}
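The example above reads `357+1` as coordinate 357 with offset 1. As a rough illustration of that decomposition, here is a standalone sketch that splits such a position using only the standard library. `split_offset` is a hypothetical helper, not a tinyhgvs function, and the `i64` types are an assumption; it handles only plain positions and deliberately rejects the `-` and `*` prefixed forms used for UTR positions.

```rust
/// Splits a position such as "357+1" or "358-2" into (coordinate, offset).
/// Hypothetical helper for illustration; not part of tinyhgvs.
fn split_offset(position: &str) -> Option<(i64, i64)> {
    // Locate the sign that introduces the intronic offset, if any.
    match position.find(|c: char| c == '+' || c == '-') {
        // No sign: an exonic position with zero offset.
        None => Some((position.parse::<i64>().ok()?, 0)),
        Some(index) => {
            // Digits before the sign form the anchoring coordinate.
            let coordinate = position[..index].parse::<i64>().ok()?;
            // The signed remainder ("+1", "-2") parses directly as an integer.
            let offset = position[index..].parse::<i64>().ok()?;
            Some((coordinate, offset))
        }
    }
}
```

A leading sign, as in `-5`, leaves an empty coordinate and therefore returns `None`; supporting those forms would need an explicit anchor, which is the role `NucleotideAnchor` plays in the parsed model.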

A nonsense variant that causes early termination at the protein level:

use tinyhgvs::{CoordinateSystem, ProteinEffect, VariantDescription, parse_hgvs};

let variant = parse_hgvs("NP_003997.1:p.Trp24Ter").unwrap();
assert_eq!(variant.coordinate_system, CoordinateSystem::Protein);

match variant.description {
    VariantDescription::Protein(protein) => {
        assert!(!protein.is_predicted);
        assert!(matches!(protein.effect, ProteinEffect::Edit { .. }));
    }
    VariantDescription::Nucleotide(_) => unreachable!("expected protein variant"),
}

Unsupported syntax is reported with a stable diagnostic code:

use tinyhgvs::parse_hgvs;

let error = parse_hgvs("NC_000001.11:g.[123G>A;345del]").unwrap_err();
assert_eq!(error.code(), "unsupported.allele");
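Because the code is a plain string, callers can branch on it without inspecting the error message. The sketch below assumes that codes follow a dotted `family.detail` shape, which is inferred only from the single `unsupported.allele` example above; `code_family` is a hypothetical helper and not part of tinyhgvs.

```rust
/// Returns the part of a diagnostic code before the first `.`,
/// or the whole code when it contains no `.`.
/// Hypothetical helper; the dotted shape is an assumption.
fn code_family(code: &str) -> &str {
    code.split_once('.').map_or(code, |(family, _detail)| family)
}
```

With this helper, `code_family("unsupported.allele")` evaluates to `"unsupported"`, so a caller could treat all unsupported-syntax failures alike, if the crate's other codes do share that prefix.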

Structs§

Accession
A parsed accession with optional version.
CopiedSequenceItem
Sequence copied from the same or another reference.
HgvsVariant
A parsed HGVS variant.
Interval
Inclusive interval used for nucleotide and protein locations.
LiteralSequenceItem
Literal inserted or replacement bases such as A or AGGG.
NucleotideCoordinate
Nucleotide coordinate with explicit anchor and offset.
NucleotideRepeatBlock
One repeated block/unit in a nucleotide repeat variant description.
NucleotideVariant
Parsed nucleotide location and edit.
ParseHgvsError
A structured error returned when an HGVS string cannot be parsed.
ProteinCoordinate
Protein coordinate written as amino-acid symbol plus ordinal.
ProteinSequence
Ordered protein insertion or replacement sequence.
ProteinVariant
Parsed protein consequence.
ReferenceSpec
Reference metadata preceding the : in an HGVS expression.
RepeatSequenceItem
Repeated inserted or replacement sequence such as N[12].

Enums§

CoordinateSystem
HGVS coordinate system.
NucleotideAnchor
Anchor used by nucleotide coordinates.
NucleotideEdit
Supported nucleotide edit families.
NucleotideSequenceItem
A single sequence item inside a nucleotide insertion or deletion-insertion.
ParseHgvsErrorKind
High-level classes of parse failures exposed by tinyhgvs.
ProteinEdit
Supported protein edit families in the first release.
ProteinEffect
Supported protein consequence forms.
VariantDescription
Top-level variant description for nucleotide or protein syntax.

Functions§

parse_hgvs
Parses an HGVS string into the Rust HgvsVariant model.