Lightweight HGVS variant parser.
tinyhgvs parses an HGVS variant into explicit Rust structs and enums
that describe:
- the reference sequence context, such as NM_004006.2 or NP_003997.1
- the coordinate type, such as coding DNA (c.), genomic DNA (g.), RNA (r.), or protein (p.)
- the biological description itself, represented as either a nucleotide variant or a protein consequence
The crate is intentionally small. It aims to represent common, high-value HGVS syntax clearly, while returning structured errors for syntax families tracked in the unsupported inventory.
The main entry points are:
- parse_hgvs to parse a string into HgvsVariant
- ParseHgvsError to inspect invalid or unsupported input
§Reading the Parsed Model
HgvsVariant splits an HGVS expression into three top-level parts:
- reference: the reference source for a variant.
- coordinate_system: the one-letter HGVS coordinate type.
- description: the nucleotide or protein variant description, including location and base edits or effects.
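The three-part split can be illustrated with a minimal, standard-library-only sketch. This is not how tinyhgvs is implemented: `CoordKind` and `split_hgvs` are hypothetical names introduced here for illustration, the sketch recognises only the four coordinate letters mentioned above, and it leaves the description as an unparsed string.

```rust
/// Hypothetical stand-in for a coordinate-system enum (not the crate's type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum CoordKind {
    CodingDna,
    GenomicDna,
    Rna,
    Protein,
}

/// Splits `REFERENCE:k.DESCRIPTION` into its three top-level parts.
/// Returns `None` when the `:` separator or the `.` after the coordinate
/// letter is missing, when the letter is not one of c, g, r, p, or when
/// the reference or description is empty.
fn split_hgvs(input: &str) -> Option<(&str, CoordKind, &str)> {
    // Everything before the first ':' is the reference.
    let (reference, rest) = input.split_once(':')?;
    // The coordinate letter is followed by '.', then the description.
    let (kind, description) = rest.split_once('.')?;
    let kind = match kind {
        "c" => CoordKind::CodingDna,
        "g" => CoordKind::GenomicDna,
        "r" => CoordKind::Rna,
        "p" => CoordKind::Protein,
        _ => return None,
    };
    if reference.is_empty() || description.is_empty() {
        return None;
    }
    Some((reference, kind, description))
}
```

Under these assumptions, `split_hgvs("NM_004006.2:c.357+1G>A")` yields the reference `NM_004006.2`, the coding-DNA kind, and the description `357+1G>A`. The crate goes further and parses the description into structured location and edit values.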
§Examples
A substitution at an intronic position next to an exon/intron border:
use tinyhgvs::{NucleotideAnchor, NucleotideEdit, VariantDescription, parse_hgvs};
let variant = parse_hgvs("NM_004006.2:c.357+1G>A").unwrap();
let description = variant.description;
match description {
    VariantDescription::Nucleotide(nucleotide) => {
        assert_eq!(nucleotide.location.start.anchor, NucleotideAnchor::Absolute);
        assert_eq!(nucleotide.location.start.coordinate, 357);
        assert_eq!(nucleotide.location.start.offset, 1);
        assert!(matches!(
            nucleotide.edit,
            NucleotideEdit::Substitution { ref reference, ref alternate }
                if reference == "G" && alternate == "A"
        ));
    }
    VariantDescription::Protein(_) => unreachable!("expected nucleotide variant"),
}
A nonsense variant that causes early termination at the protein level:
use tinyhgvs::{CoordinateSystem, ProteinEffect, VariantDescription, parse_hgvs};
let variant = parse_hgvs("NP_003997.1:p.Trp24Ter").unwrap();
assert_eq!(variant.coordinate_system, CoordinateSystem::Protein);
match variant.description {
    VariantDescription::Protein(protein) => {
        assert!(!protein.is_predicted);
        assert!(matches!(protein.effect, ProteinEffect::Edit { .. }));
    }
    VariantDescription::Nucleotide(_) => unreachable!("expected protein variant"),
}
Unsupported syntax is reported with a stable diagnostic code:
use tinyhgvs::parse_hgvs;
let error = parse_hgvs("NC_000001.11:g.[123G>A;345del]").unwrap_err();
assert_eq!(error.code(), "unsupported.allele");
Structs§
- Accession: A parsed accession with optional version.
- CopiedSequenceItem: Sequence copied from the same or another reference.
- HgvsVariant: A parsed HGVS variant.
- Interval: Inclusive interval used for nucleotide and protein locations.
- LiteralSequenceItem: Literal inserted or replacement bases such as A or AGGG.
- NucleotideCoordinate: Nucleotide coordinate with explicit anchor and offset.
- NucleotideRepeatBlock: One repeated block/unit in a nucleotide repeat variant description.
- NucleotideVariant: Parsed nucleotide location and edit.
- ParseHgvsError: A structured error returned when an HGVS string cannot be parsed.
- ProteinCoordinate: Protein coordinate written as amino-acid symbol plus ordinal.
- ProteinSequence: Ordered protein insertion or replacement sequence.
- ProteinVariant: Parsed protein consequence.
- ReferenceSpec: Reference metadata preceding the ':' in an HGVS expression.
- RepeatSequenceItem: Repeated inserted or replacement sequence such as N[12].
Enums§
- CoordinateSystem: HGVS coordinate system.
- NucleotideAnchor: Anchor used by nucleotide coordinates.
- NucleotideEdit: Supported nucleotide edit families.
- NucleotideSequenceItem: A single sequence item inside a nucleotide insertion or deletion-insertion.
- ParseHgvsErrorKind: High-level classes of parse failures exposed by tinyhgvs.
- ProteinEdit: Supported protein edit families in the first release.
- ProteinEffect: Supported protein consequence forms.
- VariantDescription: Top-level variant description for nucleotide or protein syntax.
Functions§
- parse_hgvs: Parses an HGVS string into the Rust HgvsVariant model.