Crate mzcv


§mzcv

Handle ontologies/controlled vocabularies, with data obtained from any of several sources:

  1. Download from a URL (only with feature http) CVIndex::update_from_url
  2. Parse from a given file CVIndex::update_from_path
  3. Open binary cache from the standardised cache location
  4. Update with values directly in memory CVIndex::update
  5. Use statically included data CVSource::static_data

When downloading a file, the crate saves it to the standardised location and compresses it with gzip to save space. When opening a file (such as a downloaded file), it first parses the file. If parsing succeeds, it places the file at the standardised location and stores the parsed data in the binary cache. If parsing fails, it reports the errors to the caller of the method and also leaves them next to the standard file location for end-user convenience.
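The open-and-parse flow described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: the `parse` and `open_file` functions, the tab-separated input format, and the `.err` file extension are all assumptions made for the example, and gzip compression and the binary cache are left out.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical parser: every non-empty line must look like `accession<TAB>name`.
fn parse(text: &str) -> Result<Vec<(String, String)>, Vec<String>> {
    let mut entries = Vec::new();
    let mut errors = Vec::new();
    for (number, line) in text.lines().enumerate() {
        if line.trim().is_empty() {
            continue;
        }
        match line.split_once('\t') {
            Some((accession, name)) => entries.push((accession.to_string(), name.to_string())),
            None => errors.push(format!("line {}: missing tab separator", number + 1)),
        }
    }
    if errors.is_empty() { Ok(entries) } else { Err(errors) }
}

/// Parse `source`. On success copy it to the `standard` location; on failure
/// return the errors and also leave them in a sibling file of `standard`
/// (the `.err` extension is an assumption of this sketch).
fn open_file(source: &Path, standard: &Path) -> Result<Vec<(String, String)>, Vec<String>> {
    let text = fs::read_to_string(source).map_err(|e| vec![e.to_string()])?;
    match parse(&text) {
        Ok(entries) => {
            fs::copy(source, standard).map_err(|e| vec![e.to_string()])?;
            // A real implementation would also write the binary cache here.
            Ok(entries)
        }
        Err(errors) => {
            fs::write(standard.with_extension("err"), errors.join("\n"))
                .map_err(|e| vec![e.to_string()])?;
            Err(errors)
        }
    }
}
```

Returning the errors and writing them to disk serves two audiences: the calling code can react programmatically, while an end user inspecting the cache directory finds the reason a file was rejected.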

§Lookup

There are three major ways of looking up data: index, name, and fuzzy match search. The first two use HashMaps for (average) constant-time lookups. The third uses a trigram index to select candidates (when search-index is turned on; otherwise it loops over all data) and then scores the candidates by Levenshtein distance to find good enough matches.
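To make the fuzzy lookup concrete, here is a minimal sketch of the general technique: a trigram index narrows the candidates, and Levenshtein distance ranks them. The `TrigramIndex` type, its methods, and the `max_distance` parameter are hypothetical names for this illustration and are not taken from the crate.

```rust
use std::collections::{HashMap, HashSet};

/// Lowercased character trigrams of `text` (empty when shorter than three characters).
fn trigrams(text: &str) -> HashSet<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars.windows(3).map(|w| w.iter().collect::<String>()).collect()
}

/// Two-row dynamic-programming Levenshtein distance over characters.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut previous: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut current = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = previous[j] + usize::from(ca != cb);
            let deletion = previous[j + 1] + 1;
            let insertion = current[j] + 1;
            current.push(substitution.min(deletion).min(insertion));
        }
        previous = current;
    }
    previous[b.len()]
}

/// Hypothetical index: trigram -> positions of the names containing it.
struct TrigramIndex {
    names: Vec<String>,
    index: HashMap<String, Vec<usize>>,
}

impl TrigramIndex {
    fn new(names: &[&str]) -> Self {
        let mut index: HashMap<String, Vec<usize>> = HashMap::new();
        for (position, name) in names.iter().enumerate() {
            for trigram in trigrams(name) {
                index.entry(trigram).or_default().push(position);
            }
        }
        Self { names: names.iter().map(|n| n.to_string()).collect(), index }
    }

    /// Candidates share at least one trigram with the query; they are then
    /// filtered and sorted by case-insensitive Levenshtein distance.
    fn search(&self, query: &str, max_distance: usize) -> Vec<(usize, &str)> {
        let candidates: HashSet<usize> = trigrams(query)
            .iter()
            .filter_map(|t| self.index.get(t))
            .flatten()
            .copied()
            .collect();
        let query = query.to_lowercase();
        let mut hits: Vec<(usize, &str)> = candidates
            .into_iter()
            .map(|p| (levenshtein(&query, &self.names[p].to_lowercase()), self.names[p].as_str()))
            .filter(|(distance, _)| *distance <= max_distance)
            .collect();
        hits.sort();
        hits
    }
}
```

In this sketch, a query with no trigram in common with any name yields no candidates at all. That is the trade-off of the index: it avoids computing the edit distance against every entry, at the cost of missing matches that share no three-character run with the query.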

§Compilation features

  • http allows downloading ontologies from the internet
  • serde allows using serde (de)serialisation to store data in the cache
  • search-index builds a trigram index to speed up fuzzy matching

Macros§

accession_code
Create an accession code. With a numeric code the result is valid in a match arm; with an alphanumeric code it is const compatible.
curie
A curie. For documentation reasons a full term definition is also allowed; both forms give the exact same Curie result.
term
Create a new term, for example term!(MS:1002357|PSM-level probability). The accession/name combination is not validated.

Structs§

CVFile
The description of a file that is used to build a controlled vocabulary.
CVIndex
An index into a CV which contains the main ways of handling CVs.
CVVersion
Version information for a CV
Curie
A CURIE is a namespace + accession identifier
HashBufReader
A design inspired by std::io::BufReader that also calculates the hash of the file being read.
Lines
An iterator over the lines of an instance of HashBufReader.
OboIdentifier
A (usually) unique identifier for an entry in an ontology
OboOntology
An Obo ontology. This can be read from a file with Self::from_file or from a raw reader with Self::from_raw.
OboStanza
An Obo stanza.
OboSynonym
A synonym in an Obo stanza
Term
A term, a CURIE plus its name
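
As an illustration of the "namespace + accession" and "CURIE plus its name" descriptions above, the following sketch parses the `MS:1002357|PSM-level probability` notation by hand. `SketchCurie`, `SketchTerm`, `parse_curie`, and `parse_term` are hypothetical stand-ins and do not reflect the crate's actual Curie and Term types or their field layout.

```rust
/// Hypothetical stand-in for a CURIE: a namespace prefix plus an accession.
#[derive(Debug, PartialEq, Eq)]
struct SketchCurie {
    namespace: String,
    accession: String,
}

/// Hypothetical stand-in for a term: a CURIE plus its name.
#[derive(Debug, PartialEq, Eq)]
struct SketchTerm {
    curie: SketchCurie,
    name: String,
}

/// Splits `NAMESPACE:ACCESSION` at the first colon; both halves must be non-empty.
fn parse_curie(text: &str) -> Option<SketchCurie> {
    let (namespace, accession) = text.trim().split_once(':')?;
    if namespace.is_empty() || accession.is_empty() {
        return None;
    }
    Some(SketchCurie {
        namespace: namespace.to_string(),
        accession: accession.to_string(),
    })
}

/// Parses `NAMESPACE:ACCESSION|name`. As with the `term!` macro described
/// above, the accession/name combination is not validated.
fn parse_term(text: &str) -> Option<SketchTerm> {
    let (curie, name) = text.split_once('|')?;
    Some(SketchTerm {
        curie: parse_curie(curie)?,
        name: name.trim().to_string(),
    })
}
```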

Enums§

AccessionCode
An accession code. It can either be a numeric code (a u32, so up to roughly 4 billion, giving 9 fully utilised digits) or an ASCII alphanumeric code (case-sensitive) of 1 to 8 characters.
AccessionCodeParseError
An error when parsing an accession code
CURIEParsingError
An error that occurred when parsing a CURIE
CVCompression
The used compression of the source CV.
CVError
An error encountered while parsing a CV
ControlledVocabulary
Controlled vocabularies. All Ontobee-listed controlled vocabularies are present, as well as PRIDE and RESID.
OboError
All possible errors when reading an Obo file
OboStanzaType
The type for an Obo stanza
OboValue
The value that a property value tag can have in Obo
SynonymScope
The type or scope for a synonym
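
The two shapes of accession code described above (a numeric code that fits in a u32, or a case-sensitive ASCII alphanumeric code of 1 to 8 characters) can be sketched as a small classifier. `SketchAccession` and `parse_accession` are hypothetical names for this example. How the crate itself treats leading zeros, overflow, or other edge cases is not stated here, so the choices below are assumptions.

```rust
/// Hypothetical stand-in for the two kinds of accession code.
#[derive(Debug, PartialEq, Eq)]
enum SketchAccession {
    Numeric(u32),
    Alphanumeric(String),
}

fn parse_accession(text: &str) -> Option<SketchAccession> {
    if text.is_empty() {
        return None;
    }
    if text.bytes().all(|b| b.is_ascii_digit()) {
        // Purely numeric: must fit in a u32. Leading zeros are not preserved
        // by this sketch, and values above u32::MAX are rejected.
        return text.parse::<u32>().ok().map(SketchAccession::Numeric);
    }
    if text.len() <= 8 && text.bytes().all(|b| b.is_ascii_alphanumeric()) {
        // Case is kept as-is because the code is case-sensitive.
        return Some(SketchAccession::Alphanumeric(text.to_string()));
    }
    None
}
```

Since u32::MAX is 4,294,967,295, every 9-digit number fits but only some 10-digit numbers do, which is why the description speaks of 9 fully utilised digits.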

Traits§

CVData
A data element from a CV. Note that technically an implementation could be made that does not have any index, name, or keywords for a data element. Such an element would be kept and stored but would not be accessible via anything other than crate::CVIndex::data.
CVSource
Implement this trait to create a new CV. The best way of using this is with a ZST (zero-sized type).
CVStructure
A structure to contain CVData elements but leave the implementation up to the needs of the specific CV.