§mzcv
Handle ontologies/controlled vocabularies, with data loaded from many sources:
- Download from a URL (only with feature http): CVIndex::update_from_url
- Parse from a given file: CVIndex::update_from_path
- Open binary cache from the standardised cache location
- Update with values directly in memory: CVIndex::update
- Use statically included data: CVSource::static_data
A downloaded file is stored at the standardised location and compressed with gzip so that it does not take up too much space. When opening a file (such as a downloaded file) it is parsed first. If parsing succeeds, the file is placed at the standardised location and the parsed data is stored in the binary cache. If parsing fails, the errors are reported to the caller of the method and are also left next to the standard file location for end user convenience.
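The parse-then-store flow described above can be sketched with the standard library alone. This is a minimal sketch and not the crate's implementation: the function names, the file names (cv.tsv, cv.cache, cv.errors.txt), and the tab-separated toy format are all assumptions made for illustration, and the gzip step is left out because it needs an external crate.

```rust
use std::{fs, io, path::Path};

/// Toy parser (an assumption for this sketch): one `id<TAB>name` entry per line.
/// Collects every error instead of stopping at the first one.
fn parse(text: &str) -> Result<Vec<(String, String)>, Vec<String>> {
    let mut entries = Vec::new();
    let mut errors = Vec::new();
    for (number, line) in text.lines().enumerate() {
        match line.split_once('\t') {
            Some((id, name)) => entries.push((id.to_string(), name.to_string())),
            None => errors.push(format!("line {}: expected `id<TAB>name`", number + 1)),
        }
    }
    if errors.is_empty() { Ok(entries) } else { Err(errors) }
}

/// Hypothetical stand-in for the described flow, not the crate's API.
/// Outer `io::Result` covers file system failures, inner `Result` covers parse errors.
fn update_from_path(source: &Path, cache_dir: &Path) -> io::Result<Result<usize, Vec<String>>> {
    let text = fs::read_to_string(source)?;
    fs::create_dir_all(cache_dir)?;
    let stored = cache_dir.join("cv.tsv");
    match parse(&text) {
        Ok(entries) => {
            // Parsing succeeded: keep the source file at the standard location
            // and store the parsed data (plain text here, binary in the real cache).
            fs::write(&stored, &text)?;
            let cache: String = entries.iter().map(|(id, name)| format!("{id}={name}\n")).collect();
            fs::write(stored.with_extension("cache"), cache)?;
            Ok(Ok(entries.len()))
        }
        Err(errors) => {
            // Parsing failed: leave the errors next to the standard file location
            // and also hand them back to the caller.
            fs::write(stored.with_extension("errors.txt"), errors.join("\n"))?;
            Ok(Err(errors))
        }
    }
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("cv_sketch_{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let cache = dir.join("cache");

    let good = dir.join("good.tsv");
    fs::write(&good, "MS:1002357\tPSM-level probability\n")?;
    assert_eq!(update_from_path(&good, &cache)?, Ok(1));
    assert!(cache.join("cv.cache").exists());

    let bad = dir.join("bad.tsv");
    fs::write(&bad, "no tab here\n")?;
    assert!(update_from_path(&bad, &cache)?.is_err());
    assert!(cache.join("cv.errors.txt").exists());

    fs::remove_dir_all(&dir)
}
```

Returning the parse errors and also writing them to disk mirrors the two reporting paths in the description: the caller can react programmatically, while an end user can inspect the error file without rerunning anything.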
§Lookup
There are three major ways of looking up data: index, name, and fuzzy match search. The first two use HashMaps to do constant time lookups. The third uses a trigram index (when search-index is turned on; otherwise it loops over all data) and then loops over all matches, using Levenshtein distance to find good enough matches.
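To make the lookup strategies concrete, here is a minimal, self-contained sketch of an exact HashMap lookup by name combined with a trigram index and Levenshtein scoring. ToyIndex and all of its methods are hypothetical names for this illustration; the sketch does not reproduce the internals of CVIndex, and the case-insensitive matching is an assumption.

```rust
use std::collections::{HashMap, HashSet};

/// Classic dynamic-programming Levenshtein distance, keeping only two rows.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            cur.push(substitution.min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// All lowercase three-character windows of a string.
fn trigrams(s: &str) -> HashSet<String> {
    let chars: Vec<char> = s.to_lowercase().chars().collect();
    chars.windows(3).map(|w| w.iter().collect::<String>()).collect()
}

/// Hypothetical toy index; not the crate's `CVIndex`.
struct ToyIndex {
    names: Vec<String>,
    by_name: HashMap<String, usize>,
    by_trigram: HashMap<String, Vec<usize>>,
}

impl ToyIndex {
    fn new(names: &[&str]) -> Self {
        let mut index = Self {
            names: Vec::new(),
            by_name: HashMap::new(),
            by_trigram: HashMap::new(),
        };
        for (i, &name) in names.iter().enumerate() {
            index.names.push(name.to_string());
            index.by_name.insert(name.to_lowercase(), i);
            for t in trigrams(name) {
                index.by_trigram.entry(t).or_default().push(i);
            }
        }
        index
    }

    /// Exact lookup by name through a HashMap (case-insensitive in this sketch).
    fn get(&self, name: &str) -> Option<usize> {
        self.by_name.get(&name.to_lowercase()).copied()
    }

    /// Fuzzy lookup: collect entries sharing at least one trigram with the query,
    /// then keep those within `max_distance` edits, closest first.
    /// Without a trigram index the candidate set would simply be every entry.
    fn search(&self, query: &str, max_distance: usize) -> Vec<(&str, usize)> {
        let candidates: HashSet<usize> = trigrams(query)
            .iter()
            .filter_map(|t| self.by_trigram.get(t))
            .flatten()
            .copied()
            .collect();
        let query = query.to_lowercase();
        let mut hits: Vec<(&str, usize)> = candidates
            .into_iter()
            .map(|i| (self.names[i].as_str(), levenshtein(&query, &self.names[i].to_lowercase())))
            .filter(|(_, distance)| *distance <= max_distance)
            .collect();
        hits.sort_by(|a, b| a.1.cmp(&b.1).then(a.0.cmp(b.0)));
        hits
    }
}

fn main() {
    let index = ToyIndex::new(&["peptide", "peptidoform", "protein", "proteoform"]);
    assert_eq!(index.get("Protein"), Some(2));
    assert_eq!(index.search("protien", 2), vec![("protein", 2)]);
    assert_eq!(index.search("peptid", 2), vec![("peptide", 1)]);
}
```

The trigram index only narrows down which entries get scored; the Levenshtein threshold decides what counts as a good enough match, so the result is the same with or without the index as long as every true match shares at least one trigram with the query.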
§Compilation features
- http: allow downloading ontologies from the internet
- serde: allow using serde (de)serialise to store data in the cache
- search-index: builds a trigram index to speed up fuzzy matching
Macros§
- accession_code
- Create an accession code. Using this with a numeric code is match arm valid; using this with an alphanumeric code is const compatible.
- curie
- A curie. For documentation reasons a full term definition is also allowed, so both forms give the exact same Curie result.
- term
- Create a new term: term!(MS:1002357|PSM-level probability). The accession/name combination is not validated.
Structs§
- CVFile
- The description of a file that is used to build a controlled vocabulary.
- CVIndex
- An index into a CV which contains the main ways of handling CVs.
- CVVersion
- Version information for a CV
- Curie
- A CURIE is a namespace + accession identifier
- HashBufReader
- A std::io::BufReader inspired design that also calculates the hash of the read file.
- Lines
- An iterator over the lines of an instance of HashBufReader.
- OboIdentifier
- A (usually) unique identifier for an entry in an ontology
- OboOntology
- An Obo ontology. This can be read from a file with Self::from_file or from a raw reader with Self::from_raw.
- OboStanza
- An Obo stanza.
- OboSynonym
- A synonym in an Obo stanza
- Term
- A term, a CURIE plus its name
Enums§
- AccessionCode
- An accession code. This can either be a numeric code (a u32, so up to about 4 billion, which gives 9 fully utilised digits) or an ASCII alphanumeric code (case-sensitive) of 1 to 8 characters.
- AccessionCodeParseError
- An error when parsing an accession code
- CURIEParsingError
- An error that occurred when parsing a CURIE
- CVCompression
- The used compression of the source CV.
- CVError
- An error encountered while parsing a CV
- ControlledVocabulary
- Controlled vocabularies. All Ontobee listed controlled vocabularies are present, as well as PRIDE and RESID.
- OboError
- All possible errors when reading an Obo file
- OboStanzaType
- The type for an Obo stanza
- OboValue
- The value that a property value tag can have in Obo
- SynonymScope
- The type or scope for a synonym
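As an illustration of the AccessionCode description in the list above (a numeric u32 code or a case-sensitive ASCII alphanumeric code of 1 to 8 characters), the following sketch parses a CURIE such as MS:1002357 into a namespace and an accession. ToyAccession, parse_accession, and parse_curie are hypothetical names; the crate's own types, storage layout, error types, and handling of leading zeros may well differ.

```rust
/// Hypothetical stand-in for the two kinds of accession code described above.
#[derive(Debug, PartialEq, Eq)]
enum ToyAccession {
    /// An all-digit code that fits in a `u32`.
    Numeric(u32),
    /// A case-sensitive ASCII alphanumeric code of 1 to 8 characters.
    Alphanumeric(String),
}

/// Parse an accession, preferring the numeric form.
/// Assumption of this sketch: leading zeros are not preserved for numeric codes.
fn parse_accession(text: &str) -> Option<ToyAccession> {
    let all_digits = !text.is_empty() && text.bytes().all(|b| b.is_ascii_digit());
    if all_digits {
        if let Ok(number) = text.parse::<u32>() {
            return Some(ToyAccession::Numeric(number));
        }
    }
    let valid = (1..=8).contains(&text.len()) && text.bytes().all(|b| b.is_ascii_alphanumeric());
    valid.then(|| ToyAccession::Alphanumeric(text.to_string()))
}

/// Split a CURIE of the form `namespace:accession` at the first colon.
fn parse_curie(text: &str) -> Option<(&str, ToyAccession)> {
    let (namespace, accession) = text.split_once(':')?;
    if namespace.is_empty() {
        return None;
    }
    Some((namespace, parse_accession(accession)?))
}

fn main() {
    assert_eq!(parse_curie("MS:1002357"), Some(("MS", ToyAccession::Numeric(1_002_357))));
    assert_eq!(parse_accession("AbC12"), Some(ToyAccession::Alphanumeric("AbC12".to_string())));
    assert_eq!(parse_accession("TOOLONG99"), None);
}
```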
Traits§
- CVData
- A data element from a CV. Note that technically an implementation could be made that does not have any index, name, or keywords for a data element. Such an element would be kept and stored but would not be accessible via anything other than crate::CVIndex::data.
- CVSource
- Implement this trait to create a new CV. The best way of using this is with a ZST (zero sized type).
- CVStructure
- A structure to contain CVData elements but leave the implementation up to the needs of the specific CV.