§Ontolius
A fast and safe crate for working with biomedical ontologies.
§Examples
We provide examples of loading an ontology and using it in applications.
§Load ontology 🪄
ontolius can load an ontology from an Obographs JSON file.
For the sake of this example, we use flate2 to decompress a gzipped JSON file on the fly.
We can load a toy version of HPO from a JSON file as follows:
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use ontolius::io::OntologyLoaderBuilder;
use ontolius::ontology::csr::MinimalCsrOntology;
// Load a toy Obographs file from the repo
let path = "resources/hp.small.json.gz";
// Configure the loader to parse the input as an Obographs file
let loader = OntologyLoaderBuilder::new()
    .obographs_parser()
    .build();
let reader = GzDecoder::new(BufReader::new(File::open(path).unwrap()));
let hpo: MinimalCsrOntology = loader.load_from_read(reader)
    .expect("HPO should be loaded");
We loaded HPO from a toy JSON file into crate::ontology::csr::MinimalCsrOntology.
The loading includes parsing terms and edges from the Obographs file
and construction of the ontology graph.
In the case of MinimalCsrOntology,
the graph is backed by a compressed sparse row (CSR) adjacency matrix.
See crate::io::OntologyLoader for more info on loading.
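To make the CSR idea concrete, here is a minimal, self-contained sketch of a compressed sparse row adjacency structure for a small hierarchy. It is not the layout used by MinimalCsrOntology; the type `ToyCsrGraph` and its methods `from_edges` and `parents` are hypothetical names introduced only for illustration. The sketch stores, for each node index, a contiguous slice of parent indices.

```rust
/// Hypothetical toy CSR graph (not part of ontolius).
/// `indptr[i]..indptr[i + 1]` delimits the slice of `indices`
/// that holds the parent indices of node `i`.
struct ToyCsrGraph {
    indptr: Vec<usize>,
    indices: Vec<usize>,
}

impl ToyCsrGraph {
    /// Build the structure from `(child, parent)` edges over `n_nodes` nodes.
    fn from_edges(n_nodes: usize, edges: &[(usize, usize)]) -> Self {
        // Count the number of parents of each node.
        let mut indptr = vec![0usize; n_nodes + 1];
        for &(child, _) in edges {
            indptr[child + 1] += 1;
        }
        // A prefix sum turns the counts into row offsets.
        for i in 0..n_nodes {
            indptr[i + 1] += indptr[i];
        }
        // Fill each row, tracking the next free slot per node.
        let mut next = indptr.clone();
        let mut indices = vec![0usize; edges.len()];
        for &(child, parent) in edges {
            indices[next[child]] = parent;
            next[child] += 1;
        }
        ToyCsrGraph { indptr, indices }
    }

    /// Parent indices of `node` as a contiguous slice.
    fn parents(&self, node: usize) -> &[usize] {
        &self.indices[self.indptr[node]..self.indptr[node + 1]]
    }
}
```

For a diamond-shaped hierarchy with edges `(1, 0)`, `(2, 0)`, `(3, 1)`, and `(3, 2)`, `parents(3)` yields the slice `[1, 2]` and `parents(0)` is empty. Storing all rows in two flat vectors keeps neighbor lookups cache-friendly, which is the usual motivation for CSR.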
§Use ontology 🤸
In the previous section, we loaded an ontology from an Obographs JSON file.
Now we have an instance of crate::ontology::csr::MinimalCsrOntology that can
be used for various tasks.
§Work with ontology terms
MinimalCsrOntology implements the crate::ontology::OntologyTerms trait,
which supports retrieving a specific term by its index or TermId and iterating
over all terms and TermIds.
We can get a term by its TermId:
use ontolius::TermId;
use ontolius::term::MinimalTerm;
use ontolius::ontology::OntologyTerms;
// `HP:0001166` corresponds to `Arachnodactyly`
let term_id: TermId = "HP:0001166".parse().unwrap();
// Get the term by its term ID ...
let term = hpo.term_by_id(&term_id);
assert!(term.is_some());
// ... and check its name.
let term = term.unwrap();
assert_eq!(term.name(), "Arachnodactyly");
Or we can iterate over all ontology terms or their corresponding term IDs:
use ontolius::ontology::OntologyTerms;
// The toy HPO contains 614 terms and primary term ids,
let terms: Vec<_> = hpo.iter_terms().collect();
assert_eq!(terms.len(), 614);
assert_eq!(hpo.iter_term_ids().count(), 614);
// and the total of 1121 term ids (primary + obsolete)
assert_eq!(hpo.iter_all_term_ids().count(), 1121);
See crate::ontology::OntologyTerms trait for more details.
§Browse the hierarchy
ontolius exposes the ontology hierarchy via several traits.
Typical uses include iterating over a term’s parents, ancestors, children, or descendants.
The crate::ontology::HierarchyWalks trait supports iterating through TermIds whereas crate::ontology::HierarchyTraversals enables iteration over ontology graph indices. Iterating over indices is slightly faster and can be useful if we do not really care about the actual term IDs (e.g. to test if term a is an ancestor of term b).
The crate::ontology::HierarchyQueries trait simplifies testing if term a is a parent, child, ancestor, or descendant of term b.
In all cases, the hierarchy is represented as a directed acyclic graph that is built from is_a relationships.
Let’s see how to use the ontology hierarchy. For instance, we can use crate::ontology::HierarchyWalks::iter_parent_ids to get parent ids of a term:
use ontolius::TermId;
use ontolius::term::MinimalTerm;
use ontolius::ontology::{HierarchyWalks, OntologyTerms};
let arachnodactyly: TermId = "HP:0001166".parse()
    .expect("CURIE should be valid");
// Map each parent term ID to its term and then to the term's name
let parent_names: Vec<_> = hpo.iter_parent_ids(&arachnodactyly)
    .map(|term_id| hpo.term_by_id(term_id).expect("A term for a term ID obtained from ontology should always be present"))
    .map(MinimalTerm::name)
    .collect();
assert_eq!(vec!["Slender finger", "Long fingers"], parent_names);
We first create the TermId that corresponds to Arachnodactyly and then we query hpo for its parents by calling iter_parent_ids. For each parent ID, we retrieve the corresponding term, extract its name, and collect the names into a vector.
Similar methods exist for getting term IDs of ancestors, children, and descendants of a term. See crate::ontology::HierarchyWalks for more info.
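The remark that index-based iteration suits ancestor tests can be illustrated with a small, self-contained sketch. It does not use the ontolius traits; the function `is_ancestor_by_index` and the `parents` adjacency list are hypothetical and serve only to show how such a test can run on plain graph indices over a directed acyclic graph, without touching term IDs.

```rust
/// Hypothetical helper (not part of ontolius).
/// `parents[i]` lists the indices of the direct parents of node `i`.
/// Returns `true` if `ancestor` can be reached from `node`
/// by following one or more parent edges.
fn is_ancestor_by_index(parents: &[Vec<usize>], ancestor: usize, node: usize) -> bool {
    // Track visited nodes so that shared ancestors are expanded only once.
    let mut seen = vec![false; parents.len()];
    // Start from the direct parents, so a node is never its own ancestor.
    let mut stack: Vec<usize> = parents[node].clone();
    while let Some(current) = stack.pop() {
        if current == ancestor {
            return true;
        }
        if !seen[current] {
            seen[current] = true;
            stack.extend(parents[current].iter().copied());
        }
    }
    false
}
```

With `parents = [[], [0], [0], [1, 2]]`, node `0` is an ancestor of node `3`, while node `3` is not an ancestor of node `0`. The traversal only compares integers, which is one plausible reason index-based iteration can be slightly faster than resolving a TermId at each step.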
§Supported ontologies
At this time, support for the following ontologies is tested:
- Human Phenotype Ontology (HPO)
- Gene Ontology (GO)
- Medical Action Ontology (MAxO)
Other ontologies are very likely to work too. If you run into any problems, please let us know on our issue tracker.
§Features
Ontolius includes several features, with the features marked by (*) being enabled
by default:
- csr (*) - include the crate::ontology::csr module with an implementation of an ontology whose graph is backed by a CSR adjacency matrix
- obographs (*) - support loading an ontology from an Obographs JSON file
- pyo3 - include the crate::py module with PyO3 bindings to selected data structs to support use from Python
§Run tests
The tests can be run by invoking:
cargo test
§Run benches
We use criterion for crate benchmarks.
Run the following to run the bench suite:
cargo bench
The benchmark report will be written into the target/criterion/report directory.
Modules§
- common
- The module with constants for working with various ontologies.
- io
- Routines for loading ontology data.
- ontology
- A module with APIs for working with ontologies.
- term
- Ontology term models.
Structs§
Enums§
- TermIdParseError
- Represents all possible reasons for failure to parse a CURIE into a TermId.
Traits§
- Identified
- Identified is implemented by entities that have a TermId as an identifier.