# Ontolius
Empower analysis with terms and hierarchy of biomedical ontologies.
## Examples
We provide examples of *loading* ontology and its subsequent *usage*
in applications.
### πͺπͺπͺ Load HPO
`ontolius` can load HPO from Obographs JSON file.
For the sake of this example, we use
[flate2](https://github.com/rust-lang/flate2-rs)
to decompress gzipped JSON on the fly:
We can load the JSON file as follows:
```rust
use std::fs::File;
use std::io::BufReader;
use flate2::bufread::GzDecoder;
use ontolius::io::OntologyLoaderBuilder;
use ontolius::ontology::csr::MinimalCsrOntology;
// Load a toy Obographs file from the repo
let path = "resources/hp.small.json.gz";
// Configure the loader to parse the input as an Obographs file
let loader = OntologyLoaderBuilder::new()
.obographs_parser()
.build();
let reader = GzDecoder::new(BufReader::new(File::open(path).unwrap()));
let hpo: MinimalCsrOntology = loader.load_from_read(reader)
.expect("HPO should be loaded");
```
> Note: Ontolius does *not* depend on `flate2`. It's up to you to provide
> the `loader` with proper data π±
We loaded an ontology from a toy JSON file.
During the load, each term is assigned a numeric index and the indices are used as vertices
of the ontology graph.
As the name suggests, the hierarchy graph of `MinimalCsrOntology`
is backed by an adjacency matrix in compressed sparse row (CSR) format.
However, the backing data structure should be treated as an implementation detail.
Note that `MinimalCsrOntology` implements the [`crate::ontology::Ontology`] trait
which is the API the client code should use.
Now let's move on to the example usage.
### π€Έπ€Έπ€Έ Use HPO
In the previous section, we loaded an ontology from Obographs JSON file.
Now we have an instance of [`crate::ontology::Ontology`] that can
be used for various tasks.
Note, we will import the *prelude* [`crate::prelude`] to reduce the import boilerplate.
#### Work with ontology terms
[`crate::ontology::Ontology`] acts as a container of terms to support
retrieval of specific terms by its index or `TermId`, and to iterate
over all terms and `TermId`s.
We can get a term by its `TermId`:
```rust
# use std::fs::File;
# use std::io::BufReader;
# use flate2::bufread::GzDecoder;
# use ontolius::io::OntologyLoaderBuilder;
# use ontolius::ontology::csr::MinimalCsrOntology;
# let loader = OntologyLoaderBuilder::new().obographs_parser().build();
# let reader = GzDecoder::new(BufReader::new(File::open("resources/hp.small.json.gz").unwrap()));
# let hpo: MinimalCsrOntology = loader.load_from_read(reader)
# .expect("HPO should be loaded");
#
use ontolius::prelude::*;
// `HP:0001250` corresponds to `Arachnodactyly``
let term_id: TermId = ("HP", "0001166").into();
// Get the term by its term ID and check the name.
let term = hpo.id_to_term(&term_id).expect("Arachnodactyly should be present");
assert_eq!(term.name(), "Arachnodactyly");
```
or iterate over the all ontology terms or their corresponding term IDs:
```rust
# use std::fs::File;
# use std::io::BufReader;
# use flate2::bufread::GzDecoder;
# use ontolius::io::OntologyLoaderBuilder;
# use ontolius::ontology::csr::MinimalCsrOntology;
# let loader = OntologyLoaderBuilder::new().obographs_parser().build();
# let reader = GzDecoder::new(BufReader::new(File::open("resources/hp.small.json.gz").unwrap()));
# let hpo: MinimalCsrOntology = loader.load_from_read(reader)
# .expect("HPO should be loaded");
#
use ontolius::prelude::*;
// The toy HPO contains 614 terms and primary term ids,
let terms: Vec<_> = hpo.iter_terms().collect();
assert_eq!(terms.len(), 614);
assert_eq!(hpo.iter_term_ids().count(), 614);
// and the total of 1121 term ids (primary + obsolete)
assert_eq!(hpo.iter_all_term_ids().count(), 1121);
```
See [`crate::ontology::HierarchyAware`] for more details.
### Browse the hierarchy
`ontolius` models the ontology hierarchy using the [`crate::hierarchy::OntologyHierarchy`] trait,
an instance of which is available from `Ontology`.
The hierarchy is represented as a directed acyclic graph that is built from `is_a` relationships.
The graph vertices correspond to term indices (not `TermId`s) that are determined
when the ontology is built.
All methods of the ontology hierarchy operate in the term index space. The indices have
all properties of `TermId`s, and can, therefore, be used *in lieu* of the `TermId`s.
Let's see how to use the ontology hierarchy. For instance, to get the parent terms of a term.
```rust
# use std::fs::File;
# use std::io::BufReader;
# use flate2::bufread::GzDecoder;
# use ontolius::io::OntologyLoaderBuilder;
# use ontolius::ontology::csr::MinimalCsrOntology;
# let loader = OntologyLoaderBuilder::new().obographs_parser().build();
# let reader = GzDecoder::new(BufReader::new(File::open("resources/hp.small.json.gz").unwrap()));
# let hpo: MinimalCsrOntology = loader.load_from_read(reader)
# .expect("HPO should be loaded");
#
use ontolius::prelude::*;
let hierarchy = hpo.hierarchy();
let arachnodactyly: TermId = ("HP", "0001166").into();
let idx = hpo.id_to_idx(&arachnodactyly)
.expect("Arachnodacyly should be in HPO");
let parents: Vec<_> = hierarchy.iter_parents_of(idx)
.flat_map(|idx| hpo.idx_to_term(idx))
.collect();
let names: Vec<_> = parents.iter().map(|term| term.name()).collect();
assert_eq!(vec!["Slender finger", "Long fingers"], names);
```
Similar methods exist for getting ancestors, children, and descendent terms.
See [`crate::hierarchy::OntologyHierarchy`] for more details.
That's it for now.
## Features
Ontolius includes several features, with the features marked by `(*)` being enabled
by default:
* `obographs` `(*)` - support loading Ontology from Obographs JSON file
* `pyo3` - add PyO3 bindings to selected data structs to support using from Python
## Run tests
The tests can be run by invoking:
```shell
cargo test
```
## Run benches
We use `criterion` for crate benchmarks.
Run the following to run the bench suite:
```shell
cargo bench
```
The benchmark report will be written into the `target/criterion/report` directory.