Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
edlib_rs
This crate provides a Rust interface to the Edlib C++ library by Martin Šošić. See Martinsos-edlib
The reference paper is :
Martin Šošić, Mile Šikić; Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance. Bioinformatics 2017 [btw753. doi] https://doi.org/10.1093/bioinformatics/btw753
The crate offers 2 interfaces to edlib.
The first, accessed via module bindings, is direcly the interface generated by the bindgen crate.
The second, accessed via module edlibrs, provides a more idiomatic Rust interface. It comes at the cost of cloning information stored in pointers startLocations and endLocations in C struct EdlibAlignResult to get a Rust struct EdlibAlignResultRs with Option<Vec<u8>> fields instead of pointers. The cigar string representation is also cloned when computed.
As a consequence memory management is fully transferred to Rust.
Structures and functions have the same name as in edlib with just "Rs" appended to original names.
Example
For the edlibrs interface we have for example:
in normal mode:
use *;
...
let query = "ACCTCTG";
let target = "ACTCTGAAA";
let align_res = edlibAlignRs;
assert_eq!;
assert_eq!;
in the infix mode :
use *;
...
let query = "ACCTCTG";
let target = "TTTTTTTTTTTTTTTTTTTTTACTCTGAAA";
//
let mut config = default;
config.mode = EDLIB_MODE_HW;
let align_res = edlibAlignRs;
assert_eq!;
Installation
The crate relies on the C++ edlib library being installed and compiled as described in edlib documentation.
Before running cargo build (or cargo install) the environment variable EDLIB_DIR must be set to where the original C++ edlib directory was cloned. This is necessary for the build.rs step of Cargo to access the edlib library includes.
Also libstdc++ must be in your path.
The crate enables a logger to monitor the call to the C-interface which is by default set in Cargo.toml to info for release mode and trace for debug mode, but can changed by setting the variable RUST_LOG (see env_logger doc).
Tests
Some tests in module edlib.rs can serve as basic examples. Please note that cargo test must be run with variable EDLIB_DIR set. In directory examples there is also a small version of the edlib edaligner module (see apps/aligner in edlib installation dir) which runs on Fasta files containing only one sequence as contained in the edlib directory test_data. Contrary to the edlib version the module given a query and a target sequence runs the 3 modes (normal/NW, prefix/SHW and infix/HW) in one pass.
With RUST_LOG=info ./target/release/examples/edaligner --dirdata "$edlibpath/test_data/Enterobacteria_Phage_1" --tf "Enterobacteria_phage_1.fasta" --qf "mutated_90_perc.fasta"
we get the following timing in release mode for Enterobacteria_phage_1.fasta as target sequence and mutated_90_perc.fasta as query sequence.
mode | edlibrs time(s) | edlib time(s) | distance |
---|---|---|---|
NW | 0.106 | 0.106 | 9506 |
SHW | 0.184 | 0.191 | 9502 |
HW | 0.682 | 0.695 | 9502 |
We get the following timing in release mode for Enterobacteria_phage_1.fasta as target sequence and mutated_60_perc.fasta as query sequence.
mode | edlibrs time(s) | edlib time(s) | distance |
---|---|---|---|
NW | 0.398 | 0.398 | 39829 |
SHW | 0.670 | 0.684 | 39828 |
HW | 1.182 | 1.206 | 39828 |
Except for infinitesimal variations of cpu time measurement we see we have the same computation times.
License
Licensed under either of
- Apache License, Version 2.0, LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0
- MIT license LICENSE-MIT or http://opensource.org/licenses/MIT
at your option.
This software was written on my own while working at CEA, CEA-LIST