PhyloDM
Efficient calculation of pairwise phylogenetic distance matrices.
Installation
The easiest installation method is through Conda. If you choose to install via PyPI, ensure that a Rust compiler is installed first.
- PyPI:
pip install phylodm
- Conda:
conda install -c bioconda phylodm
Usage
Creating a phylogenetic distance matrix
A phylogenetic distance matrix (PhyloDM) object can be created from a Newick file:
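from phylodm import PhyloDM
# Create a test tree
with open('/tmp/newick.tree', 'w') as fh:
    fh.write('(A:4,(B:3,C:4):1);')
# Load from Newick
pdm = PhyloDM.load_from_newick_path('/tmp/newick.tree')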
Accessing data
The dm method generates a symmetric NumPy distance matrix, and the keys are returned as a tuple in the matrix row/column order:
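from phylodm import PhyloDM

# Create a test tree
with open('/tmp/newick.tree', 'w') as fh:
    fh.write('(A:4,(B:3,C:4):1);')

# Load from Newick file
pdm = PhyloDM.load_from_newick_path('/tmp/newick.tree')

# Or, load from a DendroPy object
import dendropy
tree = dendropy.Tree.get_from_path('/tmp/newick.tree', schema='newick')
pdm = PhyloDM.load_from_dendropy(tree)

# Calculate the PDM
dm = pdm.dm(norm=False)
labels = pdm.taxa()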
"""
/------------[4]------------ A
+
|         /---------[3]--------- B
\---[1]---+
          \------------[4]------------- C
labels = ('A', 'B', 'C')
dm = [[0. 8. 9.]
      [8. 0. 7.]
      [9. 7. 0.]]
"""
Normalisation
If normalisation is set to true, the distances are returned normalised by the sum of all edge lengths in the tree.
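To make the numbers in the example tree concrete, here is a minimal pure-Python sketch that recomputes the pairwise distances by summing branch lengths along the path between two leaves, then divides by the total branch length. It is an illustration only and says nothing about how PhyloDM computes the matrix internally. The `PARENT` table, the internal node names `root` and `n1`, and the helpers `distances_to_ancestors` and `patristic` are hypothetical names introduced for this example.

```python
from itertools import combinations

# Child -> (parent, branch length) for the example tree (A:4,(B:3,C:4):1);
# "root" and "n1" are illustrative names for the two internal nodes.
PARENT = {
    "A": ("root", 4.0),
    "n1": ("root", 1.0),
    "B": ("n1", 3.0),
    "C": ("n1", 4.0),
}


def distances_to_ancestors(node):
    """Map `node` and each of its ancestors to the distance from `node`."""
    dist, out = 0.0, {node: 0.0}
    while node in PARENT:
        node, length = PARENT[node]
        dist += length
        out[node] = dist
    return out


def patristic(a, b):
    """Sum of branch lengths on the path between leaves `a` and `b`."""
    up_a, up_b = distances_to_ancestors(a), distances_to_ancestors(b)
    # With non-negative branch lengths, the closest shared ancestor
    # is the shared node that minimises the combined distance.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


labels = ("A", "B", "C")
dm = [[0.0] * len(labels) for _ in labels]
for i, j in combinations(range(len(labels)), 2):
    dm[i][j] = dm[j][i] = patristic(labels[i], labels[j])

# Normalise by the sum of all edge lengths in the tree (4 + 1 + 3 + 4).
total = sum(length for _, length in PARENT.values())
dm_norm = [[d / total for d in row] for row in dm]

print(dm)
print(dm_norm)
```

For this tree the total branch length is 12, so the A to C distance of 9 becomes 0.75 after normalisation.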
Performance
Tests were executed using the scripts/phylodm_perf.py script with 10 trials.
These tests demonstrate that PhyloDM is more efficient than DendroPy's phylogenetic distance matrix when there are more than 500 taxa in the tree. With fewer than 500 taxa, use DendroPy for all of the great features it provides.
With 10,000 taxa in the tree, each program uses approximately:
- PhyloDM = 4 seconds / 2 GB memory
- DendroPy = 17 minutes / 90 GB memory
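For a sense of scale, the output matrix alone grows quadratically with the number of taxa. The sketch below estimates the size of a dense square matrix, assuming 8-byte floating-point values. The element type PhyloDM actually uses is not stated here, and `dense_matrix_bytes` is a hypothetical helper. The result covers only the output matrix and is not a breakdown of the measured totals above.

```python
def dense_matrix_bytes(n_taxa, bytes_per_value=8):
    """Bytes needed for a dense n x n matrix (8 bytes assumes float64)."""
    return n_taxa * n_taxa * bytes_per_value


for n in (500, 10_000):
    print(f"{n:>6} taxa: {dense_matrix_bytes(n) / 1e6:,.0f} MB")
```

Under that assumption, a 10,000-taxon matrix occupies 800 MB before any working memory is counted.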


Changelog
2.0.0
- Rewritten in Rust (2x faster).
1.3.1
- Use OpenMP to parallelize PDM methods.
1.3.0
- Removed tqdm.
- get_matrix() is now 3x faster.
1.2.0
- Added the remove_keys command.
1.1.0
- Significant improvement in PDM construction time using C.
1.0.0
- Initial release.
Citing
Please cite this software if you use it in your work.