Phylogenetic Trees
Introduction
Phylogenetic trees represent the evolutionary history of some species of consideration. Here we consider phylogenetic (rooted) trees with branch lengths as defined in [SS03].
The construction of tropical median consensus trees is one of the nontrivial algorithm here.
Construction
phylogenetic_tree
— Functionphylogenetic_tree(T::Type{<:Union{Float64, QQFieldElem}}, newick::String)
phylogenetic_tree(newick::String)
Construct a rooted phylogenetic tree with Newick representation newick
. T
indicates the numerical type of the edge lengths.
Calling phylogenetic_tree
without T
will default to using QQFieldElem.
Examples
Make a phylogenetic tree with 4 leaves from its Newick representation and print its taxa and cophenetic matrix.
julia> phylo_t = phylogenetic_tree(Float64, "((H:3,(C:1,B:1):2):1,G:4);");
julia> taxa(phylo_t)
4-element Vector{String}:
"B"
"C"
"G"
"H"
julia> cophenetic_matrix(phylo_t)
4×4 Matrix{Float64}:
0.0 2.0 8.0 6.0
2.0 0.0 8.0 6.0
8.0 8.0 0.0 8.0
6.0 6.0 8.0 0.0
julia> typeof(phylogenetic_tree("((H:3,(C:1,B:1):2):1,G:4);"))
Oscar.PhylogeneticTree{QQFieldElem}
phylogenetic_tree(M::Matrix{T}, taxa::Vector{String}) where T <: Union{Float64, QQFieldElem}
Construct a phylogenetic tree with cophenetic matrix M
and taxa taxa
. The matrix M
must be ultrametric, otherwise an error will be thrown.
Examples
Make a phylogenetic tree on 4 taxa with given cophenetic matrix and print one Newick representation.
julia> mat = [0. 2 8 6; 2 0 8 6; 8 8 0 8; 6 6 8 0]
4×4 Matrix{Float64}:
0.0 2.0 8.0 6.0
2.0 0.0 8.0 6.0
8.0 8.0 0.0 8.0
6.0 6.0 8.0 0.0
julia> tax = ["Bonobo", "Chimpanzee", "Gorilla", "Human"]
4-element Vector{String}:
"Bonobo"
"Chimpanzee"
"Gorilla"
"Human"
julia> tree_mat = phylogenetic_tree(mat, tax);
julia> newick(tree_mat)
"Gorilla:4,(Human:3,(Bonobo:1,Chimpanzee:1):2):1;"
phylogenetic_tree(T::Type{<:Union{Float64, QQFieldElem}}, g::Graph{Directed}; check=true)
phylogenetic_tree(g::Graph{Directed}; check=true)
Constructs a phylogenetic tree from a directed graph. If check
is set to true
we check that g
is a tree. Calling phylogenetic_tree
without a type T
will default to QQFieldelem
.
If the graph is labeled on its leaves using the label name :leaves
and or if the graph is labeled on its edges with the label name :distance
then this data will persist to the PhylogeneticTree
.
Examples
julia> tree = graph_from_edges(Directed,[[4,1],[4,2],[4,3], [5, 4], [5, 6]])
Directed graph with 6 nodes and the following edges:
(4, 1)(4, 2)(4, 3)(5, 4)(5, 6)
julia> pt = phylogenetic_tree(tree)
Phylogenetic tree with QQFieldElem type coefficients
julia> newick(pt)
"(leaf 2:1,leaf 3:1,leaf 1:1):1,leaf 4:1;"
julia> label!(tree, Dict((5, 6) => 1.0,
(5, 4) => 2.0,
(4, 1) => 3.0,
(4, 2) => 4.0,
(4, 3) => 5.0), nothing; name=:distance)
Directed graph with 6 nodes and the following labeling(s):
label: distance
(4, 1) -> 3.0
(4, 2) -> 4.0
(4, 3) -> 5.0
(5, 4) -> 2.0
(5, 6) -> 1.0
julia> label!(tree, nothing, Dict(1 => "a", 2 => "b", 3 => "c", 6 => "d"); name=:leaves)
Directed graph with 6 nodes and the following labeling(s):
label: leaves
1 -> a
2 -> b
3 -> c
4 ->
5 ->
6 -> d
label: distance
(4, 1) -> 3.0
(4, 2) -> 4.0
(4, 3) -> 5.0
(5, 4) -> 2.0
(5, 6) -> 1.0
julia> pt = phylogenetic_tree(Float64, tree)
Phylogenetic tree with Float64 type coefficients
julia> cophenetic_matrix(pt)
4×4 Matrix{Float64}:
0.0 7.0 8.0 6.0
7.0 0.0 9.0 7.0
8.0 9.0 0.0 8.0
6.0 7.0 8.0 0.0
tropical_median_consensus
— Functiontropical_median_consensus(arr::Vector{PhylogeneticTree{T}})
Compute the tropical median consensus tree of the equidistant phylogenetic trees from the vector arr
. The output is then equidistant, too. The method is robust (ie., the consensus tree dpends continuosuly on the input), and it is Pareto and co-Pareto on rooted triplets.
The algorithm, based on tropical convexity and optimal transport, is explained in [CJ24].
Examples
Compute the tropical median consensus of three trees and print one of its Newick representations.
julia> t1 = phylogenetic_tree(Float64, "((H:30,(C:10,B:10):20):10,G:40);");
julia> t2 = phylogenetic_tree(Float64, "(((H:10,C:10):20,B:30):10,G:40);");
julia> t3 = phylogenetic_tree(Float64, "((H:25,C:25):15,(B:15,G:15):25);");
julia> arr = [t1, t2, t3];
julia> tc = tropical_median_consensus(arr);
julia> newick(tc)
"G:40,(B:35,(C:30,H:30):5):5;"
tropical_median_consensus(trees::Vararg{PhylogeneticTree, N}) where {N}
Compute the tropical median consensus tree of any number of phylogenetic trees given as parameters.
Examples
Compute the tropical median consensus of three trees and print one of its Newick representations.
julia> t1 = phylogenetic_tree(Float64, "((H:30,(C:10,B:10):20):10,G:40);");
julia> t2 = phylogenetic_tree(Float64, "(((H:10,C:10):20,B:30):10,G:40);");
julia> t3 = phylogenetic_tree(Float64, "((H:25,C:25):15,(B:15,G:15):25);");
julia> tc = tropical_median_consensus(t1, t2, t3);
julia> newick(tc)
"G:40,(B:35,(C:30,H:30):5):5;"
Some Helpful Functions
adjacency_tree
— Functionadjacency_tree(ptree::PhylogeneticTree)
Return the underlying graph of the phylogenetic tree ptree
.
Examples
Make a phylogenetic tree with given Newick format and print its underlying graph.
julia> ptree = phylogenetic_tree(Float64, "((H:3,(C:1,B:1):2):1,G:4);");
julia> adjacency_tree(ptree)
Directed graph with 7 nodes and the following labeling(s):
label: leaves
1 ->
2 ->
3 -> H
4 ->
5 -> C
6 -> B
7 -> G
label: distance
(1, 2) -> 1.0
(1, 7) -> 4.0
(2, 3) -> 3.0
(2, 4) -> 2.0
(4, 5) -> 1.0
(4, 6) -> 1.0
is_equidistant
— Functionis_equidistant(ptree::PhylogeneticTree)
Check if the phylogenetic tree ptree
is equidistant.
Examples
Make a phylogenetic tree with given Newick format and check if it is equidistant.
julia> ptree = phylogenetic_tree(Float64, "((H:3,(C:1,B:1):2):1,G:4);");
julia> is_equidistant(ptree)
true
cophenetic_matrix
— Functioncophenetic_matrix(ptree::PhylogeneticTree)
Return the cophenetic matrix of the phylogenetic tree ptree
.
Examples
Make a phylogenetic tree with given Newick format and print its cophenetic matrix.
julia> ptree = phylogenetic_tree(Float64, "((H:3,(C:1,B:1):2):1,G:4);");
julia> cophenetic_matrix(ptree)
4×4 Matrix{Float64}:
0.0 2.0 8.0 6.0
2.0 0.0 8.0 6.0
8.0 8.0 0.0 8.0
6.0 6.0 8.0 0.0
taxa
— Functiontaxa(ptree::PhylogeneticTree)
Return the taxa of the phylogenetic tree ptree
.
Examples
Make a phylogenetic tree with given Newick format and print its taxa.
julia> ptree = phylogenetic_tree(Float64, "((H:3,(C:1,B:1):2):1,G:4);");
julia> taxa(ptree)
4-element Vector{String}:
"B"
"C"
"G"
"H"
newick
— Functionnewick(ptree::PhylogeneticTree)
Return a Newick representation of the phylogenetic tree ptree
.
Examples
Make a phylogenetic tree from a matrix and print a Newick representation of it.
julia> mat = [0. 2 8 6; 2 0 8 6; 8 8 0 8; 6 6 8 0]
4×4 Matrix{Float64}:
0.0 2.0 8.0 6.0
2.0 0.0 8.0 6.0
8.0 8.0 0.0 8.0
6.0 6.0 8.0 0.0
julia> tax = ["Bonobo", "Chimpanzee", "Gorilla", "Human"]
4-element Vector{String}:
"Bonobo"
"Chimpanzee"
"Gorilla"
"Human"
julia> tree_mat = phylogenetic_tree(mat, tax);
julia> newick(tree_mat)
"Gorilla:4,(Human:3,(Bonobo:1,Chimpanzee:1):2):1;"