Algebraic Phylogenetics
The Algebraic Phylogenetics part of OSCAR provides functionality for
- specifying phylogenetic models
- parametrizing such models
- calculating their algebraic invariants
Models
The five most common models in algebraic phylogenetics can automatically be specified by calling the below functions, each taking a tree as input and attaching transition matrices to its edges as defined by Jukes Cantor, Kimura, etc., respectively, returning a structure PhylogeneticModel
or GroupBasedPhylogeneticModel
.
cavender_farris_neyman_model
— Methodcavender_farris_neyman_model(graph::Graph{Directed})
Creates a PhylogeneticModel
based on graph
whose transition matrices are of type Cavender-Farris-Neyman.
Examples
julia> cavender_farris_neyman_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]))
Group-based phylogenetic model on a tree with 3 leaves and 3 edges
with distribution at the root [1//2, 1//2].
The transition matrix associated to edge i is of the form
[a[i] b[i];
b[i] a[i]],
and the Fourier parameters are [x[i, 1] x[i, 2]].
This function is part of the experimental code in Oscar. Please read here for more details.
jukes_cantor_model
— Methodjukes_cantor_model(graph::Graph{Directed})
Creates a PhylogeneticModel
based on graph
whose transition matrices are Jukes Cantor matrices.
Examples
julia> jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]))
Group-based phylogenetic model on a tree with 3 leaves and 3 edges
with distribution at the root [1//4, 1//4, 1//4, 1//4].
The transition matrix associated to edge i is of the form
[a[i] b[i] b[i] b[i];
b[i] a[i] b[i] b[i];
b[i] b[i] a[i] b[i];
b[i] b[i] b[i] a[i]],
and the Fourier parameters are [x[i, 1] x[i, 2] x[i, 2] x[i, 2]].
This function is part of the experimental code in Oscar. Please read here for more details.
kimura2_model
— Methodkimura2_model(graph::Graph{Directed})
Creates a PhylogeneticModel
based on graph
whose transition matrices are Kimura 2-parameter matrices.
Examples
julia> kimura2_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]))
Group-based phylogenetic model on a tree with 3 leaves and 3 edges
with distribution at the root [1//4, 1//4, 1//4, 1//4].
The transition matrix associated to edge i is of the form
[a[i] b[i] c[i] b[i];
b[i] a[i] b[i] c[i];
c[i] b[i] a[i] b[i];
b[i] c[i] b[i] a[i]],
and the Fourier parameters are [x[i, 1] x[i, 3] x[i, 2] x[i, 2]].
This function is part of the experimental code in Oscar. Please read here for more details.
kimura3_model
— Methodkimura3_model(graph::Graph{Directed})
Creates a PhylogeneticModel
based on graph
whose transition matrices are Kimura 3-parameter matrices.
Examples
julia> kimura3_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]))
Group-based phylogenetic model on a tree with 3 leaves and 3 edges
with distribution at the root [1//4, 1//4, 1//4, 1//4].
The transition matrix associated to edge i is of the form
[a[i] b[i] c[i] d[i];
b[i] a[i] d[i] c[i];
c[i] d[i] a[i] b[i];
d[i] c[i] b[i] a[i]],
and the Fourier parameters are [x[i, 1] x[i, 2] x[i, 3] x[i, 4]].
This function is part of the experimental code in Oscar. Please read here for more details.
general_markov_model
— Methodgeneral_markov_model(graph::Graph{Directed})
Creates a PhylogeneticModel
based on graph
whose transition matrices are stochastic with no further constraints.
Examples
julia> general_markov_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]))
Phylogenetic model on a tree with 3 leaves and 3 edges
with distribution at the root [π[1], π[2], π[3], π[4]]
and transition matrix associated to edge i of the form
[m[i, 1, 1] m[i, 1, 2] m[i, 1, 3] m[i, 1, 4];
m[i, 2, 1] m[i, 2, 2] m[i, 2, 3] m[i, 2, 4];
m[i, 3, 1] m[i, 3, 2] m[i, 3, 3] m[i, 3, 4];
m[i, 4, 1] m[i, 4, 2] m[i, 4, 3] m[i, 4, 4]].
This function is part of the experimental code in Oscar. Please read here for more details.
The models are by default in projective space. For their affine versions, call
affine_phylogenetic_model!
— Methodaffine_phylogenetic_model!(pm::PhylogeneticModel)
Moves a PhylogeneticModel
or GroupBasedPhylogeneticModel
from projective into affine space.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> affine_phylogenetic_model!(pm)
Group-based phylogenetic model on a tree with 3 leaves and 3 edges
with distribution at the root [1//4, 1//4, 1//4, 1//4].
The transition matrix associated to edge i is of the form
[-3*b[i]+1 b[i] b[i] b[i];
b[i] -3*b[i]+1 b[i] b[i];
b[i] b[i] -3*b[i]+1 b[i];
b[i] b[i] b[i] -3*b[i]+1],
and the Fourier parameters are [1 x[i, 2] x[i, 2] x[i, 2]].
This function is part of the experimental code in Oscar. Please read here for more details.
Components of a model
PhylogeneticModel
specifies any phylogenetic tree model in probability coordinates and GroupBasedPhylogeneticModel
can specify group-based, e.g. Fourier, coordinates. For any model, we can call its graph, transition matrices attached to the graph's edges, the number of states of each vertex random variable, and the corresponding polynomial rings. For instance for Jukes Cantor on the star with three leaves:
graph
— Methodgraph(pm::PhylogeneticModel)
Return the graph of a PhylogeneticModel
pm
.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> graph(pm)
Directed graph with 4 nodes and the following edges:
(4, 1)(4, 2)(4, 3)
This function is part of the experimental code in Oscar. Please read here for more details.
transition_matrices
— Methodtransition_matrices(pm::PhylogeneticModel)
Return a dictionary between the edges of the tree specifying the PhylogeneticModel
pm
and their attached transition matrices.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> transition_matrices(pm)
Dict{Edge, MatElem{QQMPolyRingElem}} with 3 entries:
Edge(4, 1) => [a[1] b[1] b[1] b[1]; b[1] a[1] b[1] b[1]; b[1] b[1] a[1] b[1];…
Edge(4, 2) => [a[2] b[2] b[2] b[2]; b[2] a[2] b[2] b[2]; b[2] b[2] a[2] b[2];…
Edge(4, 3) => [a[3] b[3] b[3] b[3]; b[3] a[3] b[3] b[3]; b[3] b[3] a[3] b[3];…
This function is part of the experimental code in Oscar. Please read here for more details.
number_states
— Methodnumber_states(pm::PhylogeneticModel)
Return the number of states of the PhylogeneticModel
pm
.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> number_states(pm)
4
This function is part of the experimental code in Oscar. Please read here for more details.
probability_ring
— Methodprobability_ring(pm::PhylogeneticModel)
Return the ring of probability coordinates of the PhylogeneticModel
pm
.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> probability_ring(pm)
Multivariate polynomial ring in 6 variables a[1], a[2], a[3], b[1], ..., b[3]
over rational field
This function is part of the experimental code in Oscar. Please read here for more details.
fourier_ring
— Methodfourier_ring(pm::GroupBasedPhylogeneticModel)
Return the ring of Fourier coordinates of the PhylogeneticModel
pm
.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> fourier_ring(pm)
Multivariate polynomial ring in 6 variables x[1, 1], x[2, 1], x[3, 1], x[1, 2], ..., x[3, 2]
over rational field
This function is part of the experimental code in Oscar. Please read here for more details.
fourier_parameters
— Methodfourier_parameters(pm::GroupBasedPhylogeneticModel)
Return the Fourier parameters of the GroupBasedPhylogeneticModel
pm
as a vector of eigenvalues of the transition matrices.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> fourier_parameters(pm)
Dict{Edge, Vector{QQMPolyRingElem}} with 3 entries:
Edge(4, 1) => [x[1, 1], x[1, 2], x[1, 2], x[1, 2]]
Edge(4, 2) => [x[2, 1], x[2, 2], x[2, 2], x[2, 2]]
Edge(4, 3) => [x[3, 1], x[3, 2], x[3, 2], x[3, 2]]
This function is part of the experimental code in Oscar. Please read here for more details.
group_of_model
— Methodgroup_of_model(pm::GroupBasedPhylogeneticModel)
Returns the group the GroupBasedPhylogeneticModel
pm
is based on.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> group_of_model(pm)
4-element Vector{FinGenAbGroupElem}:
[0, 0]
[0, 1]
[1, 0]
[1, 1]
This function is part of the experimental code in Oscar. Please read here for more details.
Parametrization
For each phylogenetic model, we can calculate the parametrization, a map from transition matrices to probabilities, parametrized in probability or Fourier coordinates. For group-based models, we can reparametrize between these and return the transformation matrix, and we can calculate equivalence classes of probabilities with the same parametrization.
probability_map
— Methodprobability_map(pm::PhylogeneticModel)
Create a parametrization for a PhylogeneticModel
of type Dictionary
.
Iterate through all possible states of the leaf random variables and calculates their corresponding probabilities using the root distribution and laws of conditional independence. Return a dictionary of polynomials indexed by the states. Use auxiliary function monomial_parametrization(pm::PhylogeneticModel, states::Dict{Int, Int})
and probability_parametrization(pm::PhylogeneticModel, leaves_states::Vector{Int})
.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> p = probability_map(pm)
Dict{Tuple{Vararg{Int64}}, QQMPolyRingElem} with 64 entries:
(1, 2, 1) => 1//4*a[1]*a[3]*b[2] + 1//4*a[2]*b[1]*b[3] + 1//2*b[1]*b[2]*b[3]
(3, 1, 1) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*a[3]*b[1] + 1//2*b[1]*b[2]*b[3]
(4, 4, 2) => 1//4*a[1]*a[2]*b[3] + 1//4*a[3]*b[1]*b[2] + 1//2*b[1]*b[2]*b[3]
(1, 2, 3) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(3, 1, 3) => 1//4*a[1]*a[3]*b[2] + 1//4*a[2]*b[1]*b[3] + 1//2*b[1]*b[2]*b[3]
(3, 2, 4) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(3, 2, 1) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(2, 1, 4) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(3, 2, 3) => 1//4*a[1]*a[3]*b[2] + 1//4*a[2]*b[1]*b[3] + 1//2*b[1]*b[2]*b[3]
(2, 1, 1) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*a[3]*b[1] + 1//2*b[1]*b[2]*b[3]
(1, 3, 2) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(1, 4, 2) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(2, 1, 3) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(2, 2, 4) => 1//4*a[1]*a[2]*b[3] + 1//4*a[3]*b[1]*b[2] + 1//2*b[1]*b[2]*b[3]
(4, 3, 4) => 1//4*a[1]*a[3]*b[2] + 1//4*a[2]*b[1]*b[3] + 1//2*b[1]*b[2]*b[3]
(2, 2, 1) => 1//4*a[1]*a[2]*b[3] + 1//4*a[3]*b[1]*b[2] + 1//2*b[1]*b[2]*b[3]
(4, 4, 4) => 1//4*a[1]*a[2]*a[3] + 3//4*b[1]*b[2]*b[3]
(4, 3, 1) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(3, 3, 2) => 1//4*a[1]*a[2]*b[3] + 1//4*a[3]*b[1]*b[2] + 1//2*b[1]*b[2]*b[3]
⋮ => ⋮
This function is part of the experimental code in Oscar. Please read here for more details.
fourier_map
— Methodfourier_map(pm::GroupBasedPhylogeneticModel)
Create a parametrization for a GroupBasedPhylogeneticModel
of type Dictionary
.
Iterate through all possible states of the leaf random variables and calculates their corresponding probabilities using group actions and laws of conditional independence. Return a dictionary of polynomials indexed by the states. Use auxiliary function monomial_fourier(pm::GroupBasedPhylogeneticModel, leaves_states::Vector{Int})
and fourier_parametrization(pm::GroupBasedPhylogeneticModel, leaves_states::Vector{Int})
.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> q = fourier_map(pm)
Dict{Tuple{Vararg{Int64}}, QQMPolyRingElem} with 64 entries:
(1, 2, 1) => 0
(3, 1, 1) => 0
(4, 4, 2) => 0
(1, 2, 3) => 0
(3, 1, 3) => x[2, 1]*x[1, 2]*x[3, 2]
(3, 2, 4) => x[1, 2]*x[2, 2]*x[3, 2]
(3, 2, 1) => 0
(2, 1, 4) => 0
(3, 2, 3) => 0
(2, 1, 1) => 0
(1, 3, 2) => 0
(1, 4, 2) => 0
(2, 1, 3) => 0
(2, 2, 4) => 0
(4, 3, 4) => 0
(2, 2, 1) => x[3, 1]*x[1, 2]*x[2, 2]
(4, 4, 4) => 0
(4, 3, 1) => 0
(3, 3, 2) => 0
⋮ => ⋮
This function is part of the experimental code in Oscar. Please read here for more details.
compute_equivalent_classes
— Methodcompute_equivalent_classes(parametrization::Dict{Tuple{Vararg{Int64}}, QQMPolyRingElem})
Given the parametrization of a PhylogeneticModel
, cancel all duplicate entries and return equivalence classes of states which are attached the same probabilities.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> p = probability_map(pm);
julia> q = fourier_map(pm);
julia> p_equivclasses = compute_equivalent_classes(p);
julia> p_equivclasses.parametrization
Dict{Tuple{Vararg{Int64}}, QQMPolyRingElem} with 5 entries:
(1, 2, 1) => 1//4*a[1]*a[3]*b[2] + 1//4*a[2]*b[1]*b[3] + 1//2*b[1]*b[2]*b[3]
(1, 1, 1) => 1//4*a[1]*a[2]*a[3] + 3//4*b[1]*b[2]*b[3]
(1, 2, 2) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*a[3]*b[1] + 1//2*b[1]*b[2]*b[3]
(1, 2, 3) => 1//4*a[1]*b[2]*b[3] + 1//4*a[2]*b[1]*b[3] + 1//4*a[3]*b[1]*b[2] …
(1, 1, 2) => 1//4*a[1]*a[2]*b[3] + 1//4*a[3]*b[1]*b[2] + 1//2*b[1]*b[2]*b[3]
julia> p_equivclasses.classes
Dict{Tuple{Vararg{Int64}}, Vector{Tuple{Vararg{Int64}}}} with 5 entries:
(1, 2, 1) => [(1, 2, 1), (1, 3, 1), (1, 4, 1), (2, 1, 2), (2, 3, 2), (2, 4, 2…
(1, 1, 1) => [(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4)]
(1, 2, 2) => [(1, 2, 2), (1, 3, 3), (1, 4, 4), (2, 1, 1), (2, 3, 3), (2, 4, 4…
(1, 2, 3) => [(1, 2, 3), (1, 2, 4), (1, 3, 2), (1, 3, 4), (1, 4, 2), (1, 4, 3…
(1, 1, 2) => [(1, 1, 2), (1, 1, 3), (1, 1, 4), (2, 2, 1), (2, 2, 3), (2, 2, 4…
julia> q_equivclasses = compute_equivalent_classes(q);
julia> q_equivclasses.parametrization
Dict{Tuple{Vararg{Int64}}, QQMPolyRingElem} with 5 entries:
(1, 1, 1) => x[1, 1]*x[2, 1]*x[3, 1]
(2, 3, 4) => x[1, 2]*x[2, 2]*x[3, 2]
(2, 2, 1) => x[3, 1]*x[1, 2]*x[2, 2]
(1, 2, 2) => x[1, 1]*x[2, 2]*x[3, 2]
(2, 1, 2) => x[2, 1]*x[1, 2]*x[3, 2]
This function is part of the experimental code in Oscar. Please read here for more details.
sum_equivalent_classes
— Methodsum_equivalent_classes(equivalent_classes::NamedTuple{(:parametrization, :classes), Tuple{Dict{Tuple{Vararg{Int64}}, QQMPolyRingElem}, Dict{Tuple{Vararg{Int64}}, Vector{Tuple{Vararg{Int64}}}}}})
Take the output of the function compute_equivalent_classes
for PhylogeneticModel
and multiply by a factor to obtain probabilities as specified on the original small trees database.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> p = probability_map(pm);
julia> p_equivclasses = compute_equivalent_classes(p);
julia> sum_equivalent_classes(p_equivclasses)
Dict{Tuple{Int64, Int64, Int64}, QQMPolyRingElem} with 5 entries:
(1, 2, 1) => 3*a[1]*a[3]*b[2] + 3*a[2]*b[1]*b[3] + 6*b[1]*b[2]*b[3]
(1, 1, 1) => a[1]*a[2]*a[3] + 3*b[1]*b[2]*b[3]
(1, 2, 2) => 3*a[1]*b[2]*b[3] + 3*a[2]*a[3]*b[1] + 6*b[1]*b[2]*b[3]
(1, 2, 3) => 6*a[1]*b[2]*b[3] + 6*a[2]*b[1]*b[3] + 6*a[3]*b[1]*b[2] + 6*b[1]*…
(1, 1, 2) => 3*a[1]*a[2]*b[3] + 3*a[3]*b[1]*b[2] + 6*b[1]*b[2]*b[3]
This function is part of the experimental code in Oscar. Please read here for more details.
specialized_fourier_transform
— Methodspecialized_fourier_transform(pm::GroupBasedPhylogeneticModel, p_classes::Dict{Tuple{Vararg{Int64}}, Vector{Tuple{Vararg{Int64}}}}, q_classes::Dict{Tuple{Vararg{Int64}}, Vector{Tuple{Vararg{Int64}}}})
Reparametrize between a model specification in terms of probability and Fourier cooordinates. The input of equivalent classes is optional, if they are not entered they will be computed.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> p_equivclasses = compute_equivalent_classes(probability_map(pm));
julia> q_equivclasses = compute_equivalent_classes(fourier_map(pm));
julia> specialized_fourier_transform(pm, p_equivclasses.classes, q_equivclasses.classes)
5×5 Matrix{QQMPolyRingElem}:
1 1 1 1 1
1 -1//3 -1//3 1 -1//3
1 -1//3 1 -1//3 -1//3
1 1 -1//3 -1//3 -1//3
1 -1//3 -1//3 -1//3 1//3
This function is part of the experimental code in Oscar. Please read here for more details.
inverse_specialized_fourier_transform
— Methodinverse_specialized_fourier_transform(pm::GroupBasedPhylogeneticModel, p_classes::Dict{Tuple{Vararg{Int64}}, Vector{Tuple{Vararg{Int64}}}}, q_classes::Dict{Tuple{Vararg{Int64}}, Vector{Tuple{Vararg{Int64}}}})
Reparametrize between a model specification in terms of Fourier and probability cooordinates.
Examples
julia> pm = jukes_cantor_model(graph_from_edges(Directed,[[4,1],[4,2],[4,3]]));
julia> p_equivclasses = compute_equivalent_classes(probability_map(pm));
julia> q_equivclasses = compute_equivalent_classes(fourier_map(pm));
julia> inverse_specialized_fourier_transform(pm, p_equivclasses.classes, q_equivclasses.classes)
5×5 Matrix{QQMPolyRingElem}:
1//16 3//16 3//16 3//16 3//8
3//16 -3//16 -3//16 9//16 -3//8
3//16 -3//16 9//16 -3//16 -3//8
3//16 9//16 -3//16 -3//16 -3//8
3//8 -3//8 -3//8 -3//8 3//4
This function is part of the experimental code in Oscar. Please read here for more details.
Contact
Please direct questions about this part of OSCAR to the following people:
- Marina Garrote López,
- possibly others.
You can ask questions in the OSCAR Slack. Alternatively, you can raise an issue on github.