![Page 1: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/1.jpg)
Constructing evolutionary trees from rooted triples
Bang Ye Wu
Dept. of Computer Science and Information Engineering
Shu-Te University
![Page 2: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/2.jpg)
An evolutionary tree
A rooted tree.
Each leaf represents one species.
Internal nodes are unlabelled (inferred common ancestors).
[Figure: a rooted tree with leaves a, b, c, d, e, f]
![Page 3: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/3.jpg)
A (rooted) triple (triplet)
An evolutionary tree of 3 species.
A constraint in an evolutionary tree construction problem.
(c(ab)): lca(b,c) = lca(c,a), and this node is a proper ancestor of lca(a,b).
– lca: lowest common ancestor.
– a,b should be closer than a,c or b,c.
[Figure: the triple (c(ab)) drawn as a tree with leaves a, b, c]
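The definition can be checked mechanically. The following is a minimal sketch, assuming a tree is stored as nested tuples of species names and a triple (x(yz)) as the Python tuple `(x, y, z)`; both encodings and all function names are illustrative choices, not taken from the slides. It uses the fact that (x(yz)) holds exactly when x is not a leaf below lca(y,z).

```python
def leaves(tree):
    """Return the set of leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def lca_leaf_set(tree, u, v):
    """Return the leaf set of the subtree rooted at lca(u, v)."""
    node = tree
    while True:
        children = () if isinstance(node, str) else node
        for child in children:
            below = leaves(child)
            if u in below and v in below:
                node = child      # both leaves lie in this child: descend
                break
        else:
            return leaves(node)   # no child holds both: node is the lca


def satisfies(tree, triple):
    """True if the tree satisfies the triple (x(yz)), given as (x, y, z)."""
    x, y, z = triple
    return x not in lca_leaf_set(tree, y, z)


tree = (("a", "d"), ("b", "c"))
print(satisfies(tree, ("a", "b", "c")))   # (a(bc))
print(satisfies(tree, ("c", "b", "d")))   # (c(bd))
```

The check assumes that all three species of the triple occur as leaves of the tree.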
![Page 4: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/4.jpg)
A tree compatible with triples
Given a set of triples, construct a tree satisfying all the triples.
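Deciding whether such a tree exists, and constructing one when it does, takes polynomial time. [Aho et al., 1981]
[Figure: the triples (a(bc)), (c(ad)), (b(ad)) and a compatible tree with leaves a, d, b, c]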
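The polynomial-time procedure of Aho et al. is commonly known as BUILD. A minimal sketch of it follows, assuming a triple (x(yz)) is stored as the tuple `(x, y, z)` and the output tree as nested tuples; the function name and both encodings are illustrative. For each triple over the current species set, y and z are joined; the connected components become the children of the current node, and a single component means that no compatible tree exists.

```python
def build(species, triples):
    """Return a nested-tuple tree compatible with all triples, or None."""
    species = set(species)
    if len(species) == 1:
        return next(iter(species))

    # Union-find over the species: join y and z for every triple (x(yz)).
    parent = {s: s for s in species}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    relevant = [t for t in triples if set(t) <= species]
    for x, y, z in relevant:
        parent[find(y)] = find(z)

    components = {}
    for s in species:
        components.setdefault(find(s), set()).add(s)
    if len(components) == 1:
        return None               # the species cannot be split: conflict

    children = []
    for block in sorted(components.values(), key=min):
        subtree = build(block, relevant)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d")]
print(build("abcd", triples))
```

The same function returns `None` for a conflicting set such as (a(bc)) together with (b(ac)).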
![Page 5: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/5.jpg)
Incompatible (conflicting) triples
Two conflicting triples: (a(bc)) and (b(ac)).
Three conflicting triples (pairwise compatible): (a(bc)), (b(dc)), and (d(ba)).
![Page 6: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/6.jpg)
Two optimization problems
The maximum consensus tree:
– The tree satisfying the maximum number of triples.
– NP-hard [Jansson, 2001][Wu, to appear]
– A new heuristic algorithm [this paper]
The maximum compatible set:
– The compatible species subset of maximum cardinality.
– NP-hard [this paper]
![Page 7: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/7.jpg)
Previous heuristic: Best-One-Split-First (BOSF)
If a species x is split from a set V, all triples (x(v1v2)) with v1 and v2 in V other than x will be satisfied.
Repeatedly split one species from the set.
Choose the split species greedily.
![Page 8: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/8.jpg)
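Example. Triples: (a(bc)), (c(ad)), (b(ad)), (c(bd)).
Step 1: c is chosen, and the two triples (c(ad)) and (c(bd)) are satisfied. c is split from {a,b,c,d}, leaving {a,b,d}.
Step 2: b is split from {a,b,d}, leaving {a,d}.
[Figure: the resulting tree (c(b(ad)))]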
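The splitting strategy can be sketched in a few lines. The slides only say that the split species is chosen greedily; the sketch below assumes the greedy score of x is the number of triples (x(v1v2)) that splitting x satisfies, with ties broken alphabetically. The scoring rule, the function name, and the encoding of a triple (x(yz)) as `(x, y, z)` are assumptions made for illustration.

```python
def best_one_split_first(species, triples):
    """Greedy one-species splitting; returns (nested-tuple tree, #satisfied)."""
    remaining = sorted(species)
    split_order = []   # species in the order they are split off
    satisfied = 0

    def gain(x):
        # Triples (x(yz)) whose y and z both stay in the remaining set.
        rest = set(remaining) - {x}
        return sum(1 for out, y, z in triples
                   if out == x and y in rest and z in rest)

    while len(remaining) > 2:
        best = max(remaining, key=gain)   # first maximum: alphabetical ties
        satisfied += gain(best)
        split_order.append(best)
        remaining.remove(best)

    # The last one or two species form the deepest subtree.
    tree = tuple(remaining) if len(remaining) > 1 else remaining[0]
    for x in reversed(split_order):
        tree = (x, tree)
    return tree, satisfied


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
print(best_one_split_first("abcd", triples))
```

On the four example triples this sketch splits c first and then b, matching the steps of the example, and leaves (a(bc)) unsatisfied.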
![Page 9: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/9.jpg)
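Previous heuristic: Min-Cut-Split-First (MCSF)
Construct an auxiliary graph:
– Vertex: species.
– Each edge is labeled by a set: for each triple (x(yz)), x is in the label set of edge (y,z).
Example. Triples: (a(bc)), (c(ad)), (b(ad)), (c(bd)).
[Figure: the auxiliary graph on vertices a, b, c, d; edge (b,c) is labeled {a}, edge (a,d) is labeled {b,c}, and edge (b,d) is labeled {c}]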
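The construction of the auxiliary graph translates directly into code. This is a minimal sketch, assuming a triple (x(yz)) is stored as the tuple `(x, y, z)` and the graph as a dictionary from an unordered species pair to its label set; the function name and the data layout are illustrative.

```python
def auxiliary_graph(triples):
    """Map each edge {y, z} to the set of x with a triple (x(yz))."""
    labels = {}
    for x, y, z in triples:
        labels.setdefault(frozenset((y, z)), set()).add(x)
    return labels


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
for edge, label in auxiliary_graph(triples).items():
    print(sorted(edge), sorted(label))
```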
![Page 10: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/10.jpg)
– A bipartition corresponds to a split in the tree.
– The labels in the cut of the bipartition correspond to the triples conflicting with the split.
Repeatedly find the bipartition with minimum cut.
[Figure: the bipartition {a,d} | {b,c} of the auxiliary graph and the resulting tree with leaves a, d, b, c. This is a min-cut; only the triple (c(bd)) is conflicting]
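To make the cut weight concrete, the sketch below counts the triples that conflict with a given bipartition and finds all minimum bipartitions by brute force. The exhaustive search is only for illustration on small inputs and is not how the heuristic itself finds its cut; the function names and the encoding of a triple (x(yz)) as `(x, y, z)` are assumptions.

```python
from itertools import combinations


def conflicting(side, triples):
    """Triples (x(yz)) whose y and z fall on different sides of the split."""
    side = set(side)
    return [t for t in triples if (t[1] in side) != (t[2] in side)]


def min_cut_bipartitions(species, triples):
    """Return (minimum cut weight, list of sides achieving it)."""
    species = sorted(species)
    first, rest = species[0], species[1:]
    best, best_sides = None, []
    # Keep `first` on the listed side so each bipartition is seen once.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            side = frozenset((first,) + extra)
            weight = len(conflicting(side, triples))
            if best is None or weight < best:
                best, best_sides = weight, [side]
            elif weight == best:
                best_sides.append(side)
    return best, best_sides


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
print(conflicting({"a", "d"}, triples))
print(min_cut_bipartitions("abcd", triples))
```

For the four example triples the minimum weight is 1, and {a,d} | {b,c} is one of the bipartitions that attain it.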
![Page 11: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/11.jpg)
Previous heuristic: Best-Pair-Merge-First (BPMF)
Instead of top-down splitting, BPMF uses a bottom-up merging strategy.
Starting from singleton sets, we repeatedly merge two sets at each step.
Scoring functions are used to decide which pair should be merged at each step.
![Page 12: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/12.jpg)
Example. Triples: (a(bc)), (c(ad)), (b(ad)), (c(bd)).
{a} {b} {c} {d}
{a,d} {b} {c}
{a,d} {b,c}
{a,d,b,c}
[Figure: the partial trees after each merge: (ad), then (ad) and (bc), then ((ad)(bc))]
![Page 13: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/13.jpg)
An exact algorithm for MCTT
Dynamic programming:
F(V) = max { F(V1) + F(V2) + W(V1,V2) }, taken over all bipartitions (V1,V2) of V.
– F(V): the maximum number of triples that a tree over V can satisfy.
– W(V1,V2): the number of triples (x(v1v2)) with x not in V, v1 in V1, and v2 in V2.
F is computed for subsets in order of increasing cardinality.
![Page 14: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/14.jpg)
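Example. Triples: (a(bc)), (c(ad)), (b(ad)), (c(bd)). Values of F by subset size:
n=4: F(abcd)=3
n=3: F(abc)=1, F(abd)=3, F(bcd)=2
n=2: F(ab)=0, F(ac)=0, F(ad)=2, F(bc)=1, F(bd)=1, F(cd)=0
n=1: F(a)=0, F(b)=0, F(c)=0, F(d)=0
[Figure: the four triples and an optimal tree with leaves a, d, b, c]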
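The recurrence can be run directly on small inputs. The following is a minimal sketch, assuming a triple (x(yz)) is stored as the tuple `(x, y, z)` and subsets as frozensets; the function name is illustrative. It enumerates every bipartition of every subset, so its running time is exponential in the number of species.

```python
from itertools import combinations


def exact_mctt(species, triples):
    """Return {subset: F(subset)} for every non-empty subset of species."""
    species = sorted(species)
    F = {frozenset([s]): 0 for s in species}
    for size in range(2, len(species) + 1):     # small to large cardinality
        for combo in combinations(species, size):
            V = frozenset(combo)
            first, rest = combo[0], combo[1:]
            best = 0
            # Keep `first` in V1 so each bipartition is visited once.
            for r in range(len(rest)):
                for extra in combinations(rest, r):
                    V1 = frozenset((first,) + extra)
                    V2 = V - V1
                    # W(V1, V2): x outside V, y and z separated by the split.
                    w = sum(1 for x, y, z in triples
                            if x not in V and y in V and z in V
                            and (y in V1) != (z in V1))
                    best = max(best, F[V1] + F[V2] + w)
            F[V] = best
    return F


triples = [("a", "b", "c"), ("c", "a", "d"), ("b", "a", "d"), ("c", "b", "d")]
F = exact_mctt("abcd", triples)
print(F[frozenset("abcd")])
```

Recording the maximizing bipartition for each subset alongside F would allow the optimal tree itself to be rebuilt by following those choices from the full set downward.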
![Page 15: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/15.jpg)
Our new heuristic algorithm (DPWP)
Derived from the exact algorithm.
The number of subsets of each cardinality is limited by a parameter K.
When K = infinity, it is just the exact algorithm.
Time-quality trade-off.
The time complexity is O(n^2 K^2 (n^3 + K)).
– Sorry, there is a mistake in the paper.
![Page 16: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/16.jpg)
The experiment results (time)
[Figure: running time in seconds (logarithmic axis, 1 to 10000) versus n (12, 15, 18, 20, 24, 27, 30) for Exact, DPWP(300), DPWP(600), and DPWP(900)]
![Page 17: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/17.jpg)
Average ratio in the test
[Figure: average ratio (axis 0.8 to 1.2) versus n (12, 15, 18, 20) for BPMF, BOSF, MCSF, DPWP(300), DPWP(600), and DPWP(900)]
![Page 18: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/18.jpg)
Worst ratio in the test
[Figure: worst ratio (axis 0.8 to 1.6) versus n (12, 15, 18, 20) for BPMF, BOSF, MCSF, DPWP(300), DPWP(600), and DPWP(900)]
![Page 19: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/19.jpg)
Improvement: 100*(DPWP - BestofOther)/BestofOther
[Figure: maximum and average improvement in % (axis 0 to 20) versus n (18, 20, 24, 27, 30)]
![Page 20: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/20.jpg)
The MCST problem
Given triples over a species set S, find a subset U of S such that all given triples over U are compatible and |U| is maximum.
We show the problem is NP-hard.
– By reduction from the Feedback Vertex Set problem.
![Page 21: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/21.jpg)
The feedback vertex set problem
Feedback vertex set: a vertex subset containing at least one vertex of each cycle of the given directed graph.
– In other words, removing a feedback vertex set results in an acyclic digraph.
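The second formulation gives a simple test: delete the candidate set and check that what remains is acyclic. The sketch below does this with a standard topological-sort check (Kahn's algorithm); the function name and the edge-list representation of the digraph are illustrative assumptions. It only verifies a given set and does not search for a minimum one.

```python
def is_feedback_vertex_set(vertices, edges, removed):
    """True if deleting `removed` leaves the digraph without cycles."""
    keep = set(vertices) - set(removed)
    indegree = {v: 0 for v in keep}
    successors = {v: [] for v in keep}
    for u, v in edges:
        if u in keep and v in keep:
            successors[u].append(v)
            indegree[v] += 1

    # Kahn's algorithm: repeatedly delete vertices with no incoming edge.
    ready = [v for v in keep if indegree[v] == 0]
    deleted = 0
    while ready:
        u = ready.pop()
        deleted += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return deleted == len(keep)   # every vertex deleted means no cycle


vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 3)]
print(is_feedback_vertex_set(vertices, edges, {3}))
print(is_feedback_vertex_set(vertices, edges, {1}))
```

In this small digraph vertex 3 lies on both cycles, so removing it alone suffices, while removing vertex 1 leaves the cycle between 3 and 4.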
![Page 22: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/22.jpg)
The reduction
[Figure: the construction used in the reduction; labels in the figure: T1, T2, ..., Tp; x; r1, r2, ..., rp; V1, V2, V3, ..., Vp; x1, x2, ...]
![Page 23: Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University.](https://reader030.fdocuments.net/reader030/viewer/2022032709/56649eda5503460f94be94e3/html5/thumbnails/23.jpg)
Concluding remarks
What is the approximation ratio?
– The Best-One-Split-First algorithm is a 3-approximation algorithm.
– A larger K gives a better solution, but the theoretical bound on the ratio is not known.