Algorithms research
Tandy Warnow
UT-Austin
“Algorithms group”
• UT-Austin: Warnow, Hunt
• UCB: Rao, Karp, Papadimitriou, Russell, Myers
• UCSD: Huelsenbeck
• UNM: Moret, Bader, Williams
• External participants: Mossel (UCB), Huson (Germany), Steel (NZ), and others
Main research foci
• Solving maximum parsimony and maximum likelihood more effectively
• “Fast converging methods”
• Gene order and content phylogeny
• Reticulate evolution
• Multiple sequence alignment at the genomic level
GRAPPA (Genome Rearrangement Analysis under Parsimony and other Phylogenetic Algorithms)
http://www.cs.unm.edu/~moret/GRAPPA/
• Heuristics for NP-hard optimization problems
• Fast polynomial time distance-based methods
• Contributors: U. New Mexico, U. Texas at Austin, Università di Bologna, Italy
• Poster: Jijun Tang
Maximum Parsimony on Rearranged Genomes (MPRG)
• The leaves are rearranged genomes.
• Find the tree that minimizes the total number of rearrangement events.
[Figure: example tree over genomes A, B, C, D (additional node labels E, F) with edge lengths 3, 6, 2, 3, 4; total length = 18]
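The objective on this slide, the total number of rearrangement events summed over the edges of a tree, can be illustrated with a minimal sketch. It assumes signed, linear gene orders and uses the breakpoint distance as the per-edge measure (one of the criteria named later in the slides). The function names `adjacencies`, `breakpoint_distance`, and `tree_length` are hypothetical and are not taken from GRAPPA. The sketch also assumes the gene orders at internal nodes are already given; the actual MPRG problem has to find both the tree and those internal gene orders.

```python
def adjacencies(genome):
    """Adjacencies of a signed linear gene order, framed by 0 and n+1.

    Each adjacency (a, b) is stored together with its reverse-complement
    reading (-b, -a), so an inverted block still matches its original.
    """
    framed = [0] + list(genome) + [len(genome) + 1]
    adj = set()
    for a, b in zip(framed, framed[1:]):
        adj.add((a, b))
        adj.add((-b, -a))
    return adj


def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that are not adjacencies of g2."""
    adj2 = adjacencies(g2)
    framed = [0] + list(g1) + [len(g1) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if (a, b) not in adj2)


def tree_length(edges, genomes):
    """Sum the breakpoint distances over all edges of a labeled tree.

    edges   -- iterable of (node, node) pairs
    genomes -- dict mapping every node (leaf or internal) to a gene order
    """
    return sum(breakpoint_distance(genomes[u], genomes[v]) for u, v in edges)


# Toy data (illustrative only, not the tree drawn on the slide).
toy_genomes = {
    "A": [1, 2, 3, 4],
    "B": [1, -3, -2, 4],
    "C": [1, 2, -3, 4],
    "D": [1, 2, -3, 4],
    "E": [1, 2, 3, 4],   # internal node, assumed known here
    "F": [1, 2, -3, 4],  # internal node, assumed known here
}
toy_edges = [("A", "E"), ("B", "E"), ("E", "F"), ("C", "F"), ("D", "F")]

if __name__ == "__main__":
    print(tree_length(toy_edges, toy_genomes))
```

A search procedure would evaluate this length for many candidate trees and internal labelings and keep the smallest one; that search is the NP-hard part.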
Benchmark gene order dataset: Campanulaceae
• 12 genomes + 1 outgroup (Tobacco), 105 gene segments
• NP-hard optimization problems: breakpoint and inversion phylogenies
1997: BPAnalysis (Blanchette and Sankoff): 200 years (est.)
2000: Using GRAPPA v1.1 on the 512-processor Los Lobos Supercluster machine: 2 minutes (200,000-fold speedup per processor)
2003: Using latest version of GRAPPA: 2 minutes on a single processor (1-billion-fold speedup per processor)
Reticulate Evolution
• Group leader: Randy Linder
• Software: (1) producing random networks, (2) simulating sequences down networks, (3) performance evaluation of methods, (4) inferring reticulate networks
• Current reconstruction methods limited to one reticulation event
• Poster: Luay Nakhleh
20-taxon 1-hybrid network. 0.1 scaling factor.
MP/ML heuristics
• Disk-Covering Methods (DCMs): Divide-and-conquer strategies that boost the performance of base methods for MP/ML (Warnow)
• MrBayes (Huelsenbeck)
• New I-DCM3 technique improves upon the Ratchet and TBR
• Poster: Usman Roshan (DCM-MP)
Gutell dataset: 854 rRNA sequences
Iterative-DCM3 trials find trees of MP score 103210 in 30 hours, whereas ratchet500 trials take 45 hours to find trees of the same score.
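For readers unfamiliar with the "MP score" being compared here, the following is a minimal sketch of how a parsimony score can be computed for one fixed tree using Fitch's algorithm. It is not the I-DCM3 or ratchet search itself, which explore many trees; it only shows the quantity those heuristics try to minimize. It assumes a rooted binary tree written as nested tuples of leaf names and an ungapped alignment; the names `fitch_site` and `parsimony_score` are hypothetical.

```python
def fitch_site(node, states):
    """Fitch's bottom-up pass for one alignment column.

    node   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each leaf name to its character at this column
    Returns (candidate state set at this node, number of changes below it).
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed.
        return common, left_cost + right_cost
    # The children cannot agree: one change is charged at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch cost over all columns of an aligned set of sequences."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Toy alignment (illustrative only, unrelated to the Gutell dataset).
toy_alignment = {"A": "AC", "B": "AC", "C": "GC", "D": "GT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

if __name__ == "__main__":
    print(parsimony_score(tree_1, toy_alignment))
    print(parsimony_score(tree_2, toy_alignment))
```

On the toy data the first tree needs fewer changes than the second, so a maximum parsimony search would prefer it.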
Other planned projects (partial list)
• Multiple Sequence Alignment (Myers and Williams)
• Steiner Tree algorithms - error bounds and new heuristics (Rao)
• MCMC methods (Russell and Huelsenbeck)
• Symbolic representation of data (Hunt)
• Parallel algorithms (Bader and Williams)
Questions for group
• How should we measure performance?
• How should we use simulated data?
• How should we use real datasets?
• How can we study criteria (MP, ML, etc.) as opposed to methods?
• Should we sponsor DIMACS-style challenges?
• Others? (Please bring questions, comments, and answers to the break-out session.)