|
Bioinformatics Research Group |
|
||||||||||
Research:
Graduate studies:
Undergraduate studies:
Lab information:
Photographs
Contact:
E-mail:
Dan Brown |
David Bryant, Vincent Berry, Paul Kearney, Ming Li, Tao Jiang, Todd Wareham, Haoyong Zhang. A Practical Algorithm for Recovering the Best Supported Edges of an Evolutionary Tree. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 287-296, San Francisco, California, January 9-11 2000. Download preprint: 00hyperclean.ps, 240Kb Download from publisher: not available Related www page: not available Bibliography entry: BibTeX Abstract: It is now routine for biologists to conduct evolutionary analyses of large DNA and protein sequence datasets. A computational bottleneck in these analyses is the recovery of the topology of the evolutionary tree for a set of sequences. This paper presents a practical solution to this challenging problem. In particular, a new technique, called hypercleaning, is presented that can be combined with various tree-building algorithms to efficiently reconstruct from sequence data the best supported edges of the evolutionary tree. More precisely, the hypercleaning technique computes from sequence data a small subset of edges that is likely to contain most edges of the correct tree. A tree-building algorithm then attempts to identify edges in the subset that are compatible with each other and hereby produces an evolutionary tree. We also propose a simple greedy agorithm that builds a tree by screening the edges provided by hypercleaning in the decreasing order of support from sequence data. This technique is a substantial improvement over previous algorithms in its ability to recover edges of the evolutionary tree. Hypercleaning also incorporates a detailed error model that relates errors in the data to the topology of the evolutionary tree. The results of a simulation study that strongly support the practicality, efficiency and effectiveness of hypercleaning are also presented. |