### Abstract

Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attentions in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability. Results: The first algorithm can reconstruct the "true" phylogeny from the input quartet topology set without quartet errors in O(n^{2}) time by querying at most (n - 4) log(n - 1) quartet topologies, where n is the number of the taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with a probability approximately 1 - p in O(n^{4}log n) time, where p is the probability for a quartet topology being an error. This probability is improved by the third algorithm to approximately 1/1+q^{2}+1/2q^{4} +1/16q^{5}, where q=p/1-p with running time of O(n^{5}), which is at least 0.984 when p < 0.05. Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true " phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy.

Original language | English (US) |
---|---|

Article number | 1 |

Journal | Algorithms for Molecular Biology |

Volume | 3 |

Issue number | 1 |

DOIs | |

State | Published - Jan 24 2008 |

### Fingerprint

### ASJC Scopus subject areas

- Computational Theory and Mathematics
- Applied Mathematics
- Molecular Biology
- Structural Biology

### Cite this

*Algorithms for Molecular Biology*,

*3*(1), [1]. https://doi.org/10.1186/1748-7188-3-1

}

*Algorithms for Molecular Biology*, vol. 3, no. 1, 1. https://doi.org/10.1186/1748-7188-3-1

**Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability.** / Wu, Gang; Kao, Ming-Yang; Lin, Guohui; You, Jia Huai.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability

AU - Wu, Gang

AU - Kao, Ming-Yang

AU - Lin, Guohui

AU - You, Jia Huai

PY - 2008/1/24

Y1 - 2008/1/24

N2 - Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attentions in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability. Results: The first algorithm can reconstruct the "true" phylogeny from the input quartet topology set without quartet errors in O(n2) time by querying at most (n - 4) log(n - 1) quartet topologies, where n is the number of the taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with a probability approximately 1 - p in O(n4log n) time, where p is the probability for a quartet topology being an error. This probability is improved by the third algorithm to approximately 1/1+q2+1/2q4 +1/16q5, where q=p/1-p with running time of O(n5), which is at least 0.984 when p < 0.05. Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true " phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy.

AB - Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attentions in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability. Results: The first algorithm can reconstruct the "true" phylogeny from the input quartet topology set without quartet errors in O(n2) time by querying at most (n - 4) log(n - 1) quartet topologies, where n is the number of the taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with a probability approximately 1 - p in O(n4log n) time, where p is the probability for a quartet topology being an error. This probability is improved by the third algorithm to approximately 1/1+q2+1/2q4 +1/16q5, where q=p/1-p with running time of O(n5), which is at least 0.984 when p < 0.05. Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true " phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy.

UR - http://www.scopus.com/inward/record.url?scp=40649125833&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=40649125833&partnerID=8YFLogxK

U2 - 10.1186/1748-7188-3-1

DO - 10.1186/1748-7188-3-1

M3 - Article

C2 - 18218120

AN - SCOPUS:40649125833

VL - 3

JO - Algorithms for Molecular Biology

JF - Algorithms for Molecular Biology

SN - 1748-7188

IS - 1

M1 - 1

ER -