TY - GEN
T1 - A heuristic for phylogenetic reconstruction using transposition
AU - Yue, Feng
AU - Zhang, Meng
AU - Tang, Jijun
PY - 2007
Y1 - 2007
AB - Because of the advent of high-throughput sequencing and the consequent reduction in the cost of sequencing, many organisms have been completely sequenced and most of their genes identified; homologies among these genes are also being established. It has thus become possible to represent whole genomes as ordered lists of gene identifiers and to study the evolution of these entities through computational means, in systematics as well as in comparative genomics. As a result, gene order data (also known as genome rearrangement data) has attracted increasing attention from both biologists and computer scientists as a new type of data for phylogenetic analysis. Methods for reconstructing phylogenies from genome rearrangements include distance-based methods, MCMC methods, and direct optimization methods. The last of these, pioneered by Sankoff and extended in the software packages GRAPPA and MGR, is the most accurate approach for inversion phylogeny. However, because the transposition distance is difficult to compute, this type of method has not been applied to datasets where transposition is the only or dominant event. In this paper, we present a heuristic transposition median solver and extend GRAPPA to handle transpositions. Our extensive testing on simulated datasets shows that this method (GRAPPA-TP) is very accurate in terms of ancestral genome inference and phylogenetic reconstruction. It also suggests that model match is critical in phylogenetic analysis and that a fast and accurate method for transposition distance computation remains very important. The new GRAPPA-TP is available from phylo.cse.sc.edu.
UR - http://www.scopus.com/inward/record.url?scp=47649100233&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47649100233&partnerID=8YFLogxK
U2 - 10.1109/BIBE.2007.4375652
DO - 10.1109/BIBE.2007.4375652
M3 - Conference contribution
AN - SCOPUS:47649100233
SN - 1424415098
SN - 9781424415090
T3 - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
SP - 802
EP - 808
BT - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
T2 - 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
Y2 - 14 January 2007 through 17 January 2007
ER -