TY - JOUR
T1 - Multiple sequence alignment
AU - Bacon, David J.
AU - Anderson, Wayne F.
N1 - Copyright 2014 Elsevier B.V. All rights reserved.
PY - 1986/9/20
Y1 - 1986/9/20
N2 - A method has been developed for aligning segments of several sequences at once. The number of search steps depends only polynomially on the number of sequences, instead of exponentially, because most alignments are rejected without being evaluated explicitly. A data structure herein called the "heap" facilitates this process. For a set of n sequence segments, the overall similarity is taken to be the sum of all the constituent segment pair similarities, which are in turn sums of corresponding residue similarity scores from a Table. The statistical models that test alignments for significance make it possible to group sequences objectively, even when most or all of the interrelationships are weak. These tests are very sensitive, while remaining quite conservative, and discourage the addition of "misfit" sequences to an existing set. The new techniques are applied to a set of five DNA-binding proteins, to a group of three enzymes that employ the coenzyme FAD, and to a control set. The alignment previously proposed for the DNA-binding proteins on the basis of structural comparisons and inspection of sequences is supported quite dramatically, and a highly significant alignment is found for the FAD-binding proteins.
UR - http://www.scopus.com/inward/record.url?scp=0022552744&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0022552744&partnerID=8YFLogxK
U2 - 10.1016/0022-2836(86)90252-4
DO - 10.1016/0022-2836(86)90252-4
M3 - Article
C2 - 3806669
AN - SCOPUS:0022552744
SN - 0022-2836
VL - 191
SP - 153
EP - 161
JO - Journal of Molecular Biology
JF - Journal of Molecular Biology
IS - 2
ER -