TY - GEN
T1 - A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model
AU - Aspnes, James
AU - Hartling, Julia
AU - Kao, Ming-Yang
AU - Kim, Junhyong
AU - Shah, Gauri
PY - 2001
Y1 - 2001
AB - In modern biology, one of the most important research problems is to understand how protein sequences fold into their native 3D structures. To investigate this problem at a high level, one wishes to analyze the protein landscapes, i.e., the structures of the space of all protein sequences and their native 3D structures. Perhaps the most basic computational problem at this level is to take a target 3D structure as input and design a fittest protein sequence with respect to one or more fitness functions of the target 3D structure. We develop a toolbox of combinatorial techniques for protein landscape analysis in the Grand Canonical model of Sun, Brem, Chan, and Dill. The toolbox is based on linear programming, network flow, and a linear-size representation of all minimum cuts of a network. It not only substantially expands the network flow technique for protein sequence design in Kleinberg's seminal work but also is applicable to a considerably broader collection of computational problems than those considered by Kleinberg. We have used this toolbox to obtain a number of efficient algorithms and hardness results. We have further used the algorithms to analyze 3D structures drawn from the Protein Data Bank and have discovered some novel relationships between such native 3D structures and the Grand Canonical model.
UR - http://www.scopus.com/inward/record.url?scp=71049129500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=71049129500&partnerID=8YFLogxK
U2 - 10.1007/3-540-45678-3_35
DO - 10.1007/3-540-45678-3_35
M3 - Conference contribution
AN - SCOPUS:71049129500
SN - 3540429859
SN - 9783540429852
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 403
EP - 415
BT - Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings
T2 - 12th International Symposium on Algorithms and Computation, ISAAC 2001
Y2 - 19 December 2001 through 21 December 2001
ER -