A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model

James Aspnes*, Julia Hartling, Kao Ming-Yang, Junhyong Kim, Gauri Shah

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

In modern biology, one of the most important research problems is to understand how protein sequences fold into their native 3D structures. To investigate this problem at a high level, one wishes to analyze the protein landscapes, i.e., the structures of the space of all protein sequences and their native 3D structures. Perhaps the most basic computational problem at this level is to take a target 3D structure as input and design a fittest protein sequence with respect to one or more fitness functions of the target 3D structure. We develop a toolbox of combinatorial techniques for protein landscape analysis in the Grand Canonical model of Sun, Brem, Chan, and Dill. The toolbox is based on linear programming, network flow, and a linear-size representation of all minimum cuts of a network. It not only substantially expands the network flow technique for protein sequence design in Kleinberg's seminal work but also is applicable to a considerably broader collection of computational problems than those considered by Kleinberg. We have used this toolbox to obtain a number of efficient algorithms and hardness results. We have further used the algorithms to analyze 3D structures drawn from the Protein Data Bank and have discovered some novel relationships between such native 3D structures and the Grand Canonical model.

Original languageEnglish (US)
Title of host publicationAlgorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings
Pages403-415
Number of pages13
DOIs
StatePublished - Dec 1 2001
Event12th International Symposium on Algorithms and Computation, ISAAC 2001 - Christchurch, New Zealand
Duration: Dec 19 2001Dec 21 2001

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume2223 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other12th International Symposium on Algorithms and Computation, ISAAC 2001
CountryNew Zealand
CityChristchurch
Period12/19/0112/21/01

Fingerprint

Canonical Model
Protein Sequence
Proteins
Network Flow
Protein
Minimum Cut
Target
Fitness Function
Design
Sun
Linear programming
Hardness
Expand
Biology
Fold
Efficient Algorithms

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Aspnes, J., Hartling, J., Ming-Yang, K., Kim, J., & Shah, G. (2001). A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model. In Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings (pp. 403-415). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2223 LNCS). https://doi.org/10.1007/3-540-45678-3_35
Aspnes, James ; Hartling, Julia ; Ming-Yang, Kao ; Kim, Junhyong ; Shah, Gauri. / A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model. Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings. 2001. pp. 403-415 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{4c3013b104b64471b7427199bd913088,
title = "A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model",
abstract = "In modern biology, one of the most important research problems is to understand how protein sequences fold into their native 3D structures. To investigate this problem at a high level, one wishes to analyze the protein landscapes, i.e., the structures of the space of all protein sequences and their native 3D structures. Perhaps the most basic computational problem at this level is to take a target 3D structure as input and design a fittest protein sequence with respect to one or more fitness functions of the target 3D structure. We develop a toolbox of combinatorial techniques for protein landscape analysis in the Grand Canonical model of Sun, Brem, Chan, and Dill. The toolbox is based on linear programming, network flow, and a linear-size representation of all minimum cuts of a network. It not only substantially expands the network flow technique for protein sequence design in Kleinberg's seminal work but also is applicable to a considerably broader collection of computational problems than those considered by Kleinberg. We have used this toolbox to obtain a number of efficient algorithms and hardness results. We have further used the algorithms to analyze 3D structures drawn from the Protein Data Bank and have discovered some novel relationships between such native 3D structures and the Grand Canonical model.",
author = "James Aspnes and Julia Hartling and Kao Ming-Yang and Junhyong Kim and Gauri Shah",
year = "2001",
month = "12",
day = "1",
doi = "10.1007/3-540-45678-3_35",
language = "English (US)",
isbn = "3540429859",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "403--415",
booktitle = "Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings",

}

Aspnes, J, Hartling, J, Ming-Yang, K, Kim, J & Shah, G 2001, A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model. in Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2223 LNCS, pp. 403-415, 12th International Symposium on Algorithms and Computation, ISAAC 2001, Christchurch, New Zealand, 12/19/01. https://doi.org/10.1007/3-540-45678-3_35

A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model. / Aspnes, James; Hartling, Julia; Ming-Yang, Kao; Kim, Junhyong; Shah, Gauri.

Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings. 2001. p. 403-415 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2223 LNCS).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model

AU - Aspnes, James

AU - Hartling, Julia

AU - Ming-Yang, Kao

AU - Kim, Junhyong

AU - Shah, Gauri

PY - 2001/12/1

Y1 - 2001/12/1

N2 - In modern biology, one of the most important research problems is to understand how protein sequences fold into their native 3D structures. To investigate this problem at a high level, one wishes to analyze the protein landscapes, i.e., the structures of the space of all protein sequences and their native 3D structures. Perhaps the most basic computational problem at this level is to take a target 3D structure as input and design a fittest protein sequence with respect to one or more fitness functions of the target 3D structure. We develop a toolbox of combinatorial techniques for protein landscape analysis in the Grand Canonical model of Sun, Brem, Chan, and Dill. The toolbox is based on linear programming, network flow, and a linear-size representation of all minimum cuts of a network. It not only substantially expands the network flow technique for protein sequence design in Kleinberg's seminal work but also is applicable to a considerably broader collection of computational problems than those considered by Kleinberg. We have used this toolbox to obtain a number of efficient algorithms and hardness results. We have further used the algorithms to analyze 3D structures drawn from the Protein Data Bank and have discovered some novel relationships between such native 3D structures and the Grand Canonical model.

AB - In modern biology, one of the most important research problems is to understand how protein sequences fold into their native 3D structures. To investigate this problem at a high level, one wishes to analyze the protein landscapes, i.e., the structures of the space of all protein sequences and their native 3D structures. Perhaps the most basic computational problem at this level is to take a target 3D structure as input and design a fittest protein sequence with respect to one or more fitness functions of the target 3D structure. We develop a toolbox of combinatorial techniques for protein landscape analysis in the Grand Canonical model of Sun, Brem, Chan, and Dill. The toolbox is based on linear programming, network flow, and a linear-size representation of all minimum cuts of a network. It not only substantially expands the network flow technique for protein sequence design in Kleinberg's seminal work but also is applicable to a considerably broader collection of computational problems than those considered by Kleinberg. We have used this toolbox to obtain a number of efficient algorithms and hardness results. We have further used the algorithms to analyze 3D structures drawn from the Protein Data Bank and have discovered some novel relationships between such native 3D structures and the Grand Canonical model.

UR - http://www.scopus.com/inward/record.url?scp=71049129500&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=71049129500&partnerID=8YFLogxK

U2 - 10.1007/3-540-45678-3_35

DO - 10.1007/3-540-45678-3_35

M3 - Conference contribution

AN - SCOPUS:71049129500

SN - 3540429859

SN - 9783540429852

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 403

EP - 415

BT - Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings

ER -

Aspnes J, Hartling J, Ming-Yang K, Kim J, Shah G. A combinatorial toolbox for protein sequence design and landscape analysis in the grand canonical model. In Algorithms and Computation - 12th International Symposium, ISAAC 2001, Proceedings. 2001. p. 403-415. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/3-540-45678-3_35