Abstract
Peptide materials have a wide array of functions, from tissue engineering and surface coatings to catalysis and sensing. Tuning the sequence of amino acids that comprise the peptide modulates peptide functionality, but a small increase in sequence length leads to a dramatic increase in the number of peptide candidates. Traditionally, peptide design is guided by human expertise and intuition and typically yields fewer than ten peptides per study, but these approaches are not easily scalable and are susceptible to human bias. Here we introduce a machine learning workflow (AI-expert) that combines Monte Carlo tree search and random forest with molecular dynamics simulations to develop a fully autonomous computational search engine to discover peptide sequences with high potential for self-assembly. We demonstrate that AI-expert efficiently searches large spaces of tripeptides and pentapeptides. Its predictive performance is on par with or better than that of our human experts, and it suggests several non-intuitive sequences with high self-assembly propensity, outlining its potential to overcome human bias and accelerate peptide discovery.
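To make the scale of the search problem and the tree-search idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the 20 canonical amino acids as the alphabet and uses a hypothetical `toy_score` function (fraction of aromatic residues) purely as a placeholder for the simulation-derived self-assembly score; the random forest surrogate and the molecular dynamics simulations named in the abstract are omitted. The names `search_space_size`, `toy_score`, `Node` and `mcts` are illustrative.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical residues, one-letter codes


def search_space_size(length: int) -> int:
    """Number of distinct peptide sequences of a given length."""
    return len(AMINO_ACIDS) ** length


def toy_score(seq: str) -> float:
    """Hypothetical placeholder score: fraction of aromatic residues (F, W, Y).

    This is NOT the metric used in the paper; it only gives the search a target.
    """
    return sum(r in "FWY" for r in seq) / len(seq)


class Node:
    """Tree node holding a partial sequence (prefix) and visit statistics."""

    def __init__(self, prefix: str, parent=None):
        self.prefix = prefix
        self.parent = parent
        self.children = {}  # residue -> Node
        self.visits = 0
        self.total = 0.0

    def ucb(self, c: float) -> float:
        # Mean reward plus an exploration bonus (UCB1 form).
        mean = self.total / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)


def mcts(length: int, iterations: int, score=toy_score, c: float = 1.0, seed: int = 0):
    """Return the best (sequence, score) pair seen during the search."""
    rng = random.Random(seed)
    root = Node("")
    best_seq, best_val = None, -math.inf
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while len(node.prefix) < length and len(node.children) == len(AMINO_ACIDS):
            node = max(node.children.values(), key=lambda n: n.ucb(c))
        # Expansion: add one untried residue if the sequence is incomplete.
        if len(node.prefix) < length:
            untried = [a for a in AMINO_ACIDS if a not in node.children]
            residue = rng.choice(untried)
            child = Node(node.prefix + residue, node)
            node.children[residue] = child
            node = child
        # Rollout: complete the sequence with random residues and score it.
        remaining = length - len(node.prefix)
        seq = node.prefix + "".join(rng.choice(AMINO_ACIDS) for _ in range(remaining))
        value = score(seq)
        if value > best_val:
            best_seq, best_val = seq, value
        # Backpropagation: update statistics from the leaf up to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    return best_seq, best_val


if __name__ == "__main__":
    print(search_space_size(3))  # 8000 tripeptides
    print(search_space_size(5))  # 3200000 pentapeptides
    print(mcts(length=5, iterations=2000))
```

Under these assumptions, moving from tripeptides to pentapeptides grows the candidate pool from 8,000 to 3.2 million sequences, which is why an exhaustive evaluation becomes impractical and a guided search is attractive. In a fuller workflow, `score` would be replaced by a learned surrogate or a simulation result rather than a hand-written heuristic.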
Original language | English (US) |
---|---|
Pages (from-to) | 1427-1435 |
Number of pages | 9 |
Journal | Nature Chemistry |
Volume | 14 |
Issue number | 12 |
State | Published - Dec 2022 |
Funding
Work performed at the Center for Nanoscale Materials, a US Department of Energy (DOE) Office of Science User Facility, was supported by the US DOE, Office of Basic Energy Sciences, under contract no. DE-AC02-06CH11357, and additionally supported by the University of Chicago and the DOE under DOE contract no. DE-AC02-06CH11357 awarded to UChicago Argonne, LLC, operator of the Argonne National Laboratory. This material is based on work supported by the DOE, Office of Science, BES Data, Artificial Intelligence and Machine Learning at DOE Scientific User Facilities programme (Digital Twins). We gratefully acknowledge the computing resources provided on Bebop, the high-performance computing clusters operated by the Laboratory Computing Resource Center (LCRC) at Argonne National Laboratory. S.K.R.S.S. acknowledges support from the UIC faculty start-up fund. We acknowledge T. Tuttle for sharing computational data on tripeptides.
ASJC Scopus subject areas
- General Chemistry
- General Chemical Engineering