Accelerated enzyme engineering by machine-learning guided cell-free expression

Grant M. Landwehr, Jonathan W. Bogart, Carol Magalhaes, Eric G. Hammarlund, Ashty S. Karim*, Michael Christopher Jewett*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

30 Scopus citations

Abstract

Enzyme engineering is limited by the challenge of rapidly generating and using large datasets of sequence-function relationships for predictive design. To address this challenge, we develop a machine learning (ML)-guided platform that integrates cell-free DNA assembly, cell-free gene expression, and functional assays to rapidly map fitness landscapes across protein sequence space and optimize enzymes for multiple, distinct chemical reactions. We apply this platform to engineer amide synthetases by evaluating substrate preference for 1,217 enzyme variants in 10,953 unique reactions. We use these data to build augmented ridge regression ML models for predicting amide synthetase variants capable of making nine small-molecule pharmaceuticals. Across these nine compounds, ML-predicted enzyme variants demonstrate 1.6- to 42-fold improved activity relative to the parent. Our ML-guided, cell-free framework promises to accelerate enzyme engineering by enabling iterative exploration of protein sequence space to build specialized biocatalysts in parallel.

Original language: English (US)
Article number: 865
Journal: Nature Communications
Volume: 16
Issue number: 1
State: Published - Dec 2025

Funding

We thank Kosuke Seki, Andrew C. Hunt, and Steve R. Fleming for conversations regarding this work. We acknowledge the use of the Keck Biophysics Facility, a shared resource of the Robert H. Lurie Comprehensive Cancer Center of Northwestern University supported in part by the NCI Cancer Center Support Grant #P30 CA060553. In addition, we acknowledge the use of the computational resources and staff contributions provided for the Quest high-performance computing facility at Northwestern University, which is jointly supported by the Office of the Provost, the Office for Research, and Northwestern University Information Technology. This work also made use of the IMSERC NMR facility at Northwestern University, which has received support from the Soft and Hybrid Nanotechnology Experimental (SHyNE) Resource (NSF ECCS-2025633), Int. Institute of Nanotechnology, and Northwestern University. Molecular graphics and analyses were performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from National Institutes of Health R01-GM129325 and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases. M.C.J. acknowledges support from the Department of Energy Grant DE-SC0023278, the Defense Threat Reduction Agency Grant DTRA1-21-1-0038, the National Institutes of Health Grant 1U19AI142780-01, and the LDRD Program at Sandia National Laboratories Grant DE-NA0003525.

ASJC Scopus subject areas

  • General Chemistry
  • General Biochemistry, Genetics and Molecular Biology
  • General Physics and Astronomy
