The Rocklin lab at Northwestern will support this proposal in two areas.

1. De novo peptide sequencing by liquid chromatography and tandem mass spectrometry (Aim 3.2, 3.3)

The Chloroalkane Penetration Screen (CAPS) assay described in Aim 3 enables active (highly cytosol-penetrant) peptides to be separated from inactive peptides in a large screening library. However, because the assay is conducted in a pooled format, the active peptides remain mixed with each other. After receiving a mixture of active peptides isolated in a CAPS experiment in the Kritzer lab, the Rocklin lab will identify the active peptides using de novo peptide sequencing by mass spectrometry. First, the proteomics core at Northwestern will perform liquid chromatography and tandem mass spectrometry (LC-MS/MS) analysis of the sample using a Thermo Q Exactive HF mass spectrometer. My lab will then analyze the LC-MS/MS data using the PEAKS software to determine the sequences of the active peptides. We will then report these data back to the Kritzer lab.

Rocklin lab postdoctoral fellow Dr. Jane Thibeault will perform this analysis. Dr. Thibeault is an expert in proteome-scale de novo peptide sequencing. During her first postdoctoral appointment at the Institute for Systems Biology, Dr. Thibeault authored the manuscript "Identification of Species with Unknown Genomes through Enhanced Fragmentation and Spectral Quality for De Novo Searching Approaches" (expected submission Spring 2021). Dr. Thibeault will train future members of the Rocklin lab in this analysis over the course of the project.

2. Machine learning analysis of active and inactive peptides and anti-sense oligonucleotides to enable data-driven design (Aim 3.5)

The Chloroalkane Penetration Screen (CAPS) will provide more than 100,000 measurements of cytosol-penetrating activity for different molecule classes. The Rocklin lab will analyze these data to develop predictive models of cytosol-penetrating activity.
First, in collaboration with the Kritzer lab, we will develop a set of physical and chemical descriptors for the screening sets of peptides and anti-sense oligonucleotides. Preliminarily, these descriptors will include molecular size, hydrophobicity, net charge, base or amino acid composition, predominance of short sequence motifs, and potential for intramolecular interactions such as salt bridges or base pairing.

Second, the Rocklin lab will train supervised machine learning models to predict the experimentally measured cytosol-penetrating activity based on these descriptors. The Rocklin lab regularly uses supervised machine learning (including linear/logistic regression models, random forest models, and Gaussian process models) for a range of problems in molecular design, including optimizing protein folding stability, binding affinity, expression level, and aggregation propensity. Our previous work (Rocklin et al., Science 2017) demonstrated how these models could serve as powerful tools for optimizing designed protein stability when trained using datasets of several thousand folding stability measurements.

Finally, we will collaborate with the Kritzer lab in performing iterative cycles of molecular design, high-throughput testing using CAPS, and machine learning to optimize the determinants of delivery. In each cycle, we will use the previous high-throughput data and predictive models to (1) design new libraries with even higher predicted cytosol-penetrating activity, and (2) design new libraries that specifically explore the chemical space where the current-generation mo
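
As an illustration of the descriptor step described above, the following is a minimal sketch of how a few of the listed peptide descriptors (size, hydrophobicity, net charge, amino acid composition) might be computed from sequence alone. It is not the Rocklin or Kritzer lab's actual pipeline. The function name `describe_peptide` is hypothetical, hydrophobicity is approximated by the mean Kyte-Doolittle hydropathy value, and net charge is a crude count (+1 for K and R, -1 for D and E) that ignores histidine, the termini, and any non-canonical residues or chemical modifications a screening library may contain.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale for the 20 canonical amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: crude integer side-chain charges near neutral pH.
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def describe_peptide(sequence):
    """Return simple sequence-based descriptors for one peptide.

    Hypothetical helper for illustration only; handles canonical
    one-letter amino acid codes and nothing else.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-canonical residues: {sorted(unknown)}")

    counts = Counter(seq)
    length = len(seq)
    return {
        "length": length,
        "net_charge": sum(SIDE_CHAIN_CHARGE.get(aa, 0) for aa in seq),
        "mean_hydropathy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / length,
        # Fraction of each residue type present in the peptide.
        "composition": {aa: n / length for aa, n in counts.items()},
    }


if __name__ == "__main__":
    # Example: an arginine-rich peptide and a mixed-charge peptide.
    for peptide in ("RRRR", "KRDE"):
        print(peptide, describe_peptide(peptide))
```

Descriptor dictionaries of this kind could then be assembled into a feature table, one row per library member, and paired with the measured CAPS activities as input to a supervised model such as the regression or random forest models mentioned above.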
Effective start/end date: 3/1/22 → 2/28/26
- Tufts University (102184-00001//2R01GM127585-05)
- National Institute of General Medical Sciences (102184-00001//2R01GM127585-05)