TY - JOUR
T1 - Machine Learning on Signal-to-Noise Ratios Improves Peptide Array Design in SAMDI Mass Spectrometry
AU - Xue, Albert Y.
AU - Szymczak, Lindsey C.
AU - Mrksich, Milan
AU - Bagheri, Neda
N1 - Publisher Copyright: © 2017 American Chemical Society.
PY - 2017/9/5
Y1 - 2017/9/5
AB - Emerging peptide array technologies are able to profile molecular activities within cell lysates. However, the structural diversity of peptides leads to inherent differences in peptide signal-to-noise ratios (S/N). These complex effects can lead to potentially unrepresentative signal intensities and can bias subsequent analyses. Within mass spectrometry-based peptide technologies, the relation between a peptide's amino acid sequence and S/N remains largely nonquantitative. To address this challenge, we present a method to quantify and analyze mass spectrometry S/N of two peptide arrays, and we use this analysis to portray quality of data and to design future arrays for SAMDI mass spectrometry. Our study demonstrates that S/N varies significantly across peptides within peptide arrays, and variation in S/N is attributable to differences of single amino acids. We apply supervised machine learning to predict peptide S/N based on amino acid sequence, and identify specific physical properties of the amino acids that govern variation of this metric. We find low peptide-S/N concordance between arrays, demonstrating that different arrays require individual characterization and that global peptide-S/N relationships are difficult to identify. However, with proper peptide sampling, this study illustrates how machine learning can accurately predict the S/N of a peptide in an array, allowing for the efficient design of arrays through selection of high S/N peptides.
UR - http://www.scopus.com/inward/record.url?scp=85028939676&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85028939676&partnerID=8YFLogxK
U2 - 10.1021/acs.analchem.7b01728
DO - 10.1021/acs.analchem.7b01728
M3 - Article
C2 - 28719743
AN - SCOPUS:85028939676
SN - 0003-2700
VL - 89
SP - 9039
EP - 9047
JO - Analytical Chemistry
JF - Analytical Chemistry
IS - 17
ER -