Abstract
The trends of the last 20 years in biotechnology were revealed using artificial intelligence and natural language processing (NLP) of publicly available data. Implementing this “science-of-science” approach, we capture convergent trends in the field of proteomics in both technology development and application across the phylogenetic tree of life. With major gaps in our knowledge about protein composition, structure, and location over time, we report trends in persistent, popular approaches and emerging technologies across 94 ideas from a corpus of 29 journals in PubMed over two decades. New metrics for clusters of these ideas reveal the progression and popularity of emerging approaches like single-cell, spatial, compositional, and chemical proteomics designed to better capture protein-level chemistry and biology. This analysis of the proteomics literature with advanced analytic tools quantifies the Rate of Rise for a next generation of technologies to better define, quantify, and visualize the multiple dimensions of the proteome that will transform our ability to measure and understand proteins in the coming decade.
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 523-531 |
Number of pages | 9 |
Journal | Journal of Proteome Research |
Volume | 23 |
Issue number | 2 |
State | Published - Feb 2 2024 |
Funding
The authors thank Craig Crews for helpful discussions. N.L.K. acknowledges the National Institutes of Health for supporting a Biotechnology Development and Dissemination Center under grant number P41 GM108569 and for financial support under grant number UH3 CA246635.
Keywords
- artificial intelligence
- chemical proteomics
- consilience (convergent evidence)
- data mining
- genomics
- literature trends
- natural language processing
- prediction
- proteins
- proteoforms
- proteomics
- single molecule protein sequencing
- single-cell biology
- spatial biology
- structural proteomics
ASJC Scopus subject areas
- General Chemistry
- Biochemistry