Parsing 20 Years of Public Data by AI Maps Trends in Proteomics and Forecasts Technology

Josiah J. Green, Chase Grimm, Andre Fristo, Joseph Byrum*, Neil L. Kelleher*

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review


The trends of the last 20 years in biotechnology were revealed using artificial intelligence and natural language processing (NLP) of publicly available data. Implementing this “science-of-science” approach, we capture convergent trends in the field of proteomics in both technology development and application across the phylogenetic tree of life. With major gaps in our knowledge about protein composition, structure, and location over time, we report trends in persistent, popular approaches and emerging technologies across 94 ideas from a corpus of 29 journals in PubMed over two decades. New metrics for clusters of these ideas reveal the progression and popularity of emerging approaches like single-cell, spatial, compositional, and chemical proteomics designed to better capture protein-level chemistry and biology. This analysis of the proteomics literature with advanced analytic tools quantifies the Rate of Rise for a next generation of technologies to better define, quantify, and visualize the multiple dimensions of the proteome that will transform our ability to measure and understand proteins in the coming decade.

Original language: English (US)
Pages (from-to): 523-531
Number of pages: 9
Journal: Journal of Proteome Research
Issue number: 2
State: Published - Feb 2 2024


  • artificial intelligence
  • chemical proteomics
  • consilience (convergent evidence)
  • data mining
  • genomics
  • literature trends
  • natural language processing
  • prediction
  • proteins
  • proteoforms
  • proteomics
  • single molecule protein sequencing
  • single-cell biology
  • spatial biology
  • structural proteomics

ASJC Scopus subject areas

  • General Chemistry
  • Biochemistry

