TY - JOUR
T1 - A dataset of publication records for Nobel laureates
AU - Li, Jichao
AU - Yin, Yian
AU - Fortunato, Santo
AU - Wang, Dashun
N1 - Funding Information:
The authors thank L. Liu, Y. Wang, Y. Ma, B. Uzzi, and all members of Northwestern Institute on Complex Systems (NICO) for invaluable comments. This work is supported by the Air Force Office of Scientific Research under award number FA9550-15-1-0162 and FA9550-17-1-0089, National Science Foundation grant SBE 1829344 and Northwestern University’s Data Science Initiative.
Publisher Copyright:
© 2019, The Author(s).
PY - 2019/12/1
Y1 - 2019/12/1
N2 - A central question in the science of science concerns how to develop a quantitative understanding of the evolution and impact of individual careers. Over the course of history, a relatively small fraction of individuals have made disproportionate, profound, and lasting impacts on science and society. Despite a long-standing interest in the careers of scientific elites across diverse disciplines, it remains difficult to collect large-scale career histories that could serve as training sets for systematic empirical and theoretical studies. Here, by combining unstructured data collected from CVs, university websites, and Wikipedia, together with the publication and citation database from Microsoft Academic Graph (MAG), we reconstructed publication histories of nearly all Nobel prize winners from the past century, through both manual curation and algorithmic disambiguation procedures. Data validation shows that the collected dataset presents among the most comprehensive collection of publication records for Nobel laureates currently available. As our quantitative understanding of science deepens, this dataset is expected to have increasing value. It will not only allow us to quantitatively probe novel patterns of productivity, collaboration, and impact governing successful scientific careers, it may also help us unearth the fundamental principles underlying creativity and the genesis of scientific breakthroughs.
AB - A central question in the science of science concerns how to develop a quantitative understanding of the evolution and impact of individual careers. Over the course of history, a relatively small fraction of individuals have made disproportionate, profound, and lasting impacts on science and society. Despite a long-standing interest in the careers of scientific elites across diverse disciplines, it remains difficult to collect large-scale career histories that could serve as training sets for systematic empirical and theoretical studies. Here, by combining unstructured data collected from CVs, university websites, and Wikipedia, together with the publication and citation database from Microsoft Academic Graph (MAG), we reconstructed publication histories of nearly all Nobel prize winners from the past century, through both manual curation and algorithmic disambiguation procedures. Data validation shows that the collected dataset presents among the most comprehensive collection of publication records for Nobel laureates currently available. As our quantitative understanding of science deepens, this dataset is expected to have increasing value. It will not only allow us to quantitatively probe novel patterns of productivity, collaboration, and impact governing successful scientific careers, it may also help us unearth the fundamental principles underlying creativity and the genesis of scientific breakthroughs.
UR - http://www.scopus.com/inward/record.url?scp=85065021203&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065021203&partnerID=8YFLogxK
U2 - 10.1038/s41597-019-0033-6
DO - 10.1038/s41597-019-0033-6
M3 - Article
C2 - 31000709
AN - SCOPUS:85065021203
VL - 6
JO - Scientific data
JF - Scientific data
SN - 2052-4463
IS - 1
M1 - 33
ER -