Safe Bets and Risky Propositions: Leveraging Rich Data to Understand Scientific Diversity, Impact, and Potential of Teams

Project: Research project

Project Details


Narrative (e.g., Charney, 2003) as well as quantitative accounts (Uzzi & Spiro, 2005) of the sources of high-impact scientific breakthroughs reveal that today’s most pressing scientific problems present a degree of complexity that necessitates scientific teamwork (Barabási, 2005; Wuchty, Jones, & Uzzi, 2007). Studies of scientific teamwork have typically focused on drivers of team effectiveness that operate after a team has assembled (e.g., National Research Council, 2015), thus overlooking the factors that affect scientists’ decisions to join teams in the first place. The few studies that do examine team assembly (including some of our own work) have focused on structural predictors (prior coauthorship or citation) rather than semantic predictors (content and expertise in prior work). In this project, we address this theoretical and empirical gap. Specifically, we investigate (1) the structural and semantic factors that predict scientific team assembly and (2) which of these assembly factors predict scientific team performance. Doing so enables us to address secondary but equally important questions about the performance of scientific teams, such as why some repeat collaborations are successful while others are not.

We use a multi-theoretical, multi-level approach that considers individual (gender, experience, expertise, etc.), relational (prior collaboration, citation, etc.), and team-interlock ecosystem factors. Methodologically, we apply a multi-method approach to understanding team assembly and performance that entails (1) topic modeling to reveal researchers’ expertise based on their publication text, (2) latent semantic analysis to reveal the backward and forward citation distance between publications, and (3) a novel hypergraph approach to understand the structure of the ecosystem of interlocking team membership and interlocking knowledge (Lungeanu et al., 2018). Our data sources are local, updated copies of the Web of Science and the XML version of the full text of articles in the ScienceDirect database, provided to us at Northwestern by the vendors for research purposes.
Effective start/end date: 8/15/19 to 7/31/22


  • National Science Foundation (SMA-1856090)

