TY - JOUR
T1 - Data analysis and modeling pipelines for controlled networked social science experiments
AU - Cedeno-Mieles, Vanessa
AU - Hu, Zhihao
AU - Ren, Yihui
AU - Deng, Xinwei
AU - Contractor, Noshir
AU - Ekanayake, Saliya
AU - Epstein, Joshua M.
AU - Goode, Brian J.
AU - Korkmaz, Gizem
AU - Kuhlman, Chris J.
AU - Machi, Dustin
AU - Macy, Michael
AU - Marathe, Madhav V.
AU - Ramakrishnan, Naren
AU - Saraf, Parang
AU - Self, Nathan
N1 - Publisher Copyright:
© 2020 Cedeno-Mieles et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2020/11
Y1 - 2020/11
N2 - There is considerable interest in networked social science experiments for understanding human behavior at scale. Significant effort is required to perform data analytics on experimental outputs and for computational modeling of custom experiments. Moreover, experiments and modeling are often performed in a cycle, enabling iterative experimental refinement and data modeling to uncover interesting insights and to generate/refute hypotheses about social behaviors. The current practice for social analysts is to develop tailor-made computer programs and analytical scripts for experiments and modeling. This often leads to inefficiencies and duplication of effort. In this work, we propose a pipeline framework to take a significant step towards overcoming these challenges. Our contribution is to describe the design and implementation of a software system to automate many of the steps involved in analyzing social science experimental data, building models to capture the behavior of human subjects, and providing data to test hypotheses. The proposed pipeline framework uses formal models, formal algorithms, and theoretical models as the basis for its design and implementation. We propose a formal data model, such that if an experiment can be described in terms of this model, then our pipeline software can be used to analyze data efficiently. The merits of the proposed pipeline framework are elaborated through several case studies of networked social science experiments.
AB - There is considerable interest in networked social science experiments for understanding human behavior at scale. Significant effort is required to perform data analytics on experimental outputs and for computational modeling of custom experiments. Moreover, experiments and modeling are often performed in a cycle, enabling iterative experimental refinement and data modeling to uncover interesting insights and to generate/refute hypotheses about social behaviors. The current practice for social analysts is to develop tailor-made computer programs and analytical scripts for experiments and modeling. This often leads to inefficiencies and duplication of effort. In this work, we propose a pipeline framework to take a significant step towards overcoming these challenges. Our contribution is to describe the design and implementation of a software system to automate many of the steps involved in analyzing social science experimental data, building models to capture the behavior of human subjects, and providing data to test hypotheses. The proposed pipeline framework uses formal models, formal algorithms, and theoretical models as the basis for its design and implementation. We propose a formal data model, such that if an experiment can be described in terms of this model, then our pipeline software can be used to analyze data efficiently. The merits of the proposed pipeline framework are elaborated through several case studies of networked social science experiments.
UR - http://www.scopus.com/inward/record.url?scp=85096816984&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096816984&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0242453
DO - 10.1371/journal.pone.0242453
M3 - Article
C2 - 33232347
AN - SCOPUS:85096816984
SN - 1932-6203
VL - 15
JO - PLoS One
JF - PLoS One
IS - 11
M1 - e0242453
ER -