TY - JOUR
T1 - Machine learning for phone-based relationship estimation
T2 - The need to consider population heterogeneity
AU - Liu, Tony
AU - Nicholas, Jennifer
AU - Theilig, Max M.
AU - Guntuku, Sharath C.
AU - Kording, Konrad
AU - Mohr, David C.
AU - Ungar, Lyle
N1 - Publisher Copyright:
Copyright © 2019 held by the owner/author(s).
PY - 2019/12
Y1 - 2019/12
AB - Estimating the category and quality of interpersonal relationships from ubiquitous phone sensor data matters for studying mental well-being and social support. Prior work focused on using communication volume to estimate broad relationship categories, often with small samples. Here we contextualize communications by combining phone logs with demographic and location data to predict interpersonal relationship roles on a varied sample population using automated machine learning methods, producing better performance (F1 = 0.68) than using communication features alone (F1 = 0.62). We also explore the effect of age variation in the underlying training sample on interpersonal relationship prediction and find that models trained on younger subgroups, a common practice in the field because of student participation and recruitment, generalize poorly to the wider population. Our results not only illustrate the value of using data across demographics, communication patterns and semantic locations for relationship prediction, but also underscore the importance of considering population heterogeneity in phone-based personal sensing studies.
KW - Automated machine learning
KW - Population heterogeneity
KW - Semantic location-based features
KW - Social relationship prediction
UR - http://www.scopus.com/inward/record.url?scp=85089761391&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089761391&partnerID=8YFLogxK
U2 - 10.1145/3369820
DO - 10.1145/3369820
M3 - Article
C2 - 32490330
AN - SCOPUS:85089761391
SN - 2474-9567
VL - 3
JO - Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
JF - Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
IS - 4
M1 - 145
ER -