TY - JOUR
T1 - That Takes the BISCUIT
T2 - Predictive Accuracy and Parsimony of Four Statistical Learning Techniques in Personality Data, with Data Missingness Conditions
AU - Elleman, Lorien G.
AU - McDougald, Sarah K.
AU - Condon, David M.
AU - Revelle, William
N1 - Publisher Copyright: © 2020 Hogrefe Publishing GmbH. All rights reserved.
PY - 2020/11
Y1 - 2020/11
N2 - The predictive accuracy of personality-criterion regression models may be improved with statistical learning (SL) techniques. This study introduced a novel SL technique, BISCUIT (Best Items Scale that is Cross-validated, Unit-weighted, Informative, and Transparent). The predictive accuracy and parsimony of BISCUIT were compared with three established SL techniques (the lasso, elastic net, and random forest) and regression using two sets of scales, for five criteria, across five levels of data missingness. BISCUIT’s predictive accuracy was competitive with other SL techniques at higher levels of data missingness. BISCUIT most frequently produced the most parsimonious SL model. In terms of predictive accuracy, the elastic net and lasso dominated other techniques in the complete data condition and in conditions with up to 50% data missingness. Regression using 27 narrow traits was an intermediate choice for predictive accuracy. For most criteria and levels of data missingness, regression using the Big Five had the worst predictive accuracy. Overall, loss in predictive accuracy due to data missingness was modest, even at 90% data missingness. Findings suggest that personality researchers should consider incorporating planned data missingness and SL techniques into their designs and analyses.
KW - Big Five
KW - machine learning
KW - nuances
KW - personality
KW - statistical learning
UR - http://www.scopus.com/inward/record.url?scp=85081227322&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081227322&partnerID=8YFLogxK
U2 - 10.1027/1015-5759/a000590
DO - 10.1027/1015-5759/a000590
M3 - Article
AN - SCOPUS:85081227322
SN - 1015-5759
VL - 36
SP - 948
EP - 958
JO - European Journal of Psychological Assessment
JF - European Journal of Psychological Assessment
IS - 6
ER -