TY - JOUR

T1 - Bias Reduction in Quasi-Experiments With Little Selection Theory but Many Covariates

AU - Steiner, Peter M.

AU - Cook, Thomas D.

AU - Li, Wei

AU - Clark, M. H.

N1 - Publisher Copyright: © 2015 Taylor & Francis Group, LLC.

PY - 2015/10/2

Y1 - 2015/10/2

N2 - In observational studies, selection bias will be completely removed only if the selection mechanism is ignorable, namely, all confounders of treatment selection and potential outcomes are reliably measured. Ideally, well-grounded substantive theories about the selection process and outcome-generating model are used to generate the sample of covariates. However, covariate selection is more heuristic in actual practice. Using two empirical data sets in a simulation study, we investigate four research questions about bias reduction when the selection mechanism is not known but many covariates are measured: (1) How important is the conceptual heterogeneity of the covariate domains in the data set? (2) How important is the number of covariates assessing each domain? (3) What are the joint effects of this conceptual heterogeneity and of the number of covariates per domain? (4) What happens to bias reduction when the set of covariates is deliberately impoverished by removing the covariates most responsible for selection bias, thus ensuring a slightly smaller but still heterogeneous set of covariates? The results indicate: (1) increasingly more bias is reduced as the number of covariate domains and the number of covariates per domain increase, though the rate of bias reduction is diminishing in each case; (2) sampling covariates from multiple heterogeneous covariate domains is more important than choosing many measures from fewer domains; (3) the most heterogeneous set of covariate domains removes almost all of the selection bias when at least five covariates are assessed in each domain; and (4) omitting the most crucial covariates generally replicates the pattern of results due to the number of domains and the number of covariates per domain, but the amount of bias reduction is less than when all variables are included and will surely not satisfy all consumers of causal research.

KW - Observational study

KW - causal inference

KW - propensity score

KW - covariate selection

UR - http://www.scopus.com/inward/record.url?scp=84945137860&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84945137860&partnerID=8YFLogxK

U2 - 10.1080/19345747.2014.978058

DO - 10.1080/19345747.2014.978058

M3 - Article

AN - SCOPUS:84945137860

VL - 8

SP - 552

EP - 576

JO - Journal of Research on Educational Effectiveness

JF - Journal of Research on Educational Effectiveness

SN - 1934-5747

IS - 4

ER -