Introduction: If not handled appropriately, missing data can produce biased estimates and, in turn, incorrect conclusions about treatment efficacy. This article aimed to demonstrate how ordinary use of generalized estimating equations (GEE) can be problematic when the assumption that data are missing completely at random (MCAR) is not met.

Methods: We tested whether the choice of analytic method changed the results when the MCAR assumption was violated. The example used data from a published randomized controlled trial examining whether varying the timing of a weight management intervention, in concert with smoking cessation, improved cessation rates among adult female smokers. Participants were 284 women with at least one report of smoking status during Visits 4-16. Smoking status was assessed at each visit via self-report and biologically verified using expired carbon monoxide.

Results: Although the GEE analysis found differences in smoking status between conditions, tests of the MCAR assumption showed that it did not hold for this dataset. Additional analyses using methods that do not require the MCAR assumption found no differences between conditions. Thus, GEE was not an appropriate choice for this analysis.

Discussion: GEE is an appropriate technique for analyzing dichotomous data when the MCAR assumption holds. When the missing data mechanism is not MCAR, weighted GEE or mixed-effects logistic regression is more appropriate.
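The contrast drawn in the Discussion can be illustrated with a small numerical sketch. The code below is not the trial's analysis and does not fit a GEE. It shows only the inverse probability weighting idea that underlies weighted GEE, for a single follow-up visit. All counts, the `prior` status variable, and the function names are hypothetical, chosen so that dropout depends on an earlier observed smoking status (so the data are not MCAR) but not on the unobserved outcome itself.

```python
from collections import defaultdict

# Hypothetical records: (status at an earlier visit, observed at follow-up?, abstinent at follow-up)
# Full (partly unobserved) cohort: 100 people, 50 truly abstinent at follow-up.
records = (
    [("abstinent", True, 1)] * 36      # earlier abstinent, observed, still abstinent
    + [("abstinent", True, 0)] * 9     # earlier abstinent, observed, relapsed
    + [("abstinent", False, None)] * 5 # earlier abstinent, missing at follow-up
    + [("smoking", True, 1)] * 4       # earlier smoking, observed, now abstinent
    + [("smoking", True, 0)] * 16      # earlier smoking, observed, still smoking
    + [("smoking", False, None)] * 30  # earlier smoking, missing at follow-up
)


def available_case_rate(rows):
    """Abstinence rate among observed participants only (valid only under MCAR)."""
    outcomes = [y for _, seen, y in rows if seen]
    return sum(outcomes) / len(outcomes)


def ipw_rate(rows):
    """Abstinence rate with each observed participant weighted by
    1 / P(observed | earlier status), estimated within each stratum."""
    n_total = defaultdict(int)
    n_seen = defaultdict(int)
    for prior, seen, _ in rows:
        n_total[prior] += 1
        n_seen[prior] += int(seen)

    numerator = 0.0
    denominator = 0.0
    for prior, seen, y in rows:
        if not seen:
            continue
        weight = n_total[prior] / n_seen[prior]  # inverse of the observation probability
        numerator += weight * y
        denominator += weight
    return numerator / denominator


naive = available_case_rate(records)
weighted = ipw_rate(records)
print(f"available-case rate: {naive:.3f}")
print(f"IPW-weighted rate:   {weighted:.3f}")
```

In this constructed example, continuing smokers drop out more often, so the available-case rate (about 0.615) overstates abstinence, while the weighted rate recovers the cohort value of 0.500. A real analysis would estimate the observation probabilities with a model and apply the weights within the GEE itself.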
ASJC Scopus subject areas
- Public Health, Environmental and Occupational Health