Strategies for imputing missing covariates in accelerated failure time models

Lihong Qi*, Ying Fang Wang, Rongqi Chen, Juned Siddique, John Robbins, Yulei He

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

3 Scopus citations


Missing covariates often occur in biomedical studies with survival outcomes. Multiple imputation via chained equations (MICE) is a semi-parametric and flexible approach that imputes multivariate data by a series of conditional models, one for each incomplete variable. When applying MICE, practitioners tend to specify the conditional models in a simple fashion largely dictated by the software, which could lead to suboptimal results. Practical guidelines for specifying appropriate conditional models in MICE are lacking. Motivated by a study of time to hip fractures in the Women's Health Initiative Observational Study using accelerated failure time models, we propose and experiment with some rationales leading to appropriate MICE specifications. This strategy starts with specifying a joint model for the variables involved. We first derive the conditional distribution of each variable under the joint model, then approximate these conditional distributions to the extent that they can be characterized by commonly used regression models. We propose to fit separate models to impute incomplete variables by the failure status, which is key to generating appropriate MICE specifications for survival outcomes. The proposed strategy can be conveniently implemented with all available imputation software that uses fully conditional specifications. Our simulation results show that some commonly used simple MICE specifications can produce suboptimal results, while those based on the proposed strategy appear to perform well and be robust to model misspecification. Hence, we warn against a mechanical use of MICE and suggest careful modeling of the conditional distributions of variables to ensure proper performance.
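As a rough illustration of the idea of fitting separate imputation models by failure status, the following Python sketch imputes one continuous covariate from the log of the observed time, with one linear regression per stratum (event or censored). It is a minimal sketch under stated assumptions and not the authors' specification: the function name `impute_by_failure_status`, the simulated log-normal data, and the choice of a normal linear model on log time are all hypothetical. It also performs a single stochastic draw and ignores parameter uncertainty, so a proper multiple imputation would repeat the step with parameter draws and pool the results.

```python
import numpy as np


def impute_by_failure_status(x, log_time, event, rng):
    """Stochastic regression imputation of x on log_time.

    A separate linear model is fitted within each failure-status
    stratum (event == 0 and event == 1). Hypothetical helper for
    illustration only.
    """
    x_imp = x.copy()
    for status in (0, 1):
        in_stratum = event == status
        obs = in_stratum & ~np.isnan(x)   # rows used to fit the model
        mis = in_stratum & np.isnan(x)    # rows to be imputed
        if mis.sum() == 0 or obs.sum() < 3:
            continue  # nothing to impute, or too few rows to fit
        # Design matrix: intercept plus log observed time.
        A = np.column_stack([np.ones(obs.sum()), log_time[obs]])
        beta, *_ = np.linalg.lstsq(A, x[obs], rcond=None)
        resid = x[obs] - A @ beta
        sigma = np.sqrt(resid @ resid / (obs.sum() - 2))
        # Predicted mean plus residual noise for the missing rows.
        A_mis = np.column_stack([np.ones(mis.sum()), log_time[mis]])
        x_imp[mis] = A_mis @ beta + rng.normal(0.0, sigma, mis.sum())
    return x_imp


# Simulated example (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n = 500
x_full = rng.normal(size=n)
log_t = 1.0 + 0.5 * x_full + rng.normal(scale=0.5, size=n)  # log-normal AFT model
log_c = rng.normal(loc=1.5, scale=0.5, size=n)              # log censoring time
event = (log_t <= log_c).astype(int)                        # 1 = failure observed
log_obs = np.minimum(log_t, log_c)

x = x_full.copy()
x[rng.random(n) < 0.3] = np.nan                             # about 30% missing
x_completed = impute_by_failure_status(x, log_obs, event, rng)
```

In a chained-equations setting, a step like this would be cycled over every incomplete variable, and the same stratified specification can usually be expressed in existing software by imputing within groups or by including the failure indicator and its interactions as predictors.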

Original language: English (US)
Pages (from-to): 3417-3436
Number of pages: 20
Journal: Statistics in Medicine
Issue number: 24
State: Published - Oct 30 2018


Keywords

  • Gibbs sampling
  • conditional modeling framework
  • general location model
  • interaction
  • log-normal distribution

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

