Using machine learning and qualitative interviews to design a five-question survey module for women's agency

Seema Jayachandran*, Monica Biradavolu, Jan Cooper

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Open-ended interview questions elicit rich information about people's lives, but in large-scale surveys, social scientists often need to measure complex concepts using only a few close-ended questions. We propose a new method to design a short survey measure for such cases by combining mixed-methods data collection and machine learning. We identify the best survey questions based on how well they predict a benchmark measure of the concept derived from qualitative interviews. We apply the method to create a survey module and index for women's agency. We measure agency for 209 married women in Haryana, India, first, through a semi-structured interview and, second, through a large set of close-ended questions. We use qualitative coding methods to score each woman's agency based on the interview, which we use as a benchmark measure of agency. To determine the close-ended questions most predictive of the benchmark, we apply statistical algorithms that build on LASSO and random forest but constrain how many variables are selected for the model (five in our case). The resulting five-question index is as strongly correlated with the coded qualitative interview as is an index that uses all of the candidate questions. This approach of selecting survey questions based on their statistical correspondence to coded qualitative interviews could be used to design short survey modules for many other latent constructs.
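The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn is available, uses synthetic data in place of the survey responses and coded interview scores, and caps the model size by walking the LASSO path until a sixth variable would enter. The function name `select_k_questions`, the number of candidate questions (30), and the in-sample correlation check are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, lasso_path


def select_k_questions(X, y, k=5):
    """Return indices of at most k columns of X chosen along the LASSO path.

    Assumes no column of X is constant (otherwise standardization divides by zero).
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # put questions on a common scale
    yc = y - y.mean()                          # lasso_path fits no intercept
    _, coefs, _ = lasso_path(Xs, yc, n_alphas=200)  # penalties run strong to weak
    selected = np.array([], dtype=int)
    for j in range(coefs.shape[1]):
        active = np.flatnonzero(coefs[:, j])
        if active.size > k:
            break                              # next penalty would exceed the cap
        selected = active
    return selected


def fitted_correlation(X, y):
    """In-sample correlation between an OLS index built from X and the benchmark y."""
    index = LinearRegression().fit(X, y).predict(X)
    return float(np.corrcoef(index, y)[0, 1])


# Synthetic stand-in: 209 respondents, 30 hypothetical close-ended questions,
# and a benchmark score driven by five of them plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(209, 30))
true_idx = [2, 7, 11, 19, 23]
y = X[:, true_idx] @ np.array([3.0, 2.5, 2.0, 1.5, 1.0]) + rng.normal(scale=0.5, size=209)

selected = select_k_questions(X, y, k=5)
corr_short = fitted_correlation(X[:, selected], y)  # five-question index
corr_full = fitted_correlation(X, y)                # index using every question

print("selected questions:", selected.tolist())
print("correlation, short index:", round(corr_short, 3))
print("correlation, full index:", round(corr_full, 3))
```

Comparing `corr_short` with `corr_full` mirrors the abstract's comparison between the five-question index and an index built from all candidate questions. A real application would assess that correlation out of sample rather than in sample as done here.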

Original language: English (US)
Article number: 106076
Journal: World Development
State: Published - Jan 2023


Keywords

  • Women's empowerment
  • feature selection
  • psychometrics
  • survey design

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Development
  • Economics and Econometrics
  • Building and Construction
  • Sociology and Political Science

