Although discrete choice analysis has proven useful for modeling consumer preferences and choice behaviors in engineering design, information on choice set composition is often unavailable in most collected consumer purchase data. When a large set of choice alternatives exists for a product, such as automotive vehicles, randomly choosing a small set of alternatives to form the choice set for each individual consumer leads to misleading choice modeling results. In this work, we propose a data-analytics approach that mines existing choice set data and predicts the choice set for each individual customer in a new choice modeling scenario where choice set information is lacking. The proposed approach integrates product association analysis, network analysis, consumer segmentation, and predictive analytics. Using the J.D. Power vehicle survey as the existing choice set data, we demonstrate that the association network approach can recognize and expressively summarize meaningful product relations in choice sets. Our method accounts for consumer heterogeneity through a stochastic generation algorithm in which the probability of selecting an alternative into a choice set combines information on customer profile clusters and the frequencies with which products are chosen. By comparing multinomial logit models estimated with different choice set compositions, we show that choice model estimates are sensitive to choice set composition and that our proposed method leads to improved modeling results. Our method also provides insights into market segmentation that can guide engineering design decisions.
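To make the stochastic generation idea concrete, the following is a minimal Python sketch, not the algorithm used in this work. It assumes each customer has already been assigned to a profile cluster, and it uses how often a product appears in the observed choice sets of that cluster as the sampling weight. The function names (`fit_cluster_frequencies`, `generate_choice_set`), the product labels, and the weighting rule are all hypothetical choices made for illustration; the association network component is omitted.

```python
import random
from collections import Counter, defaultdict


def fit_cluster_frequencies(observed):
    """Count how often each product appears in choice sets, per cluster.

    `observed` is an iterable of (cluster_id, choice_set) pairs taken from
    existing data in which choice sets are known (assumed input format).
    """
    freq = defaultdict(Counter)
    for cluster_id, choice_set in observed:
        freq[cluster_id].update(choice_set)
    return freq


def generate_choice_set(cluster_id, chosen, freq, size, rng):
    """Sample a choice set for one customer without replacement.

    The purchased product `chosen` is always included. The remaining slots
    are filled by drawing products with probability proportional to their
    frequency in the customer's cluster (an assumed weighting rule).
    """
    counts = freq[cluster_id]
    candidates = [p for p in counts if p != chosen]
    weights = [counts[p] for p in candidates]
    result = [chosen]
    while len(result) < size and candidates:
        # Weighted draw of one index, then remove it so it cannot repeat.
        idx = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        result.append(candidates.pop(idx))
        weights.pop(idx)
    return result


# Hypothetical observed choice sets: (cluster id, products considered).
observed = [
    (0, ["sedan_a", "sedan_b", "sedan_c"]),
    (0, ["sedan_a", "sedan_b"]),
    (1, ["truck_a", "truck_b"]),
    (1, ["truck_a", "truck_c", "truck_b"]),
]

freq = fit_cluster_frequencies(observed)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
choice_set = generate_choice_set(0, "sedan_a", freq, size=3, rng=rng)
print(choice_set)
```

Sampling without replacement keeps each alternative unique within a choice set, and conditioning the weights on the cluster is one simple way to reflect consumer heterogeneity. If a cluster has fewer products than the requested size, the sketch returns a smaller set rather than padding it.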