Model validation metrics provide a quantitative measure of the agreement between predictions and observations. They are useful for model selection when alternative models are being considered, and for assessing accuracy before and after model improvement. Because both computer simulations and physical experiments are subject to various sources of uncertainty, model validation must be based on stochastic characteristics. Currently, no unified validation metric is widely accepted. In this paper, we present a classification of validation metrics based on their key characteristics, along with a discussion of the desired features. Within the category of stochastic validation that accounts for uncertainty in both predictions and physical experiments, we examine four main types of metrics, namely classical hypothesis testing, Bayes factor, frequentist's metric, and area metric, to provide a better understanding of the pros and cons of each. Numerical examples illustrate the differences among these metrics and examine how sensitive they are to the amount of experimental data, the uncertainty from measurement error, and the uncertainty in unknown model parameters. The insight gained from this work provides useful guidelines for choosing an appropriate metric in model validation.