Reporting valid and reliable overall scores and domain scores

Lihua Yao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

30 Scopus citations


In educational assessment, overall scores obtained by simply averaging a number of domain scores are sometimes reported. However, simple averaging ignores three facts: different domains have different score points, scores from those domains are related, and the relationship between overall score and domain score may differ at different score points. To report reliable and valid overall scores and domain scores, I investigated the performance of four methods using both real and simulated data: (a) the unidimensional IRT model; (b) the higher-order IRT model, which simultaneously estimates the overall ability and domain abilities; (c) the multidimensional IRT (MIRT) model, which estimates domain abilities and uses the maximum information method to obtain the overall ability; and (d) the bifactor general model. My findings suggest that the MIRT model not only provides reliable domain scores, but also produces reliable overall scores. The overall score from the MIRT maximum information method has the smallest standard error of measurement. In addition, unlike the other models, the MIRT model assumes no linear relationship between overall score and domain scores. Recommendations for sizes of correlations between domains and the number of items needed for reporting purposes are provided.
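
The first concern raised above, that domains have different score points, can be illustrated with a minimal sketch. It is not one of the four methods studied in the article; the domain names, raw scores, and maximum points below are hypothetical values chosen only for illustration, and the proportion-based average is simply one naive alternative used for contrast.

```python
# Hypothetical maximum score points per domain (assumed values, not from the article).
max_points = {"algebra": 20, "geometry": 10, "statistics": 30}

# Two hypothetical examinees with the same raw-score total but different profiles.
examinee_a = {"algebra": 18, "geometry": 6, "statistics": 9}
examinee_b = {"algebra": 9, "geometry": 6, "statistics": 18}


def simple_average(scores):
    """Average raw domain scores, ignoring each domain's maximum score points."""
    return sum(scores.values()) / len(scores)


def proportion_average(scores, maxima):
    """Average the proportion of available points earned in each domain."""
    return sum(scores[d] / maxima[d] for d in scores) / len(scores)


for name, scores in [("A", examinee_a), ("B", examinee_b)]:
    print(
        name,
        "simple average:", simple_average(scores),
        "proportion average:", round(proportion_average(scores, max_points), 3),
    )
```

Both examinees receive the same simple average, yet their proportion-based averages differ, because the simple average implicitly weights each domain by its number of score points. Neither quantity accounts for the correlation between domains or for measurement error, which is what the IRT-based methods compared in the article are meant to address.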

Original language: English (US)
Pages (from-to): 339-360
Number of pages: 22
Journal: Journal of Educational Measurement
Issue number: 3
State: Published - Sep 2010

ASJC Scopus subject areas

  • Education
  • Developmental and Educational Psychology
  • Applied Psychology
  • Psychology (miscellaneous)

