Grooming a CAT: customizing CAT administration rules to increase response efficiency in specific research and clinical settings

Michael A. Kallen*, Karon F. Cook, Dagmar Amtmann, Elizabeth Knowlton, Richard C. Gershon

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Purpose: To evaluate the degree to which applying alternative stopping rules would reduce response burden while maintaining score precision in the context of computer adaptive testing (CAT).

Data: Analyses were conducted on secondary data comprising CATs administered in a clinical setting at multiple time points (baseline and up to two follow-ups) to 417 study participants who had back pain (51.3%) and/or depression (47.0%). Participants' mean age was 51.3 years (SD = 17.2; range 18–86). Participants tended to be white (84.7%), relatively well educated (77% with at least some college), female (63.9%), and married or living in a committed relationship (57.4%). The unit of analysis was the individual assessment history (i.e., CAT item response history) from the parent study. Data were first aggregated across all individuals, domains, and time points into an omnibus dataset of assessment histories and then disaggregated by measure for domain-specific analyses. Finally, assessment histories within a "clinically relevant range" (score ≥ 1 SD from the mean in the direction of poorer health) were analyzed separately to explore score level-specific findings.

Method: Two different sets of CAT administration rules were compared. The original CAT (CATORIG) rules required that at least four and no more than 12 items be administered. If the score standard error (SE) reached a value < 3 points (T-score metric) before 12 items were administered, the CAT was stopped. We simulated applying alternative stopping rules (CATALT): we removed the requirement that a minimum of four items be administered and stopped a CAT if responses to the first two items were both associated with best health, if the SE was < 3, if the change in SE was < 0.1 (T-score metric), or if 12 items had been administered. We then compared score fidelity and response burden, defined as the number of items administered, between CATORIG and CATALT.

Results: CATORIG and CATALT scores varied little, especially within the clinically relevant range, and response burden was substantially lower under CATALT (e.g., a 41.2% savings in the omnibus dataset). Conclusions: Alternative stopping rules result in substantial reductions in response burden with minimal sacrifice in score precision.
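The stopping rules described in the Method section can be sketched as simple predicate functions. This is a minimal illustration, not the authors' implementation: it assumes each item response is recorded as a category code, that a hypothetical `best_health_code` marks the most favorable response category, and that the score SE (T-score metric) is tracked after each administered item. All names and signatures are illustrative.

```python
def stop_cat_orig(se_history, min_items=4, max_items=12, se_threshold=3.0):
    """CATORIG rules: administer at least 4 items; stop when the
    score SE drops below 3 T-score points or 12 items are reached."""
    n = len(se_history)
    if n < min_items:
        return False
    return se_history[-1] < se_threshold or n >= max_items


def stop_cat_alt(responses, se_history, best_health_code=1,
                 se_threshold=3.0, se_change_threshold=0.1, max_items=12):
    """CATALT rules: no minimum item count. Stop if
    (a) the first two responses are both at best health,
    (b) the SE is below 3 T-score points,
    (c) the SE improved by less than 0.1 on the last item, or
    (d) 12 items have been administered."""
    n = len(responses)
    # (a) first two responses both associated with best health
    if n >= 2 and responses[0] == best_health_code and responses[1] == best_health_code:
        return True
    # (b) SE already below the precision threshold
    if se_history and se_history[-1] < se_threshold:
        return True
    # (c) SE change between the last two items below 0.1
    if n >= 2 and (se_history[-2] - se_history[-1]) < se_change_threshold:
        return True
    # (d) hard cap on test length
    return n >= max_items
```

For example, a respondent endorsing the best-health category on the first two items stops immediately under CATALT (`stop_cat_alt([1, 1], [5.2, 4.9])` is true), whereas CATORIG would continue until at least four items had been administered; this is the source of the response-burden savings the Results section reports.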

Original language: English (US)
Pages (from-to): 2403-2413
Number of pages: 11
Journal: Quality of Life Research
Issue number: 9
State: Published - Sep 1 2018


Keywords

  • CAT stopping rules
  • Computer adaptive testing
  • Response burden

ASJC Scopus subject areas

  • Public Health, Environmental and Occupational Health
