TY - JOUR
T1 - A comparison of computer adaptive tests (CATs) and short forms in terms of accuracy and number of items administrated using PROMIS profile
AU - Segawa, Eisuke
AU - Schalet, Benjamin
AU - Cella, David
N1 - Funding Information:
This study was funded by National Institutes of Health (U2CCA186878, Recipient David Cella).
Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2020/1/1
Y1 - 2020/1/1
AB - Purpose: In the Patient-Reported Outcomes Measurement Information System (PROMIS), seven domains (Physical Function, Anxiety, Depression, Fatigue, Sleep Disturbance, Social Function, and Pain Interference) are packaged together as profiles. Each of these domains can also be assessed using computer adaptive tests (CATs) or short forms (SFs) of varying length (e.g., 4, 6, and 8 items). We compared the accuracy and number of items administrated of CAT versus each SF. Methods: PROMIS instruments are scored using item response theory (IRT) with graded response model and reported as T scores (mean = 50, SD = 10). We simulated 10,000 subjects from the normal distribution with mean 60 for symptom scales and 40 for function scales, and standard deviation 10 in each domain. We considered a subject’s score to be accurate when the standard error (SE) was less than 3.0. We recorded range of accurate scores (accurate range) and the number of items administrated. Results: The average number of items administrated in CAT was 4.7 across all domains. The accurate range was wider for CAT compared to all SFs in each domain. CAT was notably better at extending the accurate range into very poor health for Fatigue, Physical Function, and Pain Interference. Most SFs provided reasonably wide accurate range. Conclusions: Relative to SFs, CATs provided the widest accurate range, with slightly more items than SF4 and less than SF6 and SF8. Most SFs, especially longer ones, provided reasonably wide accurate range.
KW - Computer adaptive testing (CAT)
KW - Item response theory
KW - PROMIS
KW - Short form
UR - http://www.scopus.com/inward/record.url?scp=85074423270&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074423270&partnerID=8YFLogxK
U2 - 10.1007/s11136-019-02312-8
DO - 10.1007/s11136-019-02312-8
M3 - Article
C2 - 31595451
AN - SCOPUS:85074423270
VL - 29
SP - 213
EP - 221
JO - Quality of Life Research
JF - Quality of Life Research
SN - 0962-9343
IS - 1
ER -