TY - JOUR
T1 - Graded response model fit, measurement invariance and (comparative) precision of the Dutch-Flemish PROMIS® Upper Extremity V2.0 item bank in patients with upper extremity disorders
AU - Lameijer, C. M.
AU - Van Bruggen, S. G.J.
AU - Haan, E. J.A.
AU - Van Deurzen, D. F.P.
AU - Van Der Elst, K.
AU - Stouten, V.
AU - Kaat, A. J.
AU - Roorda, L. D.
AU - Terwee, C. B.
N1 - Publisher Copyright: © 2020 The Author(s).
PY - 2020/3/16
Y1 - 2020/3/16
N2 - Background: The Dutch-Flemish PROMIS® Upper Extremity (DF-PROMIS-UE) v2.0 item bank was recently developed using Item Response Theory (IRT). It is unknown for this bank (1) whether it is legitimate to calculate IRT-based scores for short forms and Computerized Adaptive Tests (CATs), which requires that the items meet the assumptions of and fit the IRT model (Graded Response Model [GRM]); (2) whether it is legitimate to compare (sub)groups of patients using this measure, which requires measurement invariance; and (3) how precise the estimated scores are for patients with different levels of functioning and in comparison with legacy measures. The aims were to evaluate (1) the assumptions of and fit to the GRM, (2) measurement invariance and (3) (comparative) precision of the DF-PROMIS-UE v2.0. Methods: Cross-sectional data were collected in Dutch patients with upper extremity disorders. IRT assumptions (unidimensionality [bi-factor analysis], local independence [residual correlations], monotonicity [coefficient H]), GRM item fit, measurement invariance (absence of Differential Item Functioning [DIF] due to age, gender, center, duration, and location of complaints) and precision (standard error of IRT-based scores across levels of functioning) were assessed. To study measurement invariance for language (Dutch vs. English), additional US data were used. Legacy instruments were the Disabilities of the Arm, Shoulder and Hand (DASH), the QuickDASH and the Michigan Hand Questionnaire (MHQ). Results: In total, 521 Dutch patients (mean age ± SD = 51 ± 17 years, 49% female) and 246 US patients (mean age ± SD = 48 ± 14 years, 69% female) participated. The DF-PROMIS-UE v2.0 item bank was sufficiently unidimensional (Omega-H = 0.80, Explained Common Variance = 0.68), had negligible local dependence (four of 1035 residual correlations > 0.20), good monotonicity (H = 0.63) and good GRM fit (no misfitting items), and demonstrated sufficient measurement invariance. Precise estimates (standard error < 3.2) were obtained for most patients (7-item short form, 88.5%; standard CAT, 91.3%; and fixed 7-item CAT, 87.6%). The DASH displayed better reliability than the DF-PROMIS-UE short form and standard CAT; the QuickDASH displayed comparable reliability. The MHQ-ADL displayed better reliability than the DF-PROMIS-UE short form and standard CAT for T-scores between 28 and 50. For patients with low function, the DF-PROMIS-UE measures performed better. Conclusions: The DF-PROMIS-UE v2.0 item bank showed sufficient psychometric properties in Dutch patients with UE disorders.
KW - Dutch-Flemish PROMIS
KW - Item response theory
KW - Measurement invariance
KW - Reliability
KW - Upper extremity
UR - http://www.scopus.com/inward/record.url?scp=85082018968&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082018968&partnerID=8YFLogxK
U2 - 10.1186/s12891-020-3178-8
DO - 10.1186/s12891-020-3178-8
M3 - Article
C2 - 32178644
AN - SCOPUS:85082018968
SN - 1471-2474
VL - 21
JO - BMC Musculoskeletal Disorders
JF - BMC Musculoskeletal Disorders
IS - 1
M1 - 170
ER -