TY - JOUR
T1 - Assessment Scores of a Mock Objective Structured Clinical Examination Administered to 99 Anesthesiology Residents at 8 Institutions
AU - Tanaka, Pedro
AU - Park, Yoon Soo
AU - Liu, Linda
AU - Varner, Chelsia
AU - Kumar, Amanda H.
AU - Sandhu, Charandip
AU - Yumul, Roya
AU - McCartney, Kate Tobin
AU - Spilka, Jared
AU - MacArio, Alex
N1 - Publisher Copyright:
Copyright © 2020 International Anesthesia Research Society.
PY - 2020/8/1
Y1 - 2020/8/1
N2 - BACKGROUND: Objective Structured Clinical Examinations (OSCEs) are used in a variety of high-stakes examinations. The primary goal of this study was to examine factors influencing the variability of assessment scores for mock OSCEs administered to senior anesthesiology residents. METHODS: Using the American Board of Anesthesiology (ABA) OSCE Content Outline as a blueprint, scenarios were developed for 4 of the ABA skill types: (1) informed consent, (2) treatment options, (3) interpretation of echocardiograms, and (4) application of ultrasonography. Eight residency programs administered these 4 OSCEs to CA3 residents during a 1-day formative session. A global score and checklist items were used for scoring by faculty raters. We used a statistical framework called generalizability theory, or G-theory, to estimate the sources of variation (or facets), and to estimate the reliability (ie, reproducibility) of the OSCE performance scores. Reliability provides a metric on the consistency or reproducibility of learner performance as measured through the assessment. RESULTS: Of the 115 total eligible senior residents, 99 participated in the OSCE because the other residents were unavailable. Overall, residents correctly performed 84% (standard deviation [SD] 16%, range 38%-100%) of the 36 total checklist items for the 4 OSCEs. On global scoring, the pass rate for the informed consent station was 71%, for treatment options was 97%, for interpretation of echocardiograms was 66%, and for application of ultrasound was 72%. The estimate of reliability expressing the reproducibility of examinee rankings equaled 0.56 (95% confidence interval [CI], 0.49-0.63), which is reasonable for normative assessments that aim to compare a resident's performance relative to other residents because over half of the observed variation in total scores is due to variation in examinee ability. Phi coefficient reliability of 0.42 (95% CI, 0.35-0.50) indicates that criterion-based judgments (eg, pass-fail status) cannot be made. Phi expresses the absolute consistency of a score and reflects how closely the assessment is likely to reproduce an examinee's final score. Overall, the greatest (14.6%) variance was due to the person by item by station interaction (3-way interaction) indicating that specific residents did well on some items but poorly on other items. The variance (11.2%) due to residency programs across case items was high suggesting moderate variability in performance from residents during the OSCEs among residency programs. CONCLUSIONS: Since many residency programs aim to develop their own mock OSCEs, this study provides evidence that it is possible for programs to create a meaningful mock OSCE experience that is statistically reliable for separating resident performance.
AB - BACKGROUND: Objective Structured Clinical Examinations (OSCEs) are used in a variety of high-stakes examinations. The primary goal of this study was to examine factors influencing the variability of assessment scores for mock OSCEs administered to senior anesthesiology residents. METHODS: Using the American Board of Anesthesiology (ABA) OSCE Content Outline as a blueprint, scenarios were developed for 4 of the ABA skill types: (1) informed consent, (2) treatment options, (3) interpretation of echocardiograms, and (4) application of ultrasonography. Eight residency programs administered these 4 OSCEs to CA3 residents during a 1-day formative session. A global score and checklist items were used for scoring by faculty raters. We used a statistical framework called generalizability theory, or G-theory, to estimate the sources of variation (or facets), and to estimate the reliability (ie, reproducibility) of the OSCE performance scores. Reliability provides a metric on the consistency or reproducibility of learner performance as measured through the assessment. RESULTS: Of the 115 total eligible senior residents, 99 participated in the OSCE because the other residents were unavailable. Overall, residents correctly performed 84% (standard deviation [SD] 16%, range 38%-100%) of the 36 total checklist items for the 4 OSCEs. On global scoring, the pass rate for the informed consent station was 71%, for treatment options was 97%, for interpretation of echocardiograms was 66%, and for application of ultrasound was 72%. The estimate of reliability expressing the reproducibility of examinee rankings equaled 0.56 (95% confidence interval [CI], 0.49-0.63), which is reasonable for normative assessments that aim to compare a resident's performance relative to other residents because over half of the observed variation in total scores is due to variation in examinee ability. Phi coefficient reliability of 0.42 (95% CI, 0.35-0.50) indicates that criterion-based judgments (eg, pass-fail status) cannot be made. Phi expresses the absolute consistency of a score and reflects how closely the assessment is likely to reproduce an examinee's final score. Overall, the greatest (14.6%) variance was due to the person by item by station interaction (3-way interaction) indicating that specific residents did well on some items but poorly on other items. The variance (11.2%) due to residency programs across case items was high suggesting moderate variability in performance from residents during the OSCEs among residency programs. CONCLUSIONS: Since many residency programs aim to develop their own mock OSCEs, this study provides evidence that it is possible for programs to create a meaningful mock OSCE experience that is statistically reliable for separating resident performance.
UR - http://www.scopus.com/inward/record.url?scp=85088178755&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85088178755&partnerID=8YFLogxK
U2 - 10.1213/ANE.0000000000004705
DO - 10.1213/ANE.0000000000004705
M3 - Article
C2 - 32149757
AN - SCOPUS:85088178755
SN - 0003-2999
VL - 131
SP - 613
EP - 621
JO - Anesthesia and analgesia
JF - Anesthesia and analgesia
IS - 2
ER -