Reliability, validity, and feasibility of the Zwisch scale for the assessment of intraoperative performance

Brian C. George*, Ezra Nathaniel Teitelbaum, Shari Lynn Meyerson, Mary C. Schuller, Debra DaRosa, Emil R. Petrusa, Lucia Catherine Petito, Jonathan Paul Fryer

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

199 Scopus citations


Purpose: Existing methods for evaluating resident operative performance interrupt the attending physician's workflow, are resource intensive, and are often completed well after the procedure in question has ended. These limitations lead to low faculty compliance and potentially significant recall bias. In this study, we deployed a smartphone-based system, the Procedural Autonomy and Supervisions System, to facilitate assessment of resident performance according to the Zwisch scale with minimal workflow disruption. We aimed to demonstrate that this is a reliable, valid, and feasible method of measuring resident operative autonomy.

Methods: Before implementation, general surgery residents and faculty underwent frame-of-reference training on the Zwisch scale. Immediately after any operation in which a resident participated, the system automatically sent a text message prompting the attending physician to rate the resident's level of operative autonomy according to the 4-level Zwisch scale. Of these procedures, 8 were videotaped and independently rated by 2 additional surgeons. The Zwisch ratings of the 3 raters were compared using an intraclass correlation coefficient. The videotaped procedures were also scored using 2 alternative operating room (OR) performance assessment instruments (Operative Performance Rating System and Ottawa Surgical Competency OR Evaluation), and correlations between the Zwisch ratings and the items of each instrument were calculated.
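As an illustration of the interrater comparison described above, the following is a minimal sketch of an intraclass correlation coefficient computed for a cases-by-raters table. The abstract does not state which ICC form was used; this sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater). The function name `icc_2_1` and the ratings below are hypothetical and are not the study's data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per rated case, one score per rater.
    ASSUMPTION: the ICC form is chosen for illustration only; the source
    abstract does not specify which form the authors used.
    """
    n = len(ratings)          # number of cases
    k = len(ratings[0])       # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# HYPOTHETICAL Zwisch levels (1-4) for 8 cases rated by 3 raters.
example_ratings = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 3],
    [2, 3, 2],
    [1, 1, 1],
    [3, 4, 4],
    [4, 4, 4],
]
print(f"ICC(2,1) for the hypothetical ratings: {icc_2_1(example_ratings):.2f}")
```

Values near 1 indicate that raters assign nearly identical levels to the same case; the confidence interval reported in the study would require an additional calculation that is not shown here.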

Results: Between December 2012 and June 2013, 27 faculty used the smartphone system to complete 1490 operative performance assessments on 31 residents. During this period, faculty completed evaluations for 92% of all operations performed with general surgery residents. The Zwisch scores were shown to correlate with postgraduate year (PGY) levels based on sequential pairwise chi-squared tests: PGY 1 vs PGY 2 (χ² = 106.9, df = 3, p < 0.001); PGY 2 vs PGY 3 (χ² = 22.2, df = 3, p < 0.001); and PGY 3 vs PGY 4 (χ² = 56.4, df = 3, p < 0.001). PGY 4 and PGY 5 scores were not significantly different (χ² = 4.5, df = 3, p = 0.21). For the 8 operations reviewed for interrater reliability, the intraclass correlation coefficient was 0.90 (95% CI: 0.72-0.98, p < 0.01). Procedural Autonomy and Supervisions System ratings correlated highly with both Operative Performance Rating System items (each r > 0.90, all p's < 0.01) and Ottawa Surgical Competency OR Evaluation items (each r > 0.86, all p's < 0.01).
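The pairwise comparisons above each have df = 3, which is consistent with a 2 × 4 contingency table (2 PGY levels by 4 Zwisch levels). The following is a minimal sketch of such a test under that assumption. The function names and the counts are hypothetical and are not the study's data; the p-value helper uses the closed-form chi-squared survival function that holds only for 3 degrees of freedom.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of PGY level and Zwisch level.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-squared variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# HYPOTHETICAL counts of ratings at each of the 4 Zwisch levels
# for two adjacent PGY levels (rows); not taken from the study.
hypothetical_counts = [
    [60, 50, 25, 5],
    [20, 45, 50, 25],
]
stat, df = chi_square_statistic(hypothetical_counts)
print(f"chi-squared = {stat:.1f}, df = {df}, p = {chi2_sf_df3(stat):.3g}")
```

Applied to the reported PGY 4 vs PGY 5 statistic, `chi2_sf_df3(4.5)` gives approximately 0.21, matching the p-value stated in the Results.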

Conclusions: The Zwisch scale can be used to make reliable and valid measurements of faculty guidance and resident autonomy. Our data also suggest that Zwisch ratings may be used to infer resident operative performance. Deployed on an automated smartphone-based system, the scale makes it feasible to record evaluations for most operations performed by residents. This information can be used to counsel individual residents, modify programmatic curricula, and potentially inform national training guidelines.

Original language: English (US)
Pages (from-to): e90-e96
Journal: Journal of Surgical Education
Issue number: 6
State: Published - Nov 1 2014


  • educational measurement
  • evaluation
  • graduate medical education
  • surgery
  • surgical education

ASJC Scopus subject areas

  • Surgery
  • Education

