TY - JOUR
T1 - Creating Individual Surgeon Performance Assessments in a Statewide Hospital Surgical Quality Improvement Collaborative
AU - Quinn, Christopher M.
AU - Bilimoria, Karl Y.
AU - Chung, Jeanette W.
AU - Ko, Clifford Y.
AU - Cohen, Mark E.
AU - Stulberg, Jonah J.
N1 - Funding Information:
Support for this study: Agency for Healthcare Research and Quality grant no. R01HS024516-01.
Publisher Copyright:
© 2018 American College of Surgeons
PY - 2018/9
Y1 - 2018/9
N2 - Background: Surgeon performance profiling is of great interest to surgeons, hospitals, health plans, and the public, yet efforts to date have been contested, with stakeholders at odds over the selection, reliability, and validity of metrics used. We sought to create surgeon-level comparative assessments within the Illinois Surgical Quality Improvement Collaborative. Study Design: American College of Surgeons NSQIP data were obtained for 51 Illinois hospitals covering a 30-month period from 2014 to 2016. Surgeon-level, risk-adjusted outcomes rates were estimated from 3-level crossed random effects logistic regression models and classified as low, as expected, or high for each of 7 postoperative outcomes. Model intra-class correlations and provider-specific reliability statistics were calculated. Results: A total of 123,141 cases were analyzed for 2,724 surgeons. Median provider case volume was 17 (interquartile range 4 to 54). Overall crude complication rates ranged from 0.62% to 7.14% across the 7 outcomes investigated. Surgeon-level variance estimates were low (intra-class correlation coefficients between 0.007 and 0.074). No performance outliers were detected for 3 of the outcomes measures, while a small number of outliers were identified for any morbidity (11 surgeons), surgical site infection (10 surgeons), death or serious morbidity (8 surgeons), and reoperation (1 surgeon). Among all physicians, median reliability was below 0.1 for each outcome. Conclusions: Few individual surgeon performance outliers could be detected in NSQIP clinical registry data for a statewide hospital collaborative over a 30-month period using postoperative patient outcomes. Low surgeon-specific case volumes and minimal variance between surgeons may limit the utility of American College of Surgeons NSQIP outcomes measures for individual profiling. Alternative metrics, such as process measures, patient experience, composite measures, or technical skill assessments should be explored for surgeon-level measurement.
UR - http://www.scopus.com/inward/record.url?scp=85049744622&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049744622&partnerID=8YFLogxK
U2 - 10.1016/j.jamcollsurg.2018.06.002
DO - 10.1016/j.jamcollsurg.2018.06.002
M3 - Article
C2 - 29940332
AN - SCOPUS:85049744622
SN - 1072-7515
VL - 227
SP - 303-312.e3
JO - Journal of the American College of Surgeons
JF - Journal of the American College of Surgeons
IS - 3
ER -