Deep learning-aided decision support for diagnosis of skin disease across skin tones

Matthew Groh*, Omar Badri, Roxana Daneshjou, Arash Koochek, Caleb Harris, Luis R. Soenksen, P. Murali Doraiswamy, Rosalind Picard

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

Although advances in deep learning systems for image-based medical diagnosis demonstrate their potential to augment clinical decision-making, the effectiveness of physician–machine partnerships remains an open question, in part because physicians and algorithms are both susceptible to systematic errors, especially for diagnosis of underrepresented populations. Here we present results from a large-scale digital experiment involving board-certified dermatologists (n = 389) and primary-care physicians (n = 459) from 39 countries to evaluate the accuracy of diagnoses submitted by physicians in a store-and-forward teledermatology simulation. In this experiment, physicians were presented with 364 images spanning 46 skin diseases and asked to submit up to four differential diagnoses. Specialists and generalists achieved diagnostic accuracies of 38% and 19%, respectively, but both specialists and generalists were four percentage points less accurate for the diagnosis of images of dark skin as compared to light skin. Fair deep learning system decision support improved the diagnostic accuracy of both specialists and generalists by more than 33%, but exacerbated the gap in the diagnostic accuracy of generalists across skin tones. These results demonstrate that well-designed physician–machine partnerships can enhance the diagnostic accuracy of physicians, illustrating that success in improving overall diagnostic accuracy does not necessarily address bias.

Original language: English (US)
Pages (from-to): 573-583
Number of pages: 11
Journal: Nature Medicine
Volume: 30
Issue number: 2
State: Published - Feb 2024

Funding

We acknowledge Sermo for platform support to recruit physicians to participate in this experiment, Apollo Hospitals for forwarding invitations to their physicians, the participants for their time and care in participating in this experiment, MIT Media Lab member companies and the Harold Horowitz (1951) Student Research Fund for financial support, the All of Us research program and its participants, Bruke Wossenseged for excellent research assistance in the early phase of this research, T. Johnson and the Kellogg Research Support team for a replication review and D. Rand for comments on an early draft of this manuscript.

ASJC Scopus subject areas

  • General Biochemistry, Genetics and Molecular Biology
