Towards Reliable Dermatology Evaluation Benchmarks

Fabian Gröger, Simone Lionetti, Philippe Gottfrois, Alvaro Gonzalez-Jimenez, Matthew Groh, Roxana Daneshjou, Labelling Consortium, Alexander A. Navarini, Marc Pouly

Research output: Contribution to journal › Conference article › peer-review


Abstract

Benchmark datasets for digital dermatology unwittingly contain inaccuracies that reduce trust in model performance estimates. We propose a resource-efficient data-cleaning protocol to identify issues that escaped previous curation. The protocol leverages an existing algorithmic cleaning strategy and is followed by a confirmation process terminated by an intuitive stopping criterion. Based on confirmation by multiple dermatologists, we remove irrelevant samples and near duplicates and estimate the percentage of label errors in six dermatology image datasets for model evaluation promoted by the International Skin Imaging Collaboration. Along with this paper, we publish revised file lists for each dataset, which should be used for model evaluation. Our work paves the way for more trustworthy performance assessment in digital dermatology.
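The abstract describes ranking samples with an algorithmic cleaning strategy and then confirming them with experts until an intuitive stopping criterion is reached. The sketch below is an illustrative outline of such a ranked-confirmation loop, not the authors' implementation; issue_score, is_issue, and patience are hypothetical names, and the stopping rule (a run of consecutive "clean" verdicts) is an assumption for demonstration only.

from typing import Callable, Sequence

def confirm_ranked(
    sample_ids: Sequence[str],            # dataset sample identifiers
    issue_score: Callable[[str], float],  # hypothetical: score from an automated cleaning method
    is_issue: Callable[[str], bool],      # hypothetical: expert confirmation (e.g., dermatologist review)
    patience: int = 10,                   # assumed criterion: stop after this many consecutive clean verdicts
) -> list[str]:
    """Review samples in order of decreasing issue score and return the confirmed problematic ones."""
    ranked = sorted(sample_ids, key=issue_score, reverse=True)
    confirmed: list[str] = []
    clean_streak = 0
    for sid in ranked:
        if is_issue(sid):
            confirmed.append(sid)
            clean_streak = 0
        else:
            clean_streak += 1
            if clean_streak >= patience:  # intuitive stopping criterion: issues have become rare
                break
    return confirmed

In practice, the expert confirmation step would be performed per issue type (irrelevant samples, near duplicates, label errors), and the confirmed lists would feed into the revised evaluation file lists mentioned above.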

Original language: English (US)
Pages (from-to): 101-128
Number of pages: 28
Journal: Proceedings of Machine Learning Research
Volume: 225
State: Published - 2023
Event: 3rd Machine Learning for Health Symposium, ML4H 2023 - New Orleans, United States
Duration: Dec 10 2023 → …

Funding

We want to thank Xu Cao, who supported us as a mentor during the final stages of paper writing, and Wenqian Ye for reviewing our paper. Additionally, we want to thank all the anonymous reviewers for their insightful and thoughtful comments.

Keywords

  • Benchmark datasets
  • Data-centric AI
  • Dataset cleaning
  • Dermatology

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

