Determining risk and predictors of head and neck cancer treatment-related lymphedema: A clinicopathologic and dosimetric data mining approach using interpretable machine learning and ensemble feature selection

P. Troy Teo*, Kevin Rogacki, Mahesh Gopalakrishnan, Indra J. Das, Mohamed E. Abazeed, Bharat B. Mittal, Michelle S Gentile

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Background and purpose: The ability to determine the risk and predictors of lymphedema is vital to improving quality of life for head and neck (HN) cancer patients. However, selecting robust features is challenging because radiotherapy (RT) data are multicollinear and high-dimensional. This study aims to overcome these challenges using an ensemble feature selection technique with machine learning (ML).

Materials and methods: Thirty organs-at-risk, including bilateral cervical lymph node levels, were contoured, and dose-volume data were extracted from 76 HN treatment plans. Clinicopathologic data were collected. Ensemble feature selection was used to reduce the number of features. The reduced features were used as input to ML and competing risk models: the ML models were evaluated for internal and external lymphedema prediction, and the risk models were used to estimate time to lymphedema event and risk stratification.

Results: Two ML models, XGBoost and random forest, exhibited robust prediction performance. They achieved average F1-scores and AUCs of 84 ± 3.3 % and 79 ± 11.9 % (external lymphedema), and 64 ± 12 % and 78 ± 7.9 % (internal lymphedema). Predictive ML and risk models identified common predictors, including bulky node involvement, high dose to various lymph node levels, and lymph nodes removed during surgery. At 180 days, removing 0–25, 26–50, and > 50 lymph nodes increased external lymphedema risk to 72.1 %, 95.6 %, and 57.7 % respectively (p = 0.01).

Conclusion: Our approach, which reduces the dimensionality of HN RT data, resulted in effective ML models for HN lymphedema prediction. Predictive dosimetric features emerged from both predictive and competing risk models. Consistency with clinicopathologic features from other studies supports our methodology.
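As an illustration only, the general idea of ensemble feature selection followed by a cross-validated classifier can be sketched with scikit-learn. This is not the authors' pipeline: the data are synthetic, the three selectors (mutual information, ANOVA F-score, random forest importance), the rank-averaging rule, the function name `ensemble_select`, and all parameter values are assumptions chosen for the sketch, and XGBoost is left out so that the example depends only on scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in (NOT the study data): 76 samples with many
# redundant features, mimicking a small, collinear dose-volume table.
X, y = make_classification(
    n_samples=76, n_features=60, n_informative=6, n_redundant=20,
    random_state=0,
)


def ensemble_select(X, y, k=10, seed=0):
    """Hypothetical helper: keep the k features with the best average
    rank across three different importance scores."""
    scores = [
        mutual_info_classif(X, y, random_state=seed),
        f_classif(X, y)[0],  # ANOVA F-statistic per feature
        RandomForestClassifier(n_estimators=100, random_state=seed)
        .fit(X, y)
        .feature_importances_,
    ]
    # Rank 0 is the most important feature for each selector.
    ranks = [np.argsort(np.argsort(-s)) for s in scores]
    mean_rank = np.mean(ranks, axis=0)
    return np.sort(np.argsort(mean_rank)[:k])


# Select features inside each training fold so that the held-out
# fold never influences which features are kept.
f1s, aucs = [], []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train, test in cv.split(X, y):
    keep = ensemble_select(X[train], y[train])
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train][:, keep], y[train])
    prob = clf.predict_proba(X[test][:, keep])[:, 1]
    f1s.append(f1_score(y[test], (prob >= 0.5).astype(int)))
    aucs.append(roc_auc_score(y[test], prob))

print(f"F1:  {np.mean(f1s):.2f} ± {np.std(f1s):.2f}")
print(f"AUC: {np.mean(aucs):.2f} ± {np.std(aucs):.2f}")
```

Averaging ranks across selectors is one simple way to damp the instability that any single selector shows on small, collinear data; the numbers printed here describe the synthetic data only and say nothing about the study's results.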

Original language: English (US)
Article number: 100747
Journal: Clinical and Translational Radiation Oncology
Volume: 46
State: Published - May 2024

Funding

P. Troy Teo received fellowship funding from the Canadian Institutes of Health Research (CIHR-472392). Mohamed Abazeed received funding from NIH R37CA222294, NIH P30CA060553, and the American Lung Association LCD-565365 for work outside the scope of the submitted manuscript. We gratefully acknowledge the consultation on R and Python/Scikit-learn programming provided by Colby Witherup, Diego Gomez-Zara, and Julianne Murphy from Northwestern IT Research Computing Service.

Keywords

  • Assessments, risk
  • Early onset lymphedema
  • Explainable AI
  • Head and neck cancer
  • Interpretable AI
  • Lymphedema
  • Machine learning
  • Oropharyngeal cancer
  • Radiation dose response relationship

ASJC Scopus subject areas

  • Oncology
  • Radiology, Nuclear Medicine and Imaging
