TY - JOUR
T1 - Machine Learning and External Validation of the IDENTIFY Risk Calculator for Patients with Haematuria Referred to Secondary Care for Suspected Urinary Tract Cancer
AU - IDENTIFY Study group
AU - Khadhouri, Sinan
AU - Hramyka, Artsiom
AU - Gallagher, Kevin
AU - Light, Alexander
AU - Ippoliti, Simona
AU - Edison, Marie
AU - Alexander, Cameron
AU - Kulkarni, Meghana
AU - Zimmermann, Eleanor
AU - Nathan, Arjun
AU - Orecchia, Luca
AU - Banthia, Ravi
AU - Piazza, Pietro
AU - Mak, David
AU - Pyrgidis, Nikolaos
AU - Narayan, Prabhat
AU - Abad Lopez, Pablo
AU - Nawaz, Faisal
AU - Tran, Trung Thanh
AU - Claps, Francesco
AU - Hogan, Donnacha
AU - Gomez Rivas, Juan
AU - Alonso, Santiago
AU - Chibuzo, Ijeoma
AU - Gutierrez Hidalgo, Beatriz
AU - Whitburn, Jessica
AU - Teoh, Jeremy
AU - Marcq, Gautier
AU - Szostek, Alexandra
AU - Bondad, Jasper
AU - Sountoulides, Petros
AU - Kelsey, Tom
AU - Kasivisvanathan, Veeru
AU - Tijerina, Adan
AU - Simoes, Adrian
AU - Ali, Ahmed
AU - Nic an Riogh, Aisling
AU - Wong, Albert
AU - Kiciak, Alex
AU - Ridgway, Alexander
AU - Dhanasekaran, Ananda
AU - Cheong, Anderson
AU - Atayi, Andrew
AU - Ashpak, Ashna
AU - Teixeira, Bernardo
AU - Scornajenghi, Carlo Maria
AU - Patel, Hiten D.
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/12
Y1 - 2024/12
N2 - Background: The IDENTIFY study developed a model to predict urinary tract cancer using patient characteristics from a large multicentre, international cohort of patients referred with haematuria. In addition to calculating an individual's cancer risk, it proposes thresholds to stratify them into very-low-risk (<1%), low-risk (1–<5%), intermediate-risk (5–<20%), and high-risk (≥20%) groups. Objective: To externally validate the IDENTIFY haematuria risk calculator and compare traditional regression with machine learning algorithms. Design, setting, and participants: Prospective data were collected on patients referred to secondary care with new haematuria. Data were collected for patient variables included in the IDENTIFY risk calculator, cancer outcome, and TNM staging. Machine learning methods were used to evaluate whether better models than those developed with traditional regression methods existed. Outcome measurements and statistical analysis: The area under the receiver operating characteristic curve (AUC) for the detection of urinary tract cancer, calibration coefficient, calibration in the large (CITL), and Brier score were determined. Results and limitations: There were 3582 patients in the validation cohort. The development and validation cohorts were well matched. The AUC of the IDENTIFY risk calculator on the validation cohort was 0.78. This improved to 0.80 on a subanalysis of urothelial cancer prevalent countries alone, with a calibration slope of 1.04, CITL of 0.24, and Brier score of 0.14. The best machine learning model was Random Forest, which achieved an AUC of 0.76 on the validation cohort. There were no cancers stratified to the very-low-risk group in the validation cohort. Most cancers were stratified to the intermediate- and high-risk groups, with more aggressive cancers in higher-risk groups. Conclusions: The IDENTIFY risk calculator performed well at predicting cancer in patients referred with haematuria on external validation. This tool can be used by urologists to better counsel patients on their cancer risks, to prioritise diagnostic resources on appropriate patients, and to avoid unnecessary invasive procedures in those with a very low risk of cancer. Patient summary: We previously developed a calculator that predicts patients’ risk of cancer when they have blood in their urine, based on their personal characteristics. We have validated this risk calculator, by testing it on a separate group of patients to ensure that it works as expected. Most patients found to have cancer tended to be in the higher-risk groups and had more aggressive types of cancer with a higher risk. This tool can be used by clinicians to fast-track high-risk patients based on the calculator and investigate them more thoroughly.
KW - Bladder cancer
KW - Cancer risk
KW - Haematuria
KW - Prediction
KW - Predictive model
KW - Renal cancer
KW - Risk calculator
KW - Upper tract urothelial cancer
KW - Urinary tract cancer
KW - Validation
UR - http://www.scopus.com/inward/record.url?scp=85196633889&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196633889&partnerID=8YFLogxK
U2 - 10.1016/j.euf.2024.06.004
DO - 10.1016/j.euf.2024.06.004
M3 - Article
C2 - 38906722
AN - SCOPUS:85196633889
SN - 2405-4569
VL - 10
SP - 1034
EP - 1042
JO - European Urology Focus
JF - European Urology Focus
IS - 6
ER -