Identifying hotspots in lung cancer data using association rule mining

Ankit Agrawal*, Alok Choudhary

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

26 Scopus citations

Abstract

We analyze lung cancer data from the SEER program with the aim of identifying hotspots using association rule mining techniques. A subset of 13 patient attributes from the SEER data was recently linked to the survival outcome using prediction models; this study uses that subset for segmentation. The goal is to identify characteristics of patient segments whose average survival is significantly higher or lower than the average survival across the entire dataset. Automated association rule mining produced hundreds of rules, from which many redundant rules were manually removed based on domain knowledge. The resulting rules conform to existing biomedical knowledge and provide interesting insights into lung cancer survival.
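The hotspot idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the mining algorithm or the significance criterion, so the sketch assumes a brute-force enumeration of attribute-value conjunctions and a simple lift threshold (segment mean divided by overall mean). The function name `find_hotspots`, its parameters, and the toy records are all hypothetical; the records are made-up values, not SEER data.

```python
from itertools import combinations
from statistics import mean

# Toy records: entirely made-up values for illustration, NOT SEER data.
RECORDS = [
    {"stage": "localized", "surgery": "yes", "survival_months": 60},
    {"stage": "localized", "surgery": "yes", "survival_months": 54},
    {"stage": "localized", "surgery": "no", "survival_months": 30},
    {"stage": "distant", "surgery": "no", "survival_months": 6},
    {"stage": "distant", "surgery": "no", "survival_months": 4},
    {"stage": "distant", "surgery": "yes", "survival_months": 14},
]


def find_hotspots(records, target, max_len=2, min_support=2, min_lift=1.5):
    """Return (overall_mean, segments) for segments that deviate strongly.

    A segment is a conjunction of attribute=value conditions. It is kept when
    it covers at least `min_support` records and its mean target value is at
    least `min_lift` times the overall mean (hot) or at most 1/min_lift of it
    (cold). The lift threshold is an assumed stand-in for "significantly
    higher/lower"; it is not a statistical test.
    """
    overall = mean(r[target] for r in records)
    attributes = sorted(k for k in records[0] if k != target)
    hotspots = []
    for size in range(1, max_len + 1):
        for attrs in combinations(attributes, size):
            # Group target values by the records' values on these attributes.
            groups = {}
            for r in records:
                key = tuple(r[a] for a in attrs)
                groups.setdefault(key, []).append(r[target])
            for values, outcomes in groups.items():
                if len(outcomes) < min_support:
                    continue  # too few records to support a rule
                segment_mean = mean(outcomes)
                lift = segment_mean / overall
                if lift >= min_lift or lift <= 1 / min_lift:
                    condition = dict(zip(attrs, values))
                    hotspots.append((condition, len(outcomes), segment_mean, lift))
    return overall, hotspots


overall_mean, segments = find_hotspots(RECORDS, "survival_months")
for condition, support, segment_mean, lift in segments:
    print(condition, support, round(segment_mean, 1), round(lift, 2))
```

Enumerating every conjunction is only workable for a small attribute set such as the 13 attributes mentioned above with a low `max_len`; a real analysis would more likely use a frequent-itemset miner and would still need the manual pruning of redundant rules that the abstract describes.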

Original language: English (US)
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011
Pages: 995-1002
Number of pages: 8
State: Published - 2011
Event: 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011 - Vancouver, BC, Canada
Duration: Dec 11 2011 → Dec 11 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011
Country/Territory: Canada
City: Vancouver, BC
Period: 12/11/11 → 12/11/11

Keywords

  • Association rule mining
  • Hotspots
  • Lung cancer

ASJC Scopus subject areas

  • General Engineering
