To facilitate the identification of contralateral breast cancer events in large cohort studies, we proposed and implemented a new method based on two kinds of features: features extracted from the narrative text of progress notes, and the number of pathology reports generated for each breast side. The method collects medical concepts and their combinations from progress notes to detect contralateral events, and adds the counts of pathology reports for the left and right sides as additional features. We trained a support vector machine on the derived features to detect contralateral events. In cross-validation and held-out tests, the area under the curve (AUC) was 0.93 and 0.89, respectively. Because feature generation is simple, the method can be readily replicated.
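The feature generation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept lexicon `CONCEPTS`, the function names, and the `both_sides` indicator are hypothetical choices made for illustration, and the abstract does not specify how concepts were extracted or combined. The sketch only shows the general idea of combining note-level concept features with per-side pathology report counts.

```python
import re
from itertools import combinations

# Hypothetical concept lexicon; the actual concept list is not given in the abstract.
CONCEPTS = {
    "left": r"\bleft\b",
    "right": r"\bright\b",
    "contralateral": r"\bcontralateral\b",
    "new_primary": r"\bnew primary\b",
    "mastectomy": r"\bmastectomy\b",
}


def note_features(notes):
    """Binary features: each concept, and each pair of concepts co-occurring in one note."""
    feats = {name: 0 for name in CONCEPTS}
    for a, b in combinations(sorted(CONCEPTS), 2):
        feats[f"{a}+{b}"] = 0
    for note in notes:
        text = note.lower()
        present = sorted(n for n, pat in CONCEPTS.items() if re.search(pat, text))
        for name in present:
            feats[name] = 1
        for a, b in combinations(present, 2):
            feats[f"{a}+{b}"] = 1
    return feats


def pathology_features(report_sides):
    """Counts of pathology reports per side; `both_sides` is an assumed extra indicator."""
    left = sum(1 for side in report_sides if side == "left")
    right = sum(1 for side in report_sides if side == "right")
    return {
        "n_path_left": left,
        "n_path_right": right,
        "both_sides": int(left > 0 and right > 0),
    }


def patient_features(notes, report_sides):
    """Merge note-derived and pathology-derived features for one patient."""
    feats = note_features(notes)
    feats.update(pathology_features(report_sides))
    return feats
```

A feature dictionary built this way for each patient could then be vectorized and passed to a support vector machine classifier, for example scikit-learn's `SVC`, with AUC estimated by cross-validation and on a held-out set.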
|Original language||English (US)|
|Number of pages||8|
|Journal||AMIA ... Annual Symposium proceedings. AMIA Symposium|
|State||Published - 2017|