Whole-slide imaging of histologic sections captures tissue microenvironments and cytologic details in expansive high-resolution images. These images can be mined to extract quantitative features that describe tissues, yielding measurements for hundreds of millions of histologic objects. A central challenge in utilizing this data is enabling investigators to train and evaluate classification rules for identifying objects related to processes like angiogenesis or immune response. In this paper we describe HistomicsML, an interactive machine-learning system for digital pathology imaging datasets. This framework uses active learning to direct user feedback, making classifier training efficient and scalable in datasets containing 10⁸+ histologic objects. We demonstrate how this system can be used to phenotype microvascular structures in gliomas to predict survival, and to explore the molecular pathways associated with these phenotypes. Our approach enables researchers to unlock phenotypic information from digital pathology datasets to investigate prognostic image biomarkers and genotype-phenotype associations.