Encoding visual attributes in capsules for explainable medical diagnoses

Rodney LaLonde*, Drew Torigian, Ulas Bagci

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Convolutional neural network based systems have largely failed to be adopted in many high-risk application areas, including healthcare, military, security, transportation, finance, and legal, due to their highly uninterpretable “black-box” nature. Towards solving this deficiency, we teach a novel multi-task capsule network to improve the explainability of predictions by embodying the same high-level language used by human experts. Our explainable capsule network, X-Caps, encodes high-level visual object attributes within the vectors of its capsules, then forms predictions based solely on these human-interpretable features. To encode attributes, X-Caps utilizes a new routing sigmoid function to independently route information from child capsules to parents. Further, to provide radiologists with an estimate of model confidence, we train our network on a distribution of expert labels, modeling inter-observer agreement and punishing over/under-confidence during training, supervised by human experts’ agreement. X-Caps simultaneously learns attribute and malignancy scores from a multi-center dataset of over 1000 CT scans of lung cancer screening patients. We demonstrate that a simple 2D capsule network can outperform a state-of-the-art deep dense dual-path 3D CNN at capturing visually interpretable high-level attributes and malignancy prediction, while providing malignancy prediction scores approaching those of non-explainable 3D CNNs. To the best of our knowledge, this is the first study to investigate capsule networks for making predictions based on radiologist-level interpretable attributes and their application to medical image diagnosis. Code is publicly available at https://github.com/lalonderodney/X-Caps.
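The idea of training on a distribution of expert labels, rather than a single hard label, can be illustrated with a minimal sketch (not the authors' implementation; function names and the example ratings below are hypothetical): the target becomes the empirical distribution of radiologist scores, so a prediction that is confidently peaked where the raters disagreed incurs a larger loss than one that mirrors their spread.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def soft_label_cross_entropy(logits, rater_scores, num_classes=5):
    """Cross-entropy against a distribution of expert labels.

    rater_scores: integer malignancy ratings (1..num_classes), one per
    radiologist. The target is their empirical distribution, so an
    over-confident prediction is punished when the raters disagreed.
    """
    counts = np.bincount(np.asarray(rater_scores) - 1, minlength=num_classes)
    target = counts / counts.sum()          # empirical label distribution
    probs = softmax(logits)
    return float(-(target * np.log(probs + 1e-12)).sum())

# Hypothetical example: four radiologists rate a nodule 3, 3, 4, 5
# on a 1-5 malignancy scale. A prediction spread like the raters'
# disagreement scores a lower loss than an over-confident one.
loss_spread = soft_label_cross_entropy(np.array([0., 0., 2., 1., 1.]), [3, 3, 4, 5])
loss_peaked = soft_label_cross_entropy(np.array([0., 0., 8., 0., 0.]), [3, 3, 4, 5])
```

Here `loss_spread` is smaller than `loss_peaked`, which is the "punishing over/under confidence" behavior the abstract describes, sketched with ordinary cross-entropy against soft targets.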

Original language: English (US)
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2020 - 23rd International Conference, Proceedings
Editors: Anne L. Martel, Purang Abolmaesumi, Danail Stoyanov, Diana Mateus, Maria A. Zuluaga, S. Kevin Zhou, Daniel Racoceanu, Leo Joskowicz
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 11
ISBN (Print): 9783030597092
State: Published - 2020
Externally published: Yes
Event: 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020 - Lima, Peru
Duration: Oct 4, 2020 – Oct 8, 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12261 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020


Keywords

  • Capsule networks
  • Explainable AI
  • Lung cancer

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

