Machine learning of the cardiac phenome and skin transcriptome to categorize heart disease in systemic sclerosis

Monique E. Hinchcliff, Tracy M. Frech, Tammara A. Wood, Chiang Ching Huang, Jungwha Lee, Kathleen Aren, John J. Ryan, Brent Wilson, Lauren Beussink-Nelson, Michael L. Whitfield, Rahul C. Deo*, Sanjiv J. Shah

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Background: Cardiac involvement is a leading cause of death in systemic sclerosis (SSc/scleroderma). The complexity of SSc cardiac manifestations is not fully captured by the current clinical SSc classification, which is based on the extent of skin involvement and specific autoantibodies. Therefore, we sought to develop a clinically relevant SSc cardiac disease classification to improve clinical care and increase understanding of SSc cardiac disease pathobiology. We hypothesized that machine learning could identify novel SSc cardiac disease subgroups, and that gene expression assessment of skin could provide insights into the molecular pathogenesis of these SSc pheno-groups.

Methods: We used unsupervised model-based clustering (phenomapping) of SSc patient echocardiographic and clinical data to identify clinically relevant SSc pheno-groups in a discovery cohort (n=316), and validated these findings in an external SSc validation cohort (n=67). Cox regression was used to evaluate survival differences among groups. Gene expression profiles from skin biopsies from a subset of SSc patients (n=68) and controls (n=18) were analyzed with weighted gene co-expression network analysis to identify gene modules that were associated with cardiac pheno-groups and echocardiographic parameters.

Results: Four SSc cardiac pheno-groups were identified with distinct profiles. Pheno-group #1 displayed a predominant cutaneous phenotype without cardiac involvement; pheno-group #2 had long-standing SSc with limited skin and cardiac involvement; pheno-group #3 had diffuse skin involvement, a high frequency of interstitial lung disease (88%), and significant right heart remodeling/dysfunction; and pheno-group #4 had prolonged SSc disease duration, limited skin involvement, and marked biventricular cardiac involvement. After multivariable adjustment, pheno-group #3 (hazard ratio [HR] 7.8, 95% confidence interval [CI] 1.5-33.0) and pheno-group #4 (HR 10.5, 95% CI 2.1-52.7) remained associated with mortality (P<0.05). Adding pheno-group classification to conventional survival models improved model fit (P<0.05 by likelihood ratio test for all models), a finding that was replicated in the validation cohort. Skin gene expression analysis identified 2 gene modules (representing fibrosis and skin integrity, respectively) that differed among the cardiac pheno-groups and were associated with specific echocardiographic parameters.

Conclusions: Machine learning of echocardiographic and skin gene expression data in SSc identifies clinically relevant subgroups with distinct cardiac phenotypes, survival, and associated molecular pathways in skin.
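As a reading aid for the hazard ratios reported above, the following is a minimal sketch of how a Cox-model hazard ratio relates to its confidence interval. It assumes the interval is a Wald interval, that is, symmetric around log(HR) with half-width 1.96 standard errors; the abstract does not state how the intervals were computed, so this is an assumption. The function names `se_from_ci` and `wald_ci` are hypothetical and only illustrate the arithmetic, using the values reported for pheno-group #4.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Back out the standard error of log(HR) from a reported CI.

    Assumes a Wald interval: log(upper) - log(lower) = 2 * z * SE.
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_ci(hazard_ratio, se_log_hr, z=Z_95):
    """Return (lower, upper) Wald bounds for a hazard ratio."""
    log_hr = math.log(hazard_ratio)
    return (math.exp(log_hr - z * se_log_hr),
            math.exp(log_hr + z * se_log_hr))


# Values reported in the abstract for pheno-group #4: HR 10.5, 95% CI 2.1-52.7
se = se_from_ci(2.1, 52.7)
lower, upper = wald_ci(10.5, se)
print(f"SE of log(HR) ~ {se:.2f}; reconstructed CI ~ {lower:.1f}-{upper:.1f}")
```

Under this assumption, the wide interval reflects a large standard error on the log scale, which is typical when a subgroup contributes few deaths.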

Original language: English (US)
Journal: Unknown Journal
State: Published - Nov 3 2017


Keywords

  • Echocardiography
  • Gene expression
  • Machine learning
  • Skin
  • Systemic sclerosis

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)
  • Immunology and Microbiology(all)
  • Neuroscience(all)
  • Pharmacology, Toxicology and Pharmaceutics(all)
