EMBEDR: Distinguishing signal from noise in single-cell omics data

Eric M. Johnson, William Kath, Madhav Mani*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Single-cell “omics” measurements are often high dimensional, so dimensionality reduction (DR) algorithms are necessary for data visualization and analysis. The lack of methods for separating signal from noise in DR outputs has limited their utility in generating data-driven discoveries from single-cell data. In this work we present EMBEDR, which assesses the output of any DR algorithm to distinguish evidence of structure from algorithm-induced noise. We apply EMBEDR to DR-generated representations of single-cell omics data of several modalities to identify where they show real, rather than spurious, structure. EMBEDR generates a “p” value for each sample, allowing for direct comparisons of DR algorithms and facilitating optimization of algorithm hyperparameters. We show that the scale of a sample's neighborhood can thus be determined and used to generate a novel “cell-wise optimal” embedding. EMBEDR is available as a Python package for immediate use.
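The abstract does not spell out the statistic or the null model behind the per-sample “p” value, and the sketch below does not use the EMBEDR package or reproduce its method. It only illustrates the general idea of a per-sample empirical p-value for an embedding, under assumptions made up for this example: PCA stands in for the DR algorithm, the quality score is the fraction of a sample's nearest neighbors lost in the embedding, and the null data are made by shuffling each feature independently. All function names are hypothetical.

```python
import numpy as np

def pca_embed(X, n_components=2):
    """Stand-in DR algorithm (assumption): project onto top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def knn_indices(Z, k):
    """Indices of the k nearest neighbors of each row, excluding the row itself."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def neighborhood_loss(X, Y, k=10):
    """Hypothetical per-sample score: fraction of high-dimensional neighbors
    that are no longer neighbors in the embedding (lower is better)."""
    nx, ny = knn_indices(X, k), knn_indices(Y, k)
    return np.array([1.0 - len(set(a) & set(b)) / k for a, b in zip(nx, ny)])

def empirical_p_values(X, embed=pca_embed, k=10, n_null=5, seed=0):
    """Compare each sample's score with scores from embedded null data."""
    rng = np.random.default_rng(seed)
    observed = neighborhood_loss(X, embed(X), k)
    null_scores = []
    for _ in range(n_null):
        # Assumed null model: shuffle every feature independently, which keeps
        # each marginal distribution but destroys the joint structure.
        X_null = np.column_stack([rng.permutation(col) for col in X.T])
        null_scores.append(neighborhood_loss(X_null, embed(X_null), k))
    null_scores = np.sort(np.concatenate(null_scores))
    # Number of null scores at least as good as (<=) each observed score.
    n_as_good = np.searchsorted(null_scores, observed, side="right")
    return (1.0 + n_as_good) / (1.0 + null_scores.size)

# Toy data: a 2-D latent structure mixed into 8 features plus small noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(60, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.01 * rng.normal(size=(60, 8))
p = empirical_p_values(X, k=5)
```

Here `p` holds one value per sample; small values mark samples whose neighborhoods are preserved better than in the shuffled null data. Swapping `embed` for another DR function would allow the kind of algorithm-to-algorithm comparison the abstract mentions, though the score and null used here are illustrative choices only.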

Original language: English (US)
Article number: 100443
Journal: Patterns
Volume: 3
Issue number: 3
State: Published - Mar 11 2022

Keywords

  • ATAC-seq
  • DSML 1: Concept: Basic principles of a new data science output observed and reported
  • UMAP
  • cell-type identification
  • clustering
  • data visualization
  • dimensionality reduction
  • quality assessment
  • single-cell RNA sequencing
  • single-cell analysis
  • t-SNE

ASJC Scopus subject areas

  • Decision Sciences (all)
