Mixture model normalization for non-targeted gas chromatography/mass spectrometry metabolomics data

Anna C. Reisetter, Michael J. Muehlbauer, James R. Bain, Michael Nodzenski, Robert D. Stevens, Olga Ilkayeva, Boyd E. Metzger, Christopher B. Newgard, William L. Lowe, Denise M. Scholtens*

*Corresponding author for this work

Research output: Contribution to journal, Article

17 Scopus citations

Abstract

Background: Metabolomics offers a unique integrative perspective for health research, reflecting genetic and environmental contributions to disease-related phenotypes. Identifying robust associations in population-based or large-scale clinical studies demands large numbers of subjects and therefore sample batching for gas chromatography/mass spectrometry (GC/MS) non-targeted assays. When samples are run over weeks or months, technical noise due to batch and run order threatens data interpretability. Application of existing normalization methods to metabolomics is challenged by unsatisfied modeling assumptions and, notably, failure to address batch-specific truncation of low-abundance compounds.

Results: To curtail technical noise and make GC/MS metabolomics data amenable to analyses describing biologically relevant variability, we propose mixture model normalization (mixnorm), which accommodates truncated data and estimates per-metabolite batch and run-order effects using quality control samples. Mixnorm outperforms other approaches across many metrics, including improved correlation of non-targeted and targeted measurements and superior performance when metabolite detectability varies according to batch. For some metrics, particularly when truncation is less frequent for a metabolite, mean centering and median scaling demonstrate performance comparable to mixnorm.

Conclusions: When quality control samples are systematically included in batches, mixnorm is uniquely suited to normalizing non-targeted GC/MS metabolomics data due to explicit accommodation of batch effects, run order and varying thresholds of detectability. Especially in large-scale studies, normalization is crucial for drawing accurate conclusions from non-targeted GC/MS metabolomics data.
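For orientation, the two baseline methods named in the Results, mean centering and median scaling, can be sketched for a single metabolite as below. This is a minimal illustration and not the mixnorm method or the authors' code: the function names are hypothetical, mean centering is assumed to act on log abundances, median scaling is assumed to rescale each batch so its median matches the overall median, and values below the detection threshold are assumed to be recorded as `None`. Both functions simply skip undetected values, which is the kind of batch-specific truncation that the abstract says existing methods fail to address.

```python
from collections import defaultdict
from statistics import mean, median


def _observed_by_batch(values, batches):
    """Group detected (non-None) values by batch label."""
    groups = defaultdict(list)
    for value, batch in zip(values, batches):
        if value is not None:
            groups[batch].append(value)
    return groups


def mean_center(log_values, batches):
    """Subtract each batch's mean from its detected log abundances.

    Undetected values (None) are ignored when computing the mean and
    are passed through unchanged.
    """
    groups = _observed_by_batch(log_values, batches)
    centers = {batch: mean(vals) for batch, vals in groups.items()}
    return [
        None if value is None else value - centers[batch]
        for value, batch in zip(log_values, batches)
    ]


def median_scale(raw_values, batches):
    """Rescale each batch so its median equals the overall median.

    Assumes strictly positive raw abundances; undetected values (None)
    are ignored and passed through unchanged.
    """
    groups = _observed_by_batch(raw_values, batches)
    overall = median([v for v in raw_values if v is not None])
    factors = {batch: overall / median(vals) for batch, vals in groups.items()}
    return [
        None if value is None else value * factors[batch]
        for value, batch in zip(raw_values, batches)
    ]


# Hypothetical data: one metabolite, two batches, one undetected sample.
batches = ["A", "A", "A", "B", "B", "B"]
centered = mean_center([10.0, 12.0, None, 20.0, 22.0, 24.0], batches)
scaled = median_scale([2.0, 4.0, None, 10.0, 20.0, 30.0], batches)
```

Because the batch summaries are computed only from detected values, a batch with a higher detection threshold yields an upward-biased mean or median, so the correction itself is distorted when truncation differs between batches.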

Original language: English (US)
Article number: 84
Journal: BMC bioinformatics
Volume: 18
Issue number: 1
DOI: https://doi.org/10.1186/s12859-017-1501-7
State: Published - Feb 2 2017

Keywords

  • Batch effects
  • GC/MS
  • Gas chromatography/mass spectrometry
  • Metabolomics
  • Non-targeted
  • Normalization

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Reisetter, A. C., Muehlbauer, M. J., Bain, J. R., Nodzenski, M., Stevens, R. D., Ilkayeva, O., Metzger, B. E., Newgard, C. B., Lowe, W. L., & Scholtens, D. M. (2017). Mixture model normalization for non-targeted gas chromatography/mass spectrometry metabolomics data. BMC bioinformatics, 18(1), [84]. https://doi.org/10.1186/s12859-017-1501-7