Mixture model normalization for non-targeted gas chromatography/mass spectrometry metabolomics data

Anna C. Reisetter, Michael J. Muehlbauer, James R. Bain, Michael Nodzenski, Robert D. Stevens, Olga Ilkayeva, Boyd E. Metzger, Christopher B. Newgard, William L. Lowe, Denise M. Scholtens*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

31 Scopus citations


Background: Metabolomics offers a unique integrative perspective for health research, reflecting genetic and environmental contributions to disease-related phenotypes. Identifying robust associations in population-based or large-scale clinical studies demands large numbers of subjects and therefore sample batching for gas chromatography/mass spectrometry (GC/MS) non-targeted assays. When samples are run over weeks or months, technical noise due to batch and run order threatens data interpretability. Application of existing normalization methods to metabolomics is challenged by unsatisfied modeling assumptions and, notably, failure to address batch-specific truncation of low-abundance compounds.

Results: To curtail technical noise and make GC/MS metabolomics data amenable to analyses describing biologically relevant variability, we propose mixture model normalization (mixnorm), which accommodates truncated data and estimates per-metabolite batch and run-order effects using quality control samples. Mixnorm outperforms other approaches across many metrics, including improved correlation of non-targeted and targeted measurements and superior performance when metabolite detectability varies according to batch. For some metrics, particularly when truncation is less frequent for a metabolite, mean centering and median scaling demonstrate comparable performance to mixnorm.

Conclusions: When quality control samples are systematically included in batches, mixnorm is uniquely suited to normalizing non-targeted GC/MS metabolomics data due to explicit accommodation of batch effects, run order and varying thresholds of detectability. Especially in large-scale studies, normalization is crucial for drawing accurate conclusions from non-targeted GC/MS metabolomics data.
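To make the comparison with simpler methods concrete, the following is a minimal sketch of a batch-wise mean-centering baseline of the kind the abstract mentions. It is not mixnorm and is not taken from the paper. The function name `batch_mean_center`, the choice to center on quality control samples, the log-scale input, and the use of NaN for values below the detection threshold are all assumptions made for illustration.

```python
import numpy as np


def batch_mean_center(log_abundance, batch, is_qc):
    """Center one metabolite's values on each batch's QC mean (hypothetical helper).

    log_abundance: 1-D array of log-scale values; NaN marks a value that fell
        below the detection threshold (an assumed encoding of truncation).
    batch: 1-D array of batch labels, same length as log_abundance.
    is_qc: 1-D boolean array flagging quality control injections.
    """
    x = np.asarray(log_abundance, dtype=float)
    batch = np.asarray(batch)
    is_qc = np.asarray(is_qc, dtype=bool)

    out = np.full_like(x, np.nan)
    for b in np.unique(batch):
        in_batch = batch == b
        qc_vals = x[in_batch & is_qc]
        qc_vals = qc_vals[~np.isnan(qc_vals)]  # use detected QC values only
        if qc_vals.size == 0:
            continue  # no detected QC in this batch: leave its values as NaN
        out[in_batch] = x[in_batch] - qc_vals.mean()
    return out


# Two batches with a shift of about 10 log units between them.
values = [10.0, 12.0, np.nan, 20.0, 22.0, 21.0]
batches = [1, 1, 1, 2, 2, 2]
qc_flags = [True, True, False, True, False, False]
centered = batch_mean_center(values, batches, qc_flags)
```

This baseline removes a constant per-batch offset but simply skips truncated values and ignores run order within a batch. Those are the two gaps the abstract says mixnorm is designed to address.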

Original language: English (US)
Article number: 84
Journal: BMC Bioinformatics
Issue number: 1
State: Published - Feb 2 2017


Keywords

  • Batch effects
  • GC/MS
  • Gas chromatography/mass spectrometry
  • Metabolomics
  • Non-targeted
  • Normalization

ASJC Scopus subject areas

  • Applied Mathematics
  • Molecular Biology
  • Structural Biology
  • Biochemistry
  • Computer Science Applications

