Supercomputing for the parallelization of whole genome analysis

Megan J. Puckelwartz, Lorenzo L. Pesce, Viswateja Nelakuditi, Lisa Dellefave-Castillo, Jessica R. Golbus, Sharlene M. Day, Thomas P. Cappola, Gerald W. Dorn, Ian T. Foster, Elizabeth M. McNally*

*Corresponding author for this work

Research output: Contribution to journal › Article

32 Scopus citations

Abstract

Motivation: The declining cost of generating DNA sequence is driving an increase in whole genome sequencing, especially of the human genome. Whole genome analysis requires the alignment and comparison of raw sequence data, and it becomes a computational bottleneck because of the limited ability to analyze multiple genomes simultaneously.

Results: We adapted a Cray XE6 supercomputer to achieve the parallelization required for concurrent analysis of multiple genomes. This approach not only markedly shortens computation time but also increases the usable sequence per genome. Relying on publicly available software, the Cray XE6 has the capacity to align and call variants on 240 whole genomes in ∼50 h. Multisample variant calling is also accelerated.
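As a rough reading of the throughput figure quoted in the abstract, the short Python sketch below converts "240 whole genomes in ∼50 h" into an amortized wall-clock cost per genome and a genomes-per-day rate. The function names are illustrative and do not come from the paper. The result is an amortized figure for a concurrent batch, not the time any single genome takes to align and call, and it inherits the approximation in the ∼50 h value.

```python
# Figures quoted in the abstract; the 50 h value is approximate ("~50 h").
GENOMES = 240
WALL_CLOCK_HOURS = 50.0


def amortized_minutes_per_genome(genomes: int, wall_clock_hours: float) -> float:
    """Wall-clock minutes per genome, averaged over a concurrently processed batch."""
    if genomes <= 0:
        raise ValueError("genomes must be positive")
    return wall_clock_hours * 60.0 / genomes


def genomes_per_day(genomes: int, wall_clock_hours: float) -> float:
    """Batch throughput expressed as genomes completed per 24 h of wall-clock time."""
    if wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    return genomes * 24.0 / wall_clock_hours


if __name__ == "__main__":
    minutes = amortized_minutes_per_genome(GENOMES, WALL_CLOCK_HOURS)
    rate = genomes_per_day(GENOMES, WALL_CLOCK_HOURS)
    print(f"Amortized cost: {minutes:.1f} min per genome")
    print(f"Throughput: {rate:.1f} genomes per day")
```

Under these figures the amortized cost works out to about 12.5 minutes per genome, or roughly 115 genomes per day.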

Original language: English (US)
Pages (from-to): 1508-1513
Number of pages: 6
Journal: Bioinformatics
Volume: 30
Issue number: 11
DOI: https://doi.org/10.1093/bioinformatics/btu071
State: Published - Jun 1 2014

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics


Cite this

Puckelwartz, M. J., Pesce, L. L., Nelakuditi, V., Dellefave-Castillo, L., Golbus, J. R., Day, S. M., Cappola, T. P., Dorn, G. W., Foster, I. T., & McNally, E. M. (2014). Supercomputing for the parallelization of whole genome analysis. Bioinformatics, 30(11), 1508-1513. https://doi.org/10.1093/bioinformatics/btu071