A divide-and-conquer strategy to solve the out-of-memory problem of processing thousands of Affymetrix microarrays

Chia-Hu Lee, Dong Fu, Pan Du, Hongmei Jiang, Simon M Lin, Warren Kibbe

Research output: Contribution to journal › Article

Abstract

Out-of-memory problems were frequently encountered when processing thousands of CEL files using Bioconductor. We propose a divide-and-conquer strategy combined with randomised resampling to solve this problem. The CAMDA 2007 META-analysis data set, which contains 5896 CEL files, was used to test the approach on a typical commodity computer cluster by running established pre-processing algorithms for Affymetrix arrays implemented in Bioconductor. The results were validated against a gold standard obtained using a supercomputer. In addition to improving performance, the general divide-and-conquer strategy can be applied to any other normalisation algorithm without modifying the underlying implementation.
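
The divide-and-conquer idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it covers only quantile normalisation rather than a full Bioconductor pre-processing pipeline. The function names (`chunk_indices`, `reference_distribution`, `normalise_chunk`), the `load_chunk` callback, and the synthetic data are all hypothetical, and a random partition of the arrays stands in for the randomised resampling step. The point is that only one chunk of arrays needs to be in memory at a time.

```python
import numpy as np

def chunk_indices(n_arrays, chunk_size, rng):
    """Randomly partition array indices into chunks of at most chunk_size."""
    order = rng.permutation(n_arrays)
    return [order[i:i + chunk_size] for i in range(0, n_arrays, chunk_size)]

def reference_distribution(load_chunk, chunks, n_probes):
    """Pass 1: accumulate the mean of sorted intensities, one chunk at a time."""
    total = np.zeros(n_probes)
    count = 0
    for idx in chunks:
        block = load_chunk(idx)  # shape (n_probes, len(idx)); assumed loader
        total += np.sort(block, axis=0).sum(axis=1)
        count += block.shape[1]
    return total / count

def normalise_chunk(block, reference):
    """Pass 2: replace each value by the reference value at the same rank."""
    ranks = np.argsort(np.argsort(block, axis=0), axis=0)
    return reference[ranks]

# Synthetic stand-in for probe intensities (assumption: 1000 probes, 23 arrays).
rng = np.random.default_rng(0)
data = rng.lognormal(size=(1000, 23))

def load_chunk(idx):
    # In practice this would read only the requested CEL files from disk.
    return data[:, idx]

chunks = chunk_indices(data.shape[1], chunk_size=5, rng=rng)
reference = reference_distribution(load_chunk, chunks, data.shape[0])

normalised = np.empty_like(data)
for idx in chunks:
    normalised[:, idx] = normalise_chunk(load_chunk(idx), reference)
```

For quantile normalisation without ties, the chunked reference distribution matches the one computed from all arrays at once up to floating-point rounding, so this sketch loses nothing by splitting the data. Other pre-processing steps may only be approximated when run chunk by chunk, which is why validation against a full-data run matters.
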
Original language: English (US)
Pages (from-to): 396-405
Number of pages: 10
Journal: International Journal of Computational Biology and Drug Design
Volume: 1
Issue number: 4
State: Published - 2008
