Quality, quantity and harmony: The DataSHaPER approach to integrating data across bioclinical studies

Isabel Fortier*, Paul R. Burton, Paula J. Robson, Vincent Ferretti, Julian Little, Francois L'Heureux, Mylène Deschênes, Bartha M. Knoppers, Dany Doiron, Joost C. Keers, Pamela Linksted, Jennifer R. Harris, Geneviève Lachance, Catherine Boileau, Nancy L. Pedersen, Carol M. Hamilton, Kristian Hveem, Marilyn J. Borugian, Richard P. Gallagher, John McLaughlin, Louise Parker, John D. Potter, John Gallacher, Rudolf Kaaks, Bette Liu, Tim Sprosen, Anne Vilain, Susan A. Atkinson, Andrea Rengifo, Robin Morton, Andres Metspalu, H. Erich Wichmann, Mark Tremblay, Rex L. Chisholm, Andrés Garcia-Montero, Hans Hillege, Jan Eric Litton, Lyle J. Palmer, Markus Perola, Bruce H.R. Wolffenbuttel, Leena Peltonen, Thomas J. Hudson

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

151 Scopus citations

Abstract

Background: Vast sample sizes are often essential in the quest to disentangle the complex interplay of the genetic, lifestyle, environmental and social factors that determine the aetiology and progression of chronic diseases. The pooling of information between studies is therefore of central importance to contemporary bioscience. However, there are many technical, ethico-legal and scientific challenges to be overcome if an effective, valid, pooled analysis is to be achieved. Perhaps most critically, any data that are to be analysed in this way must be adequately 'harmonized'. This implies that the collection and recording of information and data must be done in a manner that is sufficiently similar in the different studies to allow valid synthesis to take place.

Methods: This conceptual article describes the origins, purpose and scientific foundations of the DataSHaPER (DataSchema and Harmonization Platform for Epidemiological Research; http://www.datashaper.org), which has been created by a multidisciplinary consortium of experts that was pulled together and coordinated by three international organizations: P3G (Public Population Project in Genomics), PHOEBE (Promoting Harmonization of Epidemiological Biobanks in Europe) and CPT (Canadian Partnership for Tomorrow Project).

Results: The DataSHaPER provides a flexible, structured approach to the harmonization and pooling of information between studies. Its two primary components, the 'DataSchema' and 'Harmonization Platforms', together support the preparation of effective data-collection protocols and provide a central reference to facilitate harmonization. The DataSHaPER supports both 'prospective' and 'retrospective' harmonization.

Conclusion: It is hoped that this article will encourage readers to investigate the project further: the more the research groups and studies are actively involved, the more effective the DataSHaPER programme will ultimately be.
Published by Oxford University Press on behalf of the International Epidemiological Association

Original language: English (US)
Pages (from-to): 1383-1393
Number of pages: 11
Journal: International Journal of Epidemiology
Volume: 39
Issue number: 5
State: Published - Oct 2010

Funding

Genome Canada and Genome Quebec (The Public Population Project in Genomics); Canadian Partnership Against Cancer (CPT); European FP6 (LSHG-CT-2006-518418 to Promoting Harmonization of Epidemiological Biobanks in Europe); Medical Research Council Project Grant (G0601625; methods programme in genetic epidemiology at the University of Leicester that focuses on genetic statistics and large-scale data harmonization and pooling); Wellcome Trust Supplementary Grant (086160/Z/08/A); Leverhulme Trust Research Fellowship (RF/9/RFG/2009/0062); National Institute for Health Research (Leicester Biomedical Research Unit in Cardiovascular Science); German Federal Ministry of Education and Research (BMBF) in the context of the German National Genome Research Network (NGFN-2 and NGFN-plus) (to E.W.); German Federal Ministry of Education and Research (BMBF) (Model attempt for networking in German research consortia: development of a common concept for biobanks); European Framework 7 (Biobanking and Biomolecular Resources Research Infrastructure); J.L. is a Canada Research Chair in Human Genome Epidemiology.

Keywords

  • Data pooling
  • Data quality
  • Data synthesis
  • DataSHaPER
  • Harmonization
  • Meta-analysis
  • Prospective harmonization
  • Retrospective harmonization

ASJC Scopus subject areas

  • Epidemiology
