Advancing Interpretable Regression Analysis for Binary Data: A Novel Distributed Algorithm Approach

Jiayi Tong, Lu Li, Jenna Marie Reps, Vitaly Lorman, Naimin Jing, Mackenzie Edmondson, Xiwei Lou, Ravi Jhaveri, Kelly J. Kelleher, Nathan M. Pajor, Christopher B. Forrest, Jiang Bian, Haitao Chu, Yong Chen*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Sparse data bias, where there is a lack of sufficient cases, is a common problem in data analysis, particularly when studying rare binary outcomes. Although a two-step meta-analysis approach may be used to lessen the bias by combining the summary statistics to increase the number of cases from multiple studies, this method does not completely eliminate bias in effect estimation. In this paper, we propose a one-shot distributed algorithm for estimating relative risk using a modified Poisson regression for binary data, named ODAP-B. We evaluate the performance of our method through both simulation studies and real-world case analyses of postacute sequelae of SARS-CoV-2 infection in children using data from 184 501 children across eight national academic medical centers. Compared with the meta-analysis method, our method provides closer estimates of the relative risk for all outcomes considered including syndromic and systemic outcomes. Our method is communication-efficient and privacy-preserving, requiring only aggregated data to obtain relatively unbiased effect estimates compared with two-step meta-analysis methods. Overall, ODAP-B is an effective distributed learning algorithm for Poisson regression to study rare binary outcomes. The method provides inference on adjusted relative risk with a robust variance estimator.
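
As a rough illustration of the single-site building block named in the abstract, the sketch below fits a modified Poisson regression (a log-link Poisson model applied to a binary outcome) and computes a robust sandwich variance. It is not the ODAP-B algorithm and it does not reproduce the paper's distributed steps. The function name `modified_poisson`, the Newton-Raphson solver, and the toy two-group data are all assumptions made for this example.

```python
import numpy as np

def modified_poisson(X, y, n_iter=50, tol=1e-10):
    """Hypothetical helper: log-link Poisson fit to a binary outcome.

    Assumes column 0 of X is an intercept and that y contains at least
    one event. Returns (coefficients, robust standard errors).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start at the crude overall risk

    # Newton-Raphson on the Poisson log-likelihood
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (X * mu[:, None])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    # Sandwich variance: bread = inverse information, meat = outer
    # product of per-observation scores. This corrects for the fact
    # that a binary outcome is not truly Poisson distributed.
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))

# Toy data (made up): 200 exposed with 30 events, 800 unexposed with 60.
exposure = np.r_[np.ones(200), np.zeros(800)]
X = np.column_stack([np.ones(1000), exposure])
y = np.r_[np.ones(30), np.zeros(170), np.ones(60), np.zeros(740)]

beta, se = modified_poisson(X, y)
print("relative risk:", np.exp(beta[1]))
print("robust SE of log relative risk:", se[1])
```

With a single binary covariate the model is saturated, so the fitted relative risk equals the ratio of the crude risks in the two groups (0.15 / 0.075 = 2 for the toy data). Exponentiated coefficients are read directly as relative risks, which is why this model is often preferred over logistic regression when an interpretable effect measure is wanted.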

Original language: English (US)
Pages (from-to): 5573-5582
Number of pages: 10
Journal: Statistics in Medicine
Volume: 43
Issue number: 29
State: Published - Dec 20 2024

Funding

This work was supported by Patient-Centered Outcomes Research Institute, ME-2018C3-14899, ME-2019C3-18315 and National Institutes of Health, U01TR003709, U24MH136069, RF1AG077820, 1R01LM014344, 1R01AG077820, R01LM012607, R01AI130460, R01AG073435, R56AG074604, R01LM013519, R01DK128237, R56AG069880, R21AI167418, R21EY034179. Funding: This study is part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, which seeks to understand, treat, and prevent the postacute sequelae of SARS-CoV-2 infection (PASC). For more information on RECOVER, visit https://recovercovid.org/ . This research was supported in part by the National Institutes of Health (NIH) Agreement OT2HL161847-01 as part of the Researching COVID to Enhance Recovery (RECOVER) program of research. This work was supported in part by National Institutes of Health (1R01LM014344, 1R01AG077820, R01LM012607, R01AI130460, R01AG073435, R56AG074604, R01LM013519, R56AG069880, U01TR003709, RF1AG077820, R21AI167418, R21EY034179). This work was supported partially through Patient-Centered Outcomes Research Institute (PCORI) Project Program Awards (ME-2019C3-18315 and ME-2018C3-14899). All statements in this report, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of the PCORI, its Board of Governors or Methodology Committee.

Keywords

  • binary data
  • distributed algorithm
  • modified Poisson regression
  • relative risk

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability
