Abstract
Sparse data bias, where there is a lack of sufficient cases, is a common problem in data analysis, particularly when studying rare binary outcomes. Although a two-step meta-analysis approach may be used to lessen the bias by combining the summary statistics to increase the number of cases from multiple studies, this method does not completely eliminate bias in effect estimation. In this paper, we propose a one-shot distributed algorithm for estimating relative risk using a modified Poisson regression for binary data, named ODAP-B. We evaluate the performance of our method through both simulation studies and real-world case analyses of postacute sequelae of SARS-CoV-2 infection in children using data from 184 501 children across eight national academic medical centers. Compared with the meta-analysis method, our method provides closer estimates of the relative risk for all outcomes considered including syndromic and systemic outcomes. Our method is communication-efficient and privacy-preserving, requiring only aggregated data to obtain relatively unbiased effect estimates compared with two-step meta-analysis methods. Overall, ODAP-B is an effective distributed learning algorithm for Poisson regression to study rare binary outcomes. The method provides inference on adjusted relative risk with a robust variance estimator.
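As a rough illustration of the modified Poisson regression mentioned in the abstract, the sketch below fits a log-link Poisson model to a binary outcome and computes a robust (sandwich) variance on pooled data. It is not the ODAP-B algorithm itself, which works from aggregated site-level data. The function name `modified_poisson`, the simulated cohort, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def modified_poisson(X, y, n_iter=50, tol=1e-10):
    """Log-link Poisson fit to a binary outcome with sandwich standard errors.

    Assumes the first column of X is an intercept (illustrative convention).
    Returns (beta, robust_se); exp(beta[j]) is an adjusted relative risk.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start at the overall risk
    for _ in range(n_iter):                 # Newton-Raphson on the Poisson score
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (X * mu[:, None])
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    cov = bread @ meat @ bread              # robust (sandwich) covariance
    return beta, np.sqrt(np.diag(cov))

# Hypothetical simulated cohort: rare outcome, one binary exposure.
rng = np.random.default_rng(0)
n = 20000
exposed = rng.integers(0, 2, n)
risk = 0.02 * 2.0 ** exposed                # simulated true relative risk of 2
outcome = (rng.random(n) < risk).astype(float)

X = np.column_stack([np.ones(n), exposed])
beta, se = modified_poisson(X, outcome)
rr_hat = np.exp(beta[1])                    # estimated relative risk
ci = np.exp(beta[1] + np.array([-1.96, 1.96]) * se[1])
```

With a single binary exposure the model is saturated, so `rr_hat` reproduces the ratio of observed risks in the two groups; the sandwich variance is what corrects for the Poisson working likelihood being misspecified for binary data.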
Original language | English (US) |
---|---|
Pages (from-to) | 5573-5582 |
Number of pages | 10 |
Journal | Statistics in Medicine |
Volume | 43 |
Issue number | 29 |
State | Published - Dec 20 2024 |
Funding
This work was supported by Patient-Centered Outcomes Research Institute, ME-2018C3-14899, ME-2019C3-18315 and National Institutes of Health, U01TR003709, U24MH136069, RF1AG077820, 1R01LM014344, 1R01AG077820, R01LM012607, R01AI130460, R01AG073435, R56AG074604, R01LM013519, R01DK128237, R56AG069880, R21AI167418, R21EY034179. This study is part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, which seeks to understand, treat, and prevent the postacute sequelae of SARS-CoV-2 infection (PASC). For more information on RECOVER, visit https://recovercovid.org/. This research was supported in part by the National Institutes of Health (NIH) Agreement OT2HL161847-01 as part of the Researching COVID to Enhance Recovery (RECOVER) program of research. This work was supported in part by National Institutes of Health (1R01LM014344, 1R01AG077820, R01LM012607, R01AI130460, R01AG073435, R56AG074604, R01LM013519, R56AG069880, U01TR003709, RF1AG077820, R21AI167418, R21EY034179). This work was supported partially through Patient-Centered Outcomes Research Institute (PCORI) Project Program Awards (ME-2019C3-18315 and ME-2018C3-14899). All statements in this report, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of the PCORI, its Board of Governors or Methodology Committee.
Keywords
- binary data
- distributed algorithm
- modified Poisson regression
- relative risk
ASJC Scopus subject areas
- Epidemiology
- Statistics and Probability