Abstract
Motivation: RNA molecules can undergo complex structural dynamics, especially during transcription, which influence their biological functions. Recently developed high-throughput chemical probing experiments that study RNA cotranscriptional folding generate nucleotide-resolution 'reactivities' for each length of a growing nascent RNA that reflect structural dynamics. However, the manual annotation and qualitative interpretation of reactivity across these large datasets can be nuanced, laborious, and difficult for new practitioners. We developed a quantitative and systematic approach to automatically detect RNA folding events from these datasets to reduce human bias/error, standardize event discovery and generate hypotheses about RNA folding trajectories for further analysis and experimental validation.

Results: Detection of Unknown Events with Tunable Thresholds (DUETT) identifies RNA structural transitions in cotranscriptional RNA chemical probing datasets. DUETT employs a feedback control-inspired method and a linear regression approach and relies on interpretable and independently tunable parameter thresholds to match qualitative user expectations with quantitatively identified folding events. We validate the approach by identifying known RNA structural transitions within the cotranscriptional folding pathways of the Escherichia coli signal recognition particle RNA and the Bacillus cereus crcB fluoride riboswitch. We identify previously overlooked features of these datasets such as heightened reactivity patterns in the signal recognition particle RNA about 12 nt lengths before base-pair rearrangement. We then apply a sensitivity analysis to identify tradeoffs when choosing parameter thresholds. Finally, we show that DUETT is tunable across a wide range of contexts, enabling flexible application to study broad classes of RNA folding mechanisms.
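The abstract names two detection ideas, a feedback control-inspired method and a linear regression approach, each gated by independently tunable thresholds. The sketch below is only an illustration of that general pattern and is not the DUETT implementation. The function names (`ols_slope`, `detect_events`), the event labels ("swing", "ramp"), the window size, and all threshold values are assumptions introduced here for the example.

```python
from statistics import mean


def ols_slope(ys):
    """Least-squares slope of ys regressed on the indices 0..len(ys)-1."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = mean(ys)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(ys))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def detect_events(trace, window=5, p_thresh=0.5, d_thresh=0.3, slope_thresh=0.1):
    """Flag candidate events in one nucleotide's reactivity trace.

    trace[i] is the reactivity observed at the i-th transcript length.
    All thresholds are hypothetical defaults chosen for illustration.
    Returns a list of (index, kind) tuples with kind "swing" or "ramp".
    """
    events = []
    for i in range(window, len(trace)):
        baseline = mean(trace[i - window:i])
        p_term = trace[i] - baseline      # proportional-like: deviation from recent baseline
        d_term = trace[i] - trace[i - 1]  # derivative-like: change since the previous length
        if abs(p_term) > p_thresh and abs(d_term) > d_thresh:
            events.append((i, "swing"))   # abrupt change
        elif abs(ols_slope(trace[i - window:i + 1])) > slope_thresh:
            events.append((i, "ramp"))    # sustained trend over the window
    return events


if __name__ == "__main__":
    # Synthetic trace: flat low reactivity followed by an abrupt jump.
    step_trace = [0.1] * 8 + [1.5] * 4
    print(detect_events(step_trace))
```

Because each threshold gates a separate term, one can be tightened without affecting the others, which is the kind of tradeoff a sensitivity analysis over thresholds would explore.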
Original language | English (US) |
---|---|
Pages (from-to) | 5103-5112 |
Number of pages | 10 |
Journal | Bioinformatics |
Volume | 35 |
Issue number | 24 |
State | Published - Dec 15 2019 |
Funding
This work was supported by a New Innovator Award through the NIGMS of the National Institutes of Health [1DP2GM110838 to J.B.L.]; Searle Funds at the Chicago Community Trust (to J.B.L.); the Center of Cancer Nanotechnology Excellence initiative of the NIH’s National Cancer Institute [U54 CA199091 to N.B.]; and Northwestern University’s Data Science Initiative Award (to N.B.). Support was also provided by the Northwestern University Graduate School Cluster in Biotechnology, Systems, and Synthetic Biology (to A.Y.X.); and the Tri-Institutional Training Program in Computational Biology and Medicine [NIH training grant T32GM083937 to A.M.Y.].
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics