TY - CHAP
T1 - Hidden treasures in contemporary RNA sequencing
AU - Mangul, Serghei
AU - Yang, Harry Taegyun
AU - Eskin, Eleazar
AU - Zaitlen, Noah
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
N2 - High-throughput RNA sequencing technologies have provided an unprecedented opportunity to explore the individual transcriptome. Unmapped reads, the reads failing to map to the human reference, are a large and often overlooked output of standard RNA-Seq analyses; the hidden treasure in contemporary RNA-Seq analysis is within the unmapped reads, illuminating previously unexplored biological insights. Here we develop Read Origin Protocol (ROP) to discover the source of all reads originating from complex RNA molecules, recombinant T and B cell receptors, and microbial communities. We applied ROP to 10,641 samples across 2630 individuals from 54 diverse adult human tissues. Our approach can account for 99.9% of 1 trillion reads of various read lengths. Using in-house RNA-Seq data, we show that immune profiles of asthmatic individuals are significantly different from the profiles of control individuals, with decreased average per-sample T and B cell receptor diversity. We also show that microbiomes can be detected in human blood via RNA-Sequencing and may elucidate important clinical changes in patients with schizophrenia. Furthermore, we demonstrate that receptor-derived reads among other hidden reads can be used to characterize the overall Ig repertoire across diverse human tissues using RNA-Sequencing. Our results demonstrate the potential of ROP to exploit the hidden treasure in contemporary RNA-Sequencing in order to better understand the functional mechanisms underlying connections between the immune system, microbiome, human gene expression, and disease etiology.
AB - High-throughput RNA sequencing technologies have provided an unprecedented opportunity to explore the individual transcriptome. Unmapped reads, the reads failing to map to the human reference, are a large and often overlooked output of standard RNA-Seq analyses; the hidden treasure in contemporary RNA-Seq analysis is within the unmapped reads, illuminating previously unexplored biological insights. Here we develop Read Origin Protocol (ROP) to discover the source of all reads originating from complex RNA molecules, recombinant T and B cell receptors, and microbial communities. We applied ROP to 10,641 samples across 2630 individuals from 54 diverse adult human tissues. Our approach can account for 99.9% of 1 trillion reads of various read lengths. Using in-house RNA-Seq data, we show that immune profiles of asthmatic individuals are significantly different from the profiles of control individuals, with decreased average per-sample T and B cell receptor diversity. We also show that microbiomes can be detected in human blood via RNA-Sequencing and may elucidate important clinical changes in patients with schizophrenia. Furthermore, we demonstrate that receptor-derived reads among other hidden reads can be used to characterize the overall Ig repertoire across diverse human tissues using RNA-Sequencing. Our results demonstrate the potential of ROP to exploit the hidden treasure in contemporary RNA-Sequencing in order to better understand the functional mechanisms underlying connections between the immune system, microbiome, human gene expression, and disease etiology.
KW - B and T cell receptor immune repertoires
KW - Human Microbiome
KW - Immune system
KW - RNA Sequencing
KW - RNA aligners
KW - Unmapped reads
UR - http://www.scopus.com/inward/record.url?scp=85062921429&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062921429&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-13973-5_1
DO - 10.1007/978-3-030-13973-5_1
M3 - Chapter
AN - SCOPUS:85062921429
T3 - SpringerBriefs in Computer Science
SP - 1
EP - 93
BT - SpringerBriefs in Computer Science
PB - Springer
ER -