TY - JOUR
T1 - PennDiff
T2 - Detecting differential alternative splicing and transcription by RNA sequencing
AU - Hu, Yu
AU - Lin, Jennie
AU - Hu, Jian
AU - Hu, Gang
AU - Wang, Kui
AU - Zhang, Hanrui
AU - Reilly, Muredach P.
AU - Li, Mingyao
N1 - Funding Information:
This research was supported by R01GM108600 and R01HL113147 to M.L., and R01HL113147 to M.P.R. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Publisher Copyright:
© The Author(s) 2018.
PY - 2018/7/15
Y1 - 2018/7/15
N2 - Motivation: Alternative splicing and alternative transcription are major mechanisms for generating transcriptome diversity. Differential alternative splicing and transcription (DAST), which describe different usage of transcript isoforms across different conditions, can complement differential expression in characterizing gene regulation. However, the analysis of DAST is challenging because only a small fraction of RNA-seq reads is informative for isoforms. Several methods have been developed to detect exon-based and gene-based DAST, but they suffer from power loss for genes with many isoforms. Results: We present PennDiff, a novel statistical method that makes use of information on gene structures and pre-estimated isoform relative abundances to detect DAST from RNA-seq data. PennDiff has several advantages. First, grouping exons avoids multiple testing for 'exons' originating from the same isoform(s). Second, it utilizes all available reads in exon-inclusion level estimation, which is different from methods that only use junction reads. Third, collapsing isoforms sharing the same alternative exons reduces the impact of isoform expression estimation uncertainty. PennDiff is able to detect DAST at both exon and gene levels, thus offering more flexibility than existing methods. Simulations and analysis of a real RNA-seq dataset indicate that PennDiff has a well-controlled type I error rate and is more powerful than existing methods including DEXSeq, rMATS, Cuffdiff, IUTA and SplicingCompass. As the popularity of RNA-seq continues to grow, we expect PennDiff to be useful for diverse transcriptomics studies.
AB - Motivation: Alternative splicing and alternative transcription are major mechanisms for generating transcriptome diversity. Differential alternative splicing and transcription (DAST), which describe different usage of transcript isoforms across different conditions, can complement differential expression in characterizing gene regulation. However, the analysis of DAST is challenging because only a small fraction of RNA-seq reads is informative for isoforms. Several methods have been developed to detect exon-based and gene-based DAST, but they suffer from power loss for genes with many isoforms. Results: We present PennDiff, a novel statistical method that makes use of information on gene structures and pre-estimated isoform relative abundances to detect DAST from RNA-seq data. PennDiff has several advantages. First, grouping exons avoids multiple testing for 'exons' originating from the same isoform(s). Second, it utilizes all available reads in exon-inclusion level estimation, which is different from methods that only use junction reads. Third, collapsing isoforms sharing the same alternative exons reduces the impact of isoform expression estimation uncertainty. PennDiff is able to detect DAST at both exon and gene levels, thus offering more flexibility than existing methods. Simulations and analysis of a real RNA-seq dataset indicate that PennDiff has a well-controlled type I error rate and is more powerful than existing methods including DEXSeq, rMATS, Cuffdiff, IUTA and SplicingCompass. As the popularity of RNA-seq continues to grow, we expect PennDiff to be useful for diverse transcriptomics studies.
UR - http://www.scopus.com/inward/record.url?scp=85045902435&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045902435&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty097
DO - 10.1093/bioinformatics/bty097
M3 - Article
C2 - 29474557
AN - SCOPUS:85045902435
SN - 1367-4803
VL - 34
SP - 2384
EP - 2391
JO - Bioinformatics
JF - Bioinformatics
IS - 14
ER -