Abstract
Background: High-throughput RNA sequencing (RNA-Seq) can generate whole-transcriptome information at the single-transcript level, providing a powerful tool with multiple interrelated applications, including transcriptome reconstruction and quantification. The sequences of novel transcripts can be reconstructed from deep RNA-Seq data, but this is computationally challenging due to sequencing errors, uneven coverage of expressed transcripts, and the need to distinguish between highly similar transcripts produced by alternative splicing. Another challenge in transcriptomic analysis comes from ambiguities in mapping reads to transcripts.

Results: We present MaLTA, a method for simultaneous transcriptome assembly and quantification from Ion Torrent RNA-Seq data. Our approach explores transcriptome structure and incorporates a maximum likelihood model into the assembly and quantification procedure. A new version of the IsoEM algorithm suitable for Ion Torrent RNA-Seq reads is used to accurately estimate transcript expression levels. The MaLTA-IsoEM tool is publicly available at: http://alan.cs.gsu.edu/NGS/?q=malta

Conclusions: Experimental results on both synthetic and real datasets show that Ion Torrent RNA-Seq data can be successfully used for transcriptome analyses. The results suggest that the MaLTA-IsoEM solution achieves higher transcriptome assembly and quantification accuracy than existing state-of-the-art approaches.
Original language | English (US) |
---|---|
Article number | S7 |
Journal | BMC Genomics |
Volume | 15 |
State | Published - 2014 |
Externally published | Yes |
Funding
D.B. is a member of the Ion Bioinformatics group at Life Technologies Corporation. S.M., A.C., S.A.S., I.M. and A.Z. were supported in part by Agriculture and Food Research Initiative Competitive Grant no. 201167016-30331 from the USDA National Institute of Food and Agriculture and by Life Technology Grants "Novel transcript reconstruction from Ion Torrent sequencing" and "Viral Metagenome Reconstruction Software for Ion Torrent PGM Sequencer". The authors recognize the presence of potential conflicts of interest and affirm that the results reported in this paper represent original and unbiased observations. S.M., A.C. and A.Z. were supported in part by NSF award IIS-0916401. I.M. was supported in part by NSF award IIS-0916948. S.M. is supported by National Science Foundation grants 0513612, 0731455, 0729049, 0916676, 1065276, 1302448 and 1320589, and National Institutes of Health grants K25-HL080079, U01-DA024417, P01-HL30568, P01-HL28481, R01-GM083198, R01-MH101782 and R01-ES022282. S.M. was supported in part by an Institute for Quantitative & Computational Biosciences Fellowship, UCLA, and a Second Century Initiative Bioinformatics University Doctoral Fellowship, Georgia State University. A.C. was supported in part by a Molecular Basis of Disease Fellowship, Georgia State University.
ASJC Scopus subject areas
- Biotechnology
- Genetics