An Integer Programming approach to novel transcript reconstruction from paired-end RNA-Seq reads

Serghei Mangul*, Adrian Caciula, Sahar Al Seesi, Dumitru Brinza, Abdul Rouf Banday, Rahul Kanadia

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Scopus citations

Abstract

Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, has become the technology of choice for performing gene expression profiling. However, reconstruction of full-length novel transcripts from RNA-Seq data remains challenging due to the short read length delivered by most existing sequencing technologies. We propose a novel statistical genome-guided method called "Transcriptome Reconstruction using Integer Programming" (TRIP) that incorporates fragment length distribution into novel transcript reconstruction from paired-end RNA-Seq reads. TRIP creates a splice graph based on aligned RNA-Seq reads and enumerates all maximal paths corresponding to putative transcripts. The problem of selecting true transcripts is formulated as an integer program (IP) which minimizes the set of selected transcripts yielding a good statistical fit between the fragment length distribution (empirically determined during library preparation) and fragment lengths implied by mapped read pairs. Experimental results on both real and synthetic datasets show that TRIP is more accurate than methods ignoring fragment length distribution information.
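The pipeline outlined in the abstract (build a splice graph, enumerate maximal paths, then pick a small set of paths whose implied fragment lengths fit the expected distribution) can be illustrated with a toy sketch. This is not TRIP's actual integer program, which the abstract does not spell out. The exon coordinates, the graph, the function names, the "within k standard deviations" fit criterion, and the brute-force subset search standing in for an IP solver are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy data: exons as half-open genomic intervals [start, end).
EXONS = {"A": (0, 100), "B": (200, 300), "C": (400, 500)}
# Hypothetical splice graph: an edge means a splice junction between two exons.
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": []}


def maximal_paths(graph):
    """Enumerate all paths from exons with no predecessor to exons with no successor."""
    targets = {v for successors in graph.values() for v in successors}
    sources = [u for u in graph if u not in targets]
    paths = []

    def extend(path):
        successors = graph[path[-1]]
        if not successors:  # reached a sink: the path is maximal
            paths.append(tuple(path))
        for nxt in successors:
            extend(path + [nxt])

    for source in sources:
        extend([source])
    return paths


def implied_length(path, left, right, exons):
    """Fragment length implied by a read pair spanning genomic [left, right)
    if the fragment came from the transcript `path`; None if either end
    falls outside the transcript's exons."""
    intervals = [exons[name] for name in path]
    if not any(s <= left < e for s, e in intervals):
        return None
    if not any(s < right <= e for s, e in intervals):
        return None
    # Count only exonic bases between the two ends (introns are spliced out).
    return sum(max(0, min(e, right) - max(s, left)) for s, e in intervals)


def select_transcripts(paths, pairs, exons, mean, sd, k=2.0):
    """Smallest subset of candidate paths such that every explainable read
    pair has an implied fragment length within k*sd of the mean on at least
    one selected path. Brute force; a real tool would use an IP solver."""
    explains = {}
    for path in paths:
        fitted = set()
        for i, (left, right) in enumerate(pairs):
            length = implied_length(path, left, right, exons)
            if length is not None and abs(length - mean) <= k * sd:
                fitted.add(i)
        explains[path] = fitted
    coverable = set().union(*explains.values())
    # Try subsets in order of increasing size; the full set always succeeds.
    for size in range(len(paths) + 1):
        for subset in combinations(paths, size):
            covered = set().union(*(explains[p] for p in subset))
            if covered == coverable:
                return list(subset)


candidates = maximal_paths(GRAPH)
# Two read pairs that each fit only one candidate (assumed mean 200, sd 25).
chosen = select_transcripts(candidates, [(50, 450), (10, 490)], EXONS, 200, 25)
print(candidates, chosen)
```

In this toy setting, the same read pair implies a different fragment length on each candidate transcript, because a skipped exon shortens the spliced distance between the two reads. That difference is what lets fragment-length information discriminate between candidate paths.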

Original language: English (US)
Title of host publication: 2012 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2012
Pages: 369-376
Number of pages: 8
State: Published - 2012
Event: 2012 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2012 - Orlando, FL, United States
Duration: Oct 7 2012 → Oct 10 2012

Publication series

Name: 2012 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2012

Conference

Conference: 2012 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2012
Country/Territory: United States
City: Orlando, FL
Period: 10/7/12 → 10/10/12

Keywords

  • Algorithms

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Information Management
