HapIso: An accurate method for the haplotype-specific isoforms reconstruction from long single-molecule reads

Serghei Mangul*, Harry Taegyun Yang, Farhad Hormozdiari, Elizabeth Tseng, Alex Zelikovsky, Eleazar Eskin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Sequencing of RNA provides the possibility to study an individual’s transcriptome landscape and determine allelic expression ratios. Single-molecule protocols generate multi-kilobase reads longer than most transcripts, allowing sequencing of complete haplotype isoforms. This allows partitioning the reads into two parental haplotypes. While the read length of the single-molecule protocols is long, the relatively high error rate limits the ability to accurately detect the genetic variants and assemble them into the haplotype-specific isoforms. In this paper, we present HapIso (Haplotype-specific Isoform Reconstruction), a method able to tolerate the relatively high error rate of the single-molecule platform and partition the isoform reads into the parental alleles. Phasing the reads according to the allele of origin allows our method to efficiently distinguish between the read errors and the true biological mutations. HapIso uses a k-means clustering algorithm aiming to group the reads into two meaningful clusters, maximizing the similarity of the reads within a cluster and minimizing the similarity of the reads from different clusters. Each cluster corresponds to a parental haplotype. We use family pedigree information to evaluate our approach. Experimental validation suggests that HapIso is able to tolerate the relatively high error rate and accurately partition the reads into the parental alleles of the isoform transcripts. Furthermore, our method is the first method able to reconstruct the haplotype-specific isoforms from long single-molecule reads.

Original language: English (US)
Title of host publication: Bioinformatics Research and Applications - 12th International Symposium, ISBRA 2016, Proceedings
Editors: Anu Bourgeois, Pavel Skums, Xiang Wan, Alex Zelikovsky
Publisher: Springer Verlag
Pages: 80-92
Number of pages: 13
ISBN (Print): 9783319387819
State: Published - 2016
Event: 12th International Symposium on Bioinformatics Research and Applications, ISBRA 2016 - Minsk, Belarus
Duration: Jun 5 2016 to Jun 8 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9683
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Symposium on Bioinformatics Research and Applications, ISBRA 2016
Country: Belarus
City: Minsk
Period: 6/5/16 to 6/8/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
