TY - GEN
T1 - FANGS
T2 - 25th Annual ACM Symposium on Applied Computing, SAC 2010
AU - Misra, Sanchit
AU - Narayanan, Ramanathan
AU - Lin, Simon
AU - Choudhary, Alok Nidhi
PY - 2010/7/23
Y1 - 2010/7/23
N2 - Next Generation Sequencing machines are generating millions of short DNA sequences (reads) every day. There is a need for efficient algorithms to map these sequences to the reference genome to identify SNPs or rare transcripts and to fulfill the dream of personalized medicine. We present a Fast Algorithm for Next Generation Sequencers (FANGS), which dynamically reduces the search space by using q-gram filtering and the pigeonhole principle to rapidly map 454-Roche reads onto a reference genome. FANGS is a sequential algorithm designed to find all the matches of a query sequence in the reference genome while tolerating a large number of mismatches or insertions/deletions. Using FANGS, we mapped 50000 reads with a total of 25 million nucleotides to the human genome in as little as 23.3 minutes on a typical desktop computer. Through our experiments, we found that FANGS is up to an order of magnitude faster than the state-of-the-art techniques for queries of length 500 allowing 5 mismatches or insertions/deletions.
AB - Next Generation Sequencing machines are generating millions of short DNA sequences (reads) every day. There is a need for efficient algorithms to map these sequences to the reference genome to identify SNPs or rare transcripts and to fulfill the dream of personalized medicine. We present a Fast Algorithm for Next Generation Sequencers (FANGS), which dynamically reduces the search space by using q-gram filtering and the pigeonhole principle to rapidly map 454-Roche reads onto a reference genome. FANGS is a sequential algorithm designed to find all the matches of a query sequence in the reference genome while tolerating a large number of mismatches or insertions/deletions. Using FANGS, we mapped 50000 reads with a total of 25 million nucleotides to the human genome in as little as 23.3 minutes on a typical desktop computer. Through our experiments, we found that FANGS is up to an order of magnitude faster than the state-of-the-art techniques for queries of length 500 allowing 5 mismatches or insertions/deletions.
KW - 454 sequencers
KW - next generation sequencers
KW - sequence mapping
UR - http://www.scopus.com/inward/record.url?scp=77954743459&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954743459&partnerID=8YFLogxK
U2 - 10.1145/1774088.1774419
DO - 10.1145/1774088.1774419
M3 - Conference contribution
AN - SCOPUS:77954743459
SN - 9781605586380
T3 - Proceedings of the ACM Symposium on Applied Computing
SP - 1539
EP - 1546
BT - APPLIED COMPUTING 2010 - The 25th Annual ACM Symposium on Applied Computing
Y2 - 22 March 2010 through 26 March 2010
ER -