Community bioinformatic platform for insertion sequencing

  • Mandel, Mark (PD/PI)

Project: Research project

Project Details

Description

Transposon insertion sequencing has emerged as a high-resolution method to query bacterial gene function across diverse conditions. Insertion sequencing (abbreviated INSeq in this proposal) also encompasses the related techniques of Tn-seq, TraDIS, and HITS. INSeq has been applied to identify essential and conditionally essential genes, colonization factors, genes required for antibiotic resistance, and factors involved in other specific processes @Goodman2009, @vanOpijnen2009, @Gawronski2009, @Gallagher2011. Multiple groups have produced code that can conduct the basic analyses. However, this work has resulted in software packages that lack a model for community engagement and, in some cases, a license permitting others to modify them; together these gaps have hindered further development and discovery. As more data sets are produced that bear on key medical processes, it will become increasingly important to establish a community in which software developers and biologists can together integrate data from INSeq studies and improve analyses in a collaborative fashion. Toward this end my laboratory (principally, me) has built an open-source (BSD 3-clause) package in the Python language, tentatively termed pyinseq. I have recruited computer programmers to contribute to this software, but as a geneticist I require additional mentorship to lead this effort into a mature, sophisticated software package that can address leading-edge questions in bacterial genetics and genomics. I therefore propose to use the travel award to catalyze the following specific aims during a series of visits to the laboratory of Dr. C. Titus Brown at the University of California, Davis.

1. Refine the current pyinseq package to conduct robust statistical analyses, produce rapid visualizations, and connect to other genome analysis software.
2. Publish the pyinseq package in the Python Package Index (PyPI), prepare detailed documentation for users, and publish the package in a scientific journal.
3. Establish a framework for community contribution, including contribution guidelines, test-driven development, “hackathons” to encourage community contributions, and a roadmap for future pyinseq development that specifically includes contributors with expertise in statistics and software optimization.

A broader aim of this proposal is to develop a bacterial bioinformatics program in a genetics laboratory that is superbly equipped to train students at the interface of biology and informatics. My efforts to expand computational training opportunities at the Northwestern Chicago campus, including initiating training for over 300 scientists (see biosketch), will ensure that the impact of this proposal extends beyond my individual laboratory.
Status: Finished
Effective start/end date: 6/1/16 to 12/31/17

Funding

  • Burroughs Wellcome Fund (Agreement 1016350)
