TY - JOUR
T1 - Containers in Bioinformatics
T2 - Applications, Practical Considerations, and Best Practices in Molecular Pathology
AU - Kadri, Sabah
AU - Sboner, Andrea
AU - Sigaras, Alexandros
AU - Roy, Somak
N1 - Publisher Copyright: © 2022 Association for Molecular Pathology and American Society for Investigative Pathology
PY - 2022/5
Y1 - 2022/5
AB - Systematic implementation of bioinformatics resources for next-generation sequencing (NGS)-based clinical testing is an arduous undertaking. One of the key challenges involves developing an ecosystem of information technology infrastructure for enabling scalable and reproducible bioinformatics services that is resilient and secure for handling genetic and protected health information, often embedded in an existing non–bioinformatics-oriented infrastructure. Container technology provides an ideal and infrastructure-agnostic solution for molecular laboratories developing and using bioinformatics pipelines, whether on premises or in the cloud. A container is a technology that provides a consistent computational environment and enables reproducibility, scalability, and security when developing NGS bioinformatics analysis pipelines. Containers can increase the bioinformatics team's productivity by automating and simplifying the maintenance of complex bioinformatics resources, and can facilitate the validation, version control, and documentation necessary for clinical laboratory regulatory compliance. Although containers are increasingly popular for developing NGS bioinformatics pipelines, there is wide variability and inconsistency in their usage, which may result in suboptimal performance and potentially compromise the security and privacy of protected health information. In this article, the authors highlight the current state and provide best or recommended practices for building and using containers in NGS bioinformatics solutions in a clinical setting, with a focus on scalability, optimization, maintainability, and data security.
UR - http://www.scopus.com/inward/record.url?scp=85129972173&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129972173&partnerID=8YFLogxK
U2 - 10.1016/j.jmoldx.2022.01.006
DO - 10.1016/j.jmoldx.2022.01.006
M3 - Review article
C2 - 35189355
AN - SCOPUS:85129972173
SN - 1525-1578
VL - 24
SP - 442
EP - 454
JO - Journal of Molecular Diagnostics
JF - Journal of Molecular Diagnostics
IS - 5
ER -