TY - JOUR
T1 - An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes
AU - Robey, Matthew T.
AU - Caesar, Lindsay K.
AU - Drott, Milton T.
AU - Keller, Nancy P.
AU - Kelleher, Neil L.
N1 - Funding Information:
ACKNOWLEDGMENTS. This work was supported by the National Center for Complementary and Integrative Health (R01 AT009143). L.K.C. was supported by the National Institute of General Medical Sciences (F32 GM134679). M.T.D. was supported by US Department of Agriculture, National Institute of Food and Agriculture postdoctoral fellowship award 2019-67012-29662.
Publisher Copyright:
© 2021 National Academy of Sciences. All rights reserved.
PY - 2021/5/11
Y1 - 2021/5/11
N2 - Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1,000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here, we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1,037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.
KW - Biosynthesis
KW - Fungi
KW - Genome mining
KW - Natural products
KW - Secondary metabolism
UR - http://www.scopus.com/inward/record.url?scp=85105360465&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105360465&partnerID=8YFLogxK
U2 - 10.1073/pnas.2020230118
DO - 10.1073/pnas.2020230118
M3 - Article
C2 - 33941694
AN - SCOPUS:85105360465
SN - 0027-8424
VL - 118
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 19
M1 - e2020230118
ER -