TY - JOUR
T1 - Universal machine learning framework for defect predictions in zinc blende semiconductors
AU - Mannodi-Kanakkithodi, Arun
AU - Xiang, Xiaofeng
AU - Jacoby, Laura
AU - Biegaj, Robert
AU - Dunham, Scott T.
AU - Gamelin, Daniel R.
AU - Chan, Maria K.Y.
N1 - Funding Information:
This work was performed in part at the Center for Nanoscale Materials, a US Department of Energy Office of Science User Facility, and supported by the US Department of Energy, Office of Science, under Contract No. DE-AC02-06CH11357. A.M.-K., X.X., and L.J. contributed equally to this work. X.X., L.J., and R.B. acknowledge support from the Data Intensive Research Enabling Clean Technology (DIRECT) NSF National Research Traineeship. X.X., L.J., D.G., and S.D. acknowledge support from the UW Molecular Engineering Materials Center (DMR 1719797), an NSF Materials Research Science and Engineering Center. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the US Department of Energy under Contract No. DE-AC02-05CH11231. We gratefully acknowledge the computing resources provided on Bebop, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory. S.D. and X.X. acknowledge funding from the U.S. Department of Energy, award number DE-EE0008556. A.M.-K. acknowledges support from the School of Materials Engineering at Purdue University under account number F.10023800.05.002. M.K.Y.C. and A.M.-K. conceived the idea. A.M.-K. and M.K.Y.C. performed the DFT computations. X.X., L.J., R.B., and A.M.-K. performed the machine learning analysis. All authors contributed to the discussion and writing of the manuscript. The authors declare no competing interests.
Publisher Copyright:
© 2022 The Authors
PY - 2022/3/11
Y1 - 2022/3/11
N2 - We develop a framework powered by machine learning (ML) and high-throughput density functional theory (DFT) computations for the prediction and screening of functional impurities in groups IV, III–V, and II–VI zinc blende semiconductors. Elements spanning the length and breadth of the periodic table are considered as impurity atoms at the cation, anion, or interstitial sites in supercells of 34 candidate semiconductors, leading to a chemical space of approximately 12,000 points, 10% of which are used to generate a DFT dataset of charge dependent defect formation energies. Descriptors based on tabulated elemental properties, defect coordination environment, and relevant semiconductor properties are used to train ML regression models for the DFT computed neutral state formation energies and charge transition levels of impurities. Optimized kernel ridge, Gaussian process, random forest, and neural network regression models are applied to screen impurities with lower formation energy than dominant native defects in all compounds.
AB - We develop a framework powered by machine learning (ML) and high-throughput density functional theory (DFT) computations for the prediction and screening of functional impurities in groups IV, III–V, and II–VI zinc blende semiconductors. Elements spanning the length and breadth of the periodic table are considered as impurity atoms at the cation, anion, or interstitial sites in supercells of 34 candidate semiconductors, leading to a chemical space of approximately 12,000 points, 10% of which are used to generate a DFT dataset of charge dependent defect formation energies. Descriptors based on tabulated elemental properties, defect coordination environment, and relevant semiconductor properties are used to train ML regression models for the DFT computed neutral state formation energies and charge transition levels of impurities. Optimized kernel ridge, Gaussian process, random forest, and neural network regression models are applied to screen impurities with lower formation energy than dominant native defects in all compounds.
KW - combinatorial screening
KW - computational materials science
KW - density functional theory
KW - DSML 3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems
KW - high-throughput data
KW - machine learning
KW - materials informatics
KW - mid-gap states
KW - point defects
KW - semiconductors
UR - http://www.scopus.com/inward/record.url?scp=85125621186&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125621186&partnerID=8YFLogxK
U2 - 10.1016/j.patter.2022.100450
DO - 10.1016/j.patter.2022.100450
M3 - Article
C2 - 35510195
AN - SCOPUS:85125621186
SN - 2666-3899
VL - 3
JO - Patterns
JF - Patterns
IS - 3
M1 - 100450
ER -