Crystal Site Feature Embedding Enables Exploration of Large Chemical Spaces

Hitarth Choubisa, Mikhail Askerka, Kevin Ryczko, Oleksandr Voznyy, Kyle Mills, Isaac Tamblyn*, Edward H. Sargent

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

Mapping materials science problems onto computational frameworks suitable for machine learning can accelerate materials discovery. Combining the proposed crystal site feature embedding (CSFE) representation with convolutional and extensive deep neural networks, we achieve low mean absolute test errors of 3.7 meV/atom and 0.069 eV on density functional theory energies and band gaps, respectively, of mixed halide perovskites. We explore how a small amount of cadmium doping can potentially be applied in solar cell design and sample the large chemical space by using a variational autoencoder to discover interesting perovskites with band gaps in the ultraviolet and infrared. Additionally, we use CSFE to explore chemical spaces and small doping concentrations beyond those used for training. We further show that CSFE has mean absolute test errors of 7 meV/atom and 0.13 eV for total energies and band gaps, respectively, of 2D perovskites and discuss its adaptability for exploration of an even wider variety of chemical systems.

Density functional theory (DFT) is of interest in modern-day materials discovery. However, DFT is computationally expensive. Here, we develop a new crystal site feature embedding (CSFE) representation that achieves low error in predicting DFT properties and enables predicting properties of chemical families and doping fractions beyond those present in the training datasets. Using CSFE with autoencoders, we present a scheme that enables sampling of large chemical spaces and offers insight into key semiconductor parameters such as band gap. We demonstrate that CSFE works on both 2D and 3D perovskites and identify promising ultraviolet and infrared candidate materials.

Here, we report crystal site feature embedding (CSFE), a representation for machine learning of materials that achieves low mean absolute errors for density functional theory band gaps and formation energies. Using CSFE with convolutional neural networks (CNNs) and extensive deep neural networks (EDNNs), we explored chemical families and doping fractions beyond those present in the training dataset. CSFE allowed us to sample large chemical spaces for materials of interest using autoencoders. We demonstrate the application of the representation by finding perovskite compositions for the ultraviolet and infrared.
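The error figures quoted in the abstract are mean absolute errors (MAE), with energies normalized per atom. As a minimal sketch of how such a metric can be computed, the Python below assumes total energies are given in eV together with atom counts per structure. The function names and the sample numbers are hypothetical and purely illustrative; they are not taken from the paper or its code.

```python
def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def energy_mae_mev_per_atom(pred_total_ev, ref_total_ev, n_atoms):
    """MAE of total energies after per-atom normalization, in meV/atom.

    Assumes total energies in eV and one atom count per structure
    (an illustrative convention, not the paper's documented pipeline).
    """
    if not (len(pred_total_ev) == len(ref_total_ev) == len(n_atoms)):
        raise ValueError("inputs must have equal length")
    pred = [1000.0 * e / n for e, n in zip(pred_total_ev, n_atoms)]
    ref = [1000.0 * e / n for e, n in zip(ref_total_ev, n_atoms)]
    return mean_absolute_error(pred, ref)


# Hypothetical values for two structures (not data from the paper).
pred_energies = [-10.00, -20.00]   # predicted total energies, eV
ref_energies = [-10.02, -19.96]    # reference DFT total energies, eV
atom_counts = [5, 10]              # atoms per structure

energy_error = energy_mae_mev_per_atom(pred_energies, ref_energies, atom_counts)
gap_error = mean_absolute_error([1.50, 2.10], [1.55, 2.00])  # band gaps, eV
print(f"energy MAE: {energy_error:.2f} meV/atom, band-gap MAE: {gap_error:.3f} eV")
```

Normalizing by atom count before averaging keeps errors comparable across unit cells of different sizes, which is why energy errors are usually reported in meV/atom while band-gap errors stay in eV.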

Original language: English (US)
Pages (from-to): 433-448
Number of pages: 16
Journal: Matter
Volume: 3
Issue number: 2
State: Published - Aug 5 2020

Funding

E.H.S. and all co-authors from the Department of Electrical and Computer Engineering at the University of Toronto acknowledge financial support from the Global Research Outreach program of Samsung Advanced Institute of Technology, the Ontario Research Foundation – Research Excellence Program, and the Natural Sciences and Engineering Research Council of Canada. DFT computations were performed on the Niagara supercomputer at the SciNet HPC Consortium. SciNet is funded by the Canada Foundation for Innovation under the auspices of Compute Canada; the Government of Ontario; Ontario Research Fund – Research Excellence; and the University of Toronto. Training of neural networks was performed on the GPU devices of the SOSCIP (Southern Ontario Smart Computing Innovation Platform, funded by the Federal Economic Development Agency of Southern Ontario, the Province of Ontario, IBM Canada, Ontario Centers of Excellence, Mitacs, and 15 Ontario academic member institutions) GPU-accelerated platform. Work at NRC was completed under the auspices of the MCF and AI4D programs.

Keywords

  • MAP3: Understanding
  • auto-encoders
  • convolutional neural networks
  • density functional theory
  • extensive deep neural networks
  • halide perovskites
  • machine learning
  • materials discovery
  • optoelectronic materials
  • photovoltaics

ASJC Scopus subject areas

  • General Materials Science
