Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening

Arindam Paul, Dipendra Jha, Reda Al-Bahrani, Wei Keng Liao, Alok Choudhary, Ankit Agrawal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Organic solar cells are a promising technology for addressing the world's clean energy crisis. However, generating candidate chemical compounds for solar cells is a time-consuming process requiring thousands of hours of laboratory analysis. For a solar cell, the most important property is the power conversion efficiency, which depends on the highest occupied molecular orbital (HOMO) values of the donor molecules. Recently, machine learning techniques have proved very useful in building predictive models for the HOMO values of donor structures of organic photovoltaic cells (OPVs). Since experimental datasets are limited in size, current machine learning models are trained on data derived from density functional theory (DFT) calculations. Molecular line notations such as SMILES and InChI are popular input representations for describing the molecular structure of donor molecules. The two line notations encode different information: for example, SMILES defines bond types, while InChI defines protonation. In this work, we present an ensemble deep neural network architecture, called SINet, which harnesses both the SMILES and InChI molecular representations to predict HOMO values and leverages the potential of transfer learning from a sizeable DFT-computed dataset, Harvard CEP, to build more robust predictive models for the relatively smaller HOPV datasets. The Harvard CEP dataset contains molecular structures and properties for 2.3 million candidate donor structures for OPVs, while HOPV contains DFT-computed and experimental values for 350 and 243 molecules, respectively. Our results demonstrate significant performance improvement from the use of transfer learning and from leveraging both molecular representations.

Original language: English (US)
Title of host publication: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728119854
DOIs: https://doi.org/10.1109/IJCNN.2019.8852446
State: Published - Jul 2019
Event: 2019 International Joint Conference on Neural Networks, IJCNN 2019 - Budapest, Hungary
Duration: Jul 14 2019 → Jul 19 2019

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2019-July

Conference

Conference: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Country: Hungary
City: Budapest
Period: 7/14/19 → 7/19/19

Fingerprint

Molecular orbitals
Density functional theory
Screening
Neural networks
Molecular structure
Molecules
Learning systems
Solar cells
Chemical compounds
Photovoltaic cells
Protonation
Network architecture
Conversion efficiency
Organic solar cells

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Paul, A., Jha, D., Al-Bahrani, R., Liao, W. K., Choudhary, A., & Agrawal, A. (2019). Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening. In 2019 International Joint Conference on Neural Networks, IJCNN 2019 [8852446] (Proceedings of the International Joint Conference on Neural Networks; Vol. 2019-July). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IJCNN.2019.8852446
Paul, Arindam ; Jha, Dipendra ; Al-Bahrani, Reda ; Liao, Wei Keng ; Choudhary, Alok ; Agrawal, Ankit. / Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening. 2019 International Joint Conference on Neural Networks, IJCNN 2019. Institute of Electrical and Electronics Engineers Inc., 2019. (Proceedings of the International Joint Conference on Neural Networks).
@inproceedings{4fe57753fb4048a38524c401aadf88b1,
title = "Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening",
abstract = "Organic solar cells are a promising technology for addressing the world's clean energy crisis. However, generating candidate chemical compounds for solar cells is a time-consuming process requiring thousands of hours of laboratory analysis. For a solar cell, the most important property is the power conversion efficiency, which depends on the highest occupied molecular orbital (HOMO) values of the donor molecules. Recently, machine learning techniques have proved very useful in building predictive models for the HOMO values of donor structures of organic photovoltaic cells (OPVs). Since experimental datasets are limited in size, current machine learning models are trained on data derived from density functional theory (DFT) calculations. Molecular line notations such as SMILES and InChI are popular input representations for describing the molecular structure of donor molecules. The two line notations encode different information: for example, SMILES defines bond types, while InChI defines protonation. In this work, we present an ensemble deep neural network architecture, called SINet, which harnesses both the SMILES and InChI molecular representations to predict HOMO values and leverages the potential of transfer learning from a sizeable DFT-computed dataset, Harvard CEP, to build more robust predictive models for the relatively smaller HOPV datasets. The Harvard CEP dataset contains molecular structures and properties for 2.3 million candidate donor structures for OPVs, while HOPV contains DFT-computed and experimental values for 350 and 243 molecules, respectively. Our results demonstrate significant performance improvement from the use of transfer learning and from leveraging both molecular representations.",
author = "Arindam Paul and Dipendra Jha and Reda Al-Bahrani and Liao, {Wei Keng} and Alok Choudhary and Ankit Agrawal",
year = "2019",
month = "7",
doi = "10.1109/IJCNN.2019.8852446",
language = "English (US)",
series = "Proceedings of the International Joint Conference on Neural Networks",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2019 International Joint Conference on Neural Networks, IJCNN 2019",
address = "United States",

}

Paul, A, Jha, D, Al-Bahrani, R, Liao, WK, Choudhary, A & Agrawal, A 2019, Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening. in 2019 International Joint Conference on Neural Networks, IJCNN 2019, 8852446, Proceedings of the International Joint Conference on Neural Networks, vol. 2019-July, Institute of Electrical and Electronics Engineers Inc., 2019 International Joint Conference on Neural Networks, IJCNN 2019, Budapest, Hungary, 7/14/19. https://doi.org/10.1109/IJCNN.2019.8852446

Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening. / Paul, Arindam; Jha, Dipendra; Al-Bahrani, Reda; Liao, Wei Keng; Choudhary, Alok; Agrawal, Ankit.

2019 International Joint Conference on Neural Networks, IJCNN 2019. Institute of Electrical and Electronics Engineers Inc., 2019. 8852446 (Proceedings of the International Joint Conference on Neural Networks; Vol. 2019-July).

TY - GEN

T1 - Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening

AU - Paul, Arindam

AU - Jha, Dipendra

AU - Al-Bahrani, Reda

AU - Liao, Wei Keng

AU - Choudhary, Alok

AU - Agrawal, Ankit

PY - 2019/7

Y1 - 2019/7

N2 - Organic solar cells are a promising technology for addressing the world's clean energy crisis. However, generating candidate chemical compounds for solar cells is a time-consuming process requiring thousands of hours of laboratory analysis. For a solar cell, the most important property is the power conversion efficiency, which depends on the highest occupied molecular orbital (HOMO) values of the donor molecules. Recently, machine learning techniques have proved very useful in building predictive models for the HOMO values of donor structures of organic photovoltaic cells (OPVs). Since experimental datasets are limited in size, current machine learning models are trained on data derived from density functional theory (DFT) calculations. Molecular line notations such as SMILES and InChI are popular input representations for describing the molecular structure of donor molecules. The two line notations encode different information: for example, SMILES defines bond types, while InChI defines protonation. In this work, we present an ensemble deep neural network architecture, called SINet, which harnesses both the SMILES and InChI molecular representations to predict HOMO values and leverages the potential of transfer learning from a sizeable DFT-computed dataset, Harvard CEP, to build more robust predictive models for the relatively smaller HOPV datasets. The Harvard CEP dataset contains molecular structures and properties for 2.3 million candidate donor structures for OPVs, while HOPV contains DFT-computed and experimental values for 350 and 243 molecules, respectively. Our results demonstrate significant performance improvement from the use of transfer learning and from leveraging both molecular representations.

UR - http://www.scopus.com/inward/record.url?scp=85073264937&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073264937&partnerID=8YFLogxK

U2 - 10.1109/IJCNN.2019.8852446

DO - 10.1109/IJCNN.2019.8852446

M3 - Conference contribution

AN - SCOPUS:85073264937

T3 - Proceedings of the International Joint Conference on Neural Networks

BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Paul A, Jha D, Al-Bahrani R, Liao WK, Choudhary A, Agrawal A. Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening. In 2019 International Joint Conference on Neural Networks, IJCNN 2019. Institute of Electrical and Electronics Engineers Inc. 2019. 8852446. (Proceedings of the International Joint Conference on Neural Networks). https://doi.org/10.1109/IJCNN.2019.8852446