TY - JOUR
T1 - A morphological classification model to identify unresolved PanSTARRS1 sources
T2 - Application in the ZTF real-time pipeline
AU - Tachibana, Yutaro
AU - Miller, A. A.
N1 - Funding Information:
Y.T. is funded by JSPS KAKENHI grant No. JP16J05742. Y.T. studied as a Global Relay of Observatories Watching Transients Happen (GROWTH) intern at Caltech during the summer and fall of 2017. GROWTH is funded by the National Science Foundation under Partnerships for International Research and Education grant No. 1545949. A.A.M. is funded by the Large Synoptic Survey Telescope Corporation in support of the Data Science Fellowship Program. Based in part on software developed as a part of the Zwicky Transient Facility project. Major funding has been provided by the U.S. National Science Foundation under grant No. AST-1440341 and by the ZTF partner institutions: the California Institute of Technology, the Oskar Klein Centre, the Weizmann Institute of Science, the University of Maryland, the University of Washington, Deutsches Elektronen-Synchrotron, the University of Wisconsin-Milwaukee, and the TANGO Program of the University System of Taiwan.
Publisher Copyright:
© 2018. The Astronomical Society of the Pacific. All rights reserved.
PY - 2018
Y1 - 2018
N2 - In the era of large photometric surveys, the importance of automated and accurate classification is rapidly increasing. Specifically, the separation of resolved and unresolved sources in astronomical imaging is a critical initial step for a wide array of studies, ranging from Galactic science to large scale structure and cosmology. Here, we present our method to construct a large, deep catalog of point sources utilizing Pan-STARRS1 (PS1) 3π survey data, which consists of ∼3×10^9 sources with m≲23.5 mag. We develop a supervised machine-learning methodology, using the random forest (RF) algorithm, to construct the PS1 morphology model. We train the model using ∼5×10^4 PS1 sources with HST COSMOS morphological classifications and assess its performance using ∼4×10^6 sources with Sloan Digital Sky Survey (SDSS) spectra and ∼2×10^8 Gaia sources. We construct 11 “white flux” features, which combine PS1 flux and shape measurements across five filters, to increase the signal-to-noise ratio relative to any individual filter. The RF model is compared to three alternative models, including the SDSS and PS1 photometric classification models, and we find that the RF model performs best. By number the PS1 catalog is dominated by faint sources (m≳21 mag), and in this regime the RF model significantly outperforms the SDSS and PS1 models. For time-domain surveys, identifying unresolved sources is crucial for inferring the Galactic or extragalactic origin of new transients. We have classified ∼1.5×10^9 sources using the RF model, and these results are used within the Zwicky Transient Facility real-time pipeline to automatically reject stellar sources from the extragalactic alert stream.
AB - In the era of large photometric surveys, the importance of automated and accurate classification is rapidly increasing. Specifically, the separation of resolved and unresolved sources in astronomical imaging is a critical initial step for a wide array of studies, ranging from Galactic science to large scale structure and cosmology. Here, we present our method to construct a large, deep catalog of point sources utilizing Pan-STARRS1 (PS1) 3π survey data, which consists of ∼3×10^9 sources with m≲23.5 mag. We develop a supervised machine-learning methodology, using the random forest (RF) algorithm, to construct the PS1 morphology model. We train the model using ∼5×10^4 PS1 sources with HST COSMOS morphological classifications and assess its performance using ∼4×10^6 sources with Sloan Digital Sky Survey (SDSS) spectra and ∼2×10^8 Gaia sources. We construct 11 “white flux” features, which combine PS1 flux and shape measurements across five filters, to increase the signal-to-noise ratio relative to any individual filter. The RF model is compared to three alternative models, including the SDSS and PS1 photometric classification models, and we find that the RF model performs best. By number the PS1 catalog is dominated by faint sources (m≳21 mag), and in this regime the RF model significantly outperforms the SDSS and PS1 models. For time-domain surveys, identifying unresolved sources is crucial for inferring the Galactic or extragalactic origin of new transients. We have classified ∼1.5×10^9 sources using the RF model, and these results are used within the Zwicky Transient Facility real-time pipeline to automatically reject stellar sources from the extragalactic alert stream.
KW - Catalogs
KW - Methods: data analysis
KW - Methods: statistical
UR - http://www.scopus.com/inward/record.url?scp=85057306426&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057306426&partnerID=8YFLogxK
U2 - 10.1088/1538-3873/aae3d9
DO - 10.1088/1538-3873/aae3d9
M3 - Article
AN - SCOPUS:85057306426
SN - 0004-6280
VL - 130
JO - Publications of the Astronomical Society of the Pacific
JF - Publications of the Astronomical Society of the Pacific
IS - 994
M1 - 128001
ER -