CMS distributed computing workflow experience

Jennifer Adelman-Mccarthy*, Oliver Gutsche, Jeffrey D. Haas, Harrison B. Prosper, Valentina Dutta, Guillelmo Gomez-Ceballos, Kristian Hahn, Markus Klute, Ajit Mohapatra, Vincenzo Spinoso, Dorian Kcira, Julien Caudron, Junhui Liao, Arnaud Pin, Nicolas Schul, Gilles De Lentdecker, Joseph McCartin, Lukas Vanelderen, Xavier Janssen, Andrey Tsyganov, Derek Barge, Andrew Lahiff

*Corresponding author for this work

Research output: Contribution to journal › Conference article

1 Scopus citations

Abstract

The vast majority of the CMS Computing capacity, which is organized in a tiered hierarchy, is located away from CERN. The 7 Tier-1 sites archive the LHC proton-proton collision data that is initially processed at CERN. These sites provide access to all recorded and simulated data for the Tier-2 sites, via wide-area network (WAN) transfers. All central data processing workflows are executed at the Tier-1 level; these include re-reconstruction and skimming workflows of collision data as well as reprocessing of simulated data to adapt to changing detector conditions. This paper describes the operation of the CMS processing infrastructure at the Tier-1 level. The Tier-1 workflows are described in detail, along with the operational optimization of resource usage. In particular, the variation of different workflows during the data taking period of 2010, their efficiencies and latencies, and their impact on the delivery of physics results are discussed, and lessons are drawn from this experience. The simulation of proton-proton collisions for the CMS experiment is primarily carried out at the second tier of the CMS computing infrastructure. Half of the Tier-2 sites of CMS are reserved for central Monte Carlo (MC) production while the other half is available for user analysis. This paper summarizes the large throughput of the MC production operation during the data taking period of 2010 and discusses the latencies and efficiencies of the various types of MC production workflows. We present the operational procedures to optimize the usage of available resources, and we present the operational model of CMS for including opportunistic resources, such as the larger Tier-3 sites, into the central production operation.

Original language: English (US)
Article number: 72019
Journal: Journal of Physics: Conference Series
Volume: 331
Issue number: PART 7
DOI: https://doi.org/10.1088/1742-6596/331/7/072019
State: Published - Jan 1 2011
Event: International Conference on Computing in High Energy and Nuclear Physics, CHEP 2010 - Taipei, Taiwan
Duration: Oct 18 2010 - Oct 22 2010

ASJC Scopus subject areas

  • Physics and Astronomy (all)


Cite this

    Adelman-Mccarthy, J., Gutsche, O., Haas, J. D., Prosper, H. B., Dutta, V., Gomez-Ceballos, G., Hahn, K., Klute, M., Mohapatra, A., Spinoso, V., Kcira, D., Caudron, J., Liao, J., Pin, A., Schul, N., De Lentdecker, G., McCartin, J., Vanelderen, L., Janssen, X., ... Lahiff, A. (2011). CMS distributed computing workflow experience. Journal of Physics: Conference Series, 331(PART 7), [72019]. https://doi.org/10.1088/1742-6596/331/7/072019