Effective HTCondor-based monitoring system for CMS

Bibliographic details
Lead author: Balcas, J.
Publication date: 2017
Other authors: Bockelman, B. P., Da Silva, J. M. [UNESP], Hernandez, J., Khan, F. A., Letts, J., Mascheroni, M., Mason, D. A., Perez-Calero Yzquierdo, A., Vlimant, J. R.
Document type: Conference paper
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1088/1742-6596/898/9/092039
http://hdl.handle.net/11449/220991
Abstract: The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the CMS global queue, operators found it increasingly difficult to monitor the pool, find problems, and fix them. They had to rely on several web pages, each offering a different level of information, and sift through log files in order to monitor the pool completely. A suitable monitoring system was therefore one of the crucial items to deliver before the start of LHC Run 2, both to ensure early detection of issues and to give a good overview of the whole pool. Our new monitoring page uses HTCondor ClassAd information to provide a complete picture of the whole submission infrastructure in CMS. The page includes information from the HTCondor schedulers, the central managers, the glideinWMS frontend, and the factories. It also incorporates information about users and tasks, making it easy for operators to provide support and debug issues.
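As a rough illustration of the ClassAd-driven approach the abstract describes, the sketch below aggregates job ClassAds into per-user counts by job state. It is not the authors' implementation: the function name `summarize_jobs` and the sample ads are hypothetical, and the ads are assumed to have already been fetched from the schedulers and converted to plain dictionaries. Only the `Owner` and `JobStatus` attributes and the standard HTCondor status codes come from HTCondor itself.

```python
from collections import Counter, defaultdict

# Standard HTCondor JobStatus codes (integer values of the JobStatus attribute).
JOB_STATUS = {1: "Idle", 2: "Running", 3: "Removed", 4: "Completed", 5: "Held"}


def summarize_jobs(job_ads):
    """Count jobs per user and state from an iterable of ClassAd-like dicts.

    Hypothetical helper: each ad is assumed to be a plain dict that may carry
    the HTCondor attributes "Owner" and "JobStatus".
    """
    summary = defaultdict(Counter)
    for ad in job_ads:
        owner = ad.get("Owner", "unknown")
        # Unknown or missing status codes are grouped under "Other".
        status = JOB_STATUS.get(ad.get("JobStatus"), "Other")
        summary[owner][status] += 1
    return {owner: dict(counts) for owner, counts in summary.items()}


# Hypothetical sample ads standing in for the result of a scheduler query.
sample_ads = [
    {"Owner": "alice", "JobStatus": 2},
    {"Owner": "alice", "JobStatus": 1},
    {"Owner": "alice", "JobStatus": 2},
    {"Owner": "bob", "JobStatus": 5},
]

if __name__ == "__main__":
    for owner, counts in summarize_jobs(sample_ads).items():
        print(owner, counts)
```

A monitoring page built this way would presumably refresh such summaries periodically and present them alongside scheduler, central manager, frontend, and factory information.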
id UNSP_fd6abfd4d7c6dde1a51eedf85b98250d
oai_identifier_str oai:repositorio.unesp.br:11449/220991
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
dc.title.none.fl_str_mv Effective HTCondor-based monitoring system for CMS
dc.contributor.none.fl_str_mv California Institute of Technology
University of Nebraska-Lincoln
Universidade Estadual Paulista (UNESP)
Centro de Investigaciones Energeticas Medioambientales y Tecnologicas
National Center for Physics
University of California
Fermi National Accelerator Laboratory
Port d'Informacio Cientifica
dc.contributor.author.fl_str_mv Balcas, J.
Bockelman, B. P.
Da Silva, J. M. [UNESP]
Hernandez, J.
Khan, F. A.
Letts, J.
Mascheroni, M.
Mason, D. A.
Perez-Calero Yzquierdo, A.
Vlimant, J. R.
publishDate 2017
dc.date.none.fl_str_mv 2017-11-23
2022-04-28T19:07:13Z
2022-04-28T19:07:13Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/conferenceObject
format conferenceObject
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1088/1742-6596/898/9/092039
Journal of Physics: Conference Series, v. 898, n. 9, 2017.
1742-6596
1742-6588
http://hdl.handle.net/11449/220991
10.1088/1742-6596/898/9/092039
2-s2.0-85039413164
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Journal of Physics: Conference Series
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1792962127116369920