Effective HTCondor-based monitoring system for CMS
Main author: | Balcas, J. |
---|---|
Publication date: | 2017 |
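Other authors: | Bockelman, B. P.; Da Silva, J. M. [UNESP]; Hernandez, J.; Khan, F. A.; Letts, J.; Mascheroni, M.; Mason, D. A.; Perez-Calero Yzquierdo, A.; Vlimant, J. R. |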
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1088/1742-6596/898/9/092039; http://hdl.handle.net/11449/220991 |
Abstract: | The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had to rely on several different web pages, with several different levels of information, and sift tirelessly through log files in order to monitor the pool completely. Therefore, coming up with a suitable monitoring system was one of the crucial items before the beginning of the LHC Run 2 in order to ensure early detection of issues and to give a good overview of the whole pool. Our new monitoring page utilizes the HTCondor ClassAd information to provide a complete picture of the whole submission infrastructure in CMS. The monitoring page includes useful information from HTCondor schedulers, central managers, the glideinWMS frontend, and factories. It also incorporates information about users and tasks, making it easy for operators to provide support and debug issues. |
id |
UNSP_fd6abfd4d7c6dde1a51eedf85b98250d |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/220991 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
dc.title.none.fl_str_mv |
Effective HTCondor-based monitoring system for CMS |
author |
Balcas, J. |
author_facet |
Balcas, J.; Bockelman, B. P.; Da Silva, J. M. [UNESP]; Hernandez, J.; Khan, F. A.; Letts, J.; Mascheroni, M.; Mason, D. A.; Perez-Calero Yzquierdo, A.; Vlimant, J. R. |
author_role |
author |
author2 |
Bockelman, B. P.; Da Silva, J. M. [UNESP]; Hernandez, J.; Khan, F. A.; Letts, J.; Mascheroni, M.; Mason, D. A.; Perez-Calero Yzquierdo, A.; Vlimant, J. R. |
author2_role |
author author author author author author author author author |
dc.contributor.none.fl_str_mv |
California Institute of Technology; University of Nebraska-Lincoln; Universidade Estadual Paulista (UNESP); Centro de Investigaciones Energeticas Medioambientales y Tecnologicas; National Center for Physics; University of California; Fermi National Accelerator Laboratory; Port d'Informacio Cientifica |
dc.contributor.author.fl_str_mv |
Balcas, J.; Bockelman, B. P.; Da Silva, J. M. [UNESP]; Hernandez, J.; Khan, F. A.; Letts, J.; Mascheroni, M.; Mason, D. A.; Perez-Calero Yzquierdo, A.; Vlimant, J. R. |
publishDate |
2017 |
dc.date.none.fl_str_mv |
2017-11-23; 2022-04-28T19:07:13Z; 2022-04-28T19:07:13Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1088/1742-6596/898/9/092039; Journal of Physics: Conference Series, v. 898, n. 9, 2017; 1742-6596; 1742-6588; http://hdl.handle.net/11449/220991; 10.1088/1742-6596/898/9/092039; 2-s2.0-85039413164 |
url |
http://dx.doi.org/10.1088/1742-6596/898/9/092039; http://hdl.handle.net/11449/220991 |
identifier_str_mv |
Journal of Physics: Conference Series, v. 898, n. 9, 2017; 1742-6596; 1742-6588; 10.1088/1742-6596/898/9/092039; 2-s2.0-85039413164 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Journal of Physics: Conference Series |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1792962127116369920 |