Identificação de smells em testes fim-a-fim implementados usando a ferramenta Cypress
Main author: | Larissa de Cássia Nazaré Bicalho |
---|---|
Publication date: | 2024 |
Document type: | Master's thesis (Dissertação) |
Language: | Portuguese (por) |
Source title: | Repositório Institucional da UFMG |
Full text: | http://hdl.handle.net/1843/75915 |
Abstract: | Considering that software systems are among the most complex human constructions ever made, it is natural for a variety of errors and inconsistencies to occur. To prevent such issues from reaching end users and causing harm, testing activities are necessary in software development projects. One of the most common methods is end-to-end testing, which aims to verify the behavior of system requirements as a whole. To implement this type of testing, developers rely on various tools such as Selenium, Cypress, and Playwright, among others. Despite the increasing use of these tools, few studies evaluate the bad practices associated with their use. To address this issue, this research investigated the bad practices related to the use of Cypress, a JavaScript framework for end-to-end testing. Initially, a study was conducted to catalog the most common smells in such tests through a Systematic Literature Review (SLR) and a Grey Literature Review (GLR), resulting in the identification of 14 smells specific to end-to-end tests implemented with Cypress. Subsequently, methods for automatically identifying these smells were evaluated. To this end, Large Language Models (LLMs) such as ChatGPT, which are used to automate a variety of tasks, including those relevant to software development, were used. The ability of ChatGPT to identify these problems was assessed through a controlled case study and a field study with GitHub applications. In the controlled study, ChatGPT successfully identified 12 of the 14 cataloged smells. Eight of these 12 smells (67%) were detected after the first request. The field study evaluated end-to-end tests implemented in three open-source systems: Pigallery2, Livewire, and GlobaLeaks. For Pigallery2, detection reached a precision of 0.31 and a recall of 0.62. For Livewire, the values were 0.24 for precision and 0.44 for recall. Finally, GlobaLeaks had the worst performance, with a precision of 0.15 and a recall of 0.31. The main cause of the low precision and recall obtained in this second study was the inefficiency in detecting certain smells, such as Brittle Selectors. The research yielded promising results by integrating an SLR and a GLR, thereby establishing a catalog of smells for tests developed with Cypress. Regarding detection, it can be concluded that ChatGPT is not efficient at detecting these smells. |
Record ID: | UFMG_4468661a7297a9c7f5066fe10aba8c6d |
OAI identifier: | oai:repositorio.ufmg.br:1843/75915 |
Title (English): | Identification of smells in end-to-end tests implemented using the Cypress tool |
Author and contributors: | Larissa de Cássia Nazaré Bicalho (author); Marco Túlio de Oliveira Valente (http://lattes.cnpq.br/2147157840592913); João Eduardo Montadon de Araújo Filho; Eduardo Magno Lages Figueiredo; André Cavalcante Hora |
Subjects: | End-to-end tests; Code smells; Test smells; Large language models; ChatGPT; Cypress; JavaScript; Computing – Theses; Software engineering – Theses; Software – Evaluation – Theses; JavaScript (computer programming language) – Theses |
Dates (dc.date): | 2024-09-03T17:04:26Z; 2024-09-03T17:04:26Z; 2024-07-05 |
Status: | publishedVersion (info:eu-repo/semantics/publishedVersion) |
Type: | masterThesis (info:eu-repo/semantics/masterThesis) |
Format: | application/pdf |
Rights: | openAccess (info:eu-repo/semantics/openAccess) |
Publisher: | Universidade Federal de Minas Gerais; Brasil; ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO; Programa de Pós-Graduação em Ciência da Computação; UFMG |
Institution: | Universidade Federal de Minas Gerais (UFMG) |
Repository: | Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
Repository OAI endpoint: | https://repositorio.ufmg.br/oai |
Repository contact: | repositorio@ufmg.br |
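
The precision and recall values quoted in the abstract follow the usual definitions: precision is the share of reported smells that are real, and recall is the share of real smells that were reported. The sketch below is only an illustration of that arithmetic, not the thesis's evaluation script. The function names and the counts `tp=5, fp=11, fn=3` are hypothetical, and the F1 score is a derived summary that the abstract itself does not report.

```python
from typing import Dict, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (not reported in the abstract)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs exactly as stated in the abstract.
REPORTED: Dict[str, Tuple[float, float]] = {
    "Pigallery2": (0.31, 0.62),
    "Livewire": (0.24, 0.44),
    "GlobaLeaks": (0.15, 0.31),
}

if __name__ == "__main__":
    # Hypothetical counts, chosen only to show how the two ratios are formed.
    p, r = precision_recall(tp=5, fp=11, fn=3)
    print(f"example counts: precision={p:.2f}, recall={r:.2f}")

    # Derived F1 for each system, computed from the reported pairs.
    for system, (precision, recall) in REPORTED.items():
        print(f"{system}: F1={f1_score(precision, recall):.2f}")
```

In all three systems recall is roughly twice precision, which under these definitions means false positives outnumbered true positives among the smells ChatGPT reported.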