Empirically supported similarity coefficients for the identification of refactoring opportunities

Bibliographic details
Main author: Pinto, Arthur Ferreira
Publication date: 2018
Document type: Master's dissertation
Language: eng
Source title: Repositório Institucional da UFLA
Full text: http://repositorio.ufla.br/jspui/handle/1/29596
Abstract: Code refactoring is the process of changing a software system so that its internal structure improves while the external behavior of the code is preserved. Refactoring makes it possible to treat symptoms of poor code architecture, known as Code Smells, which can affect properties such as portability, reusability, maintainability, and scalability. Several techniques for identifying refactoring opportunities rely on similarity coefficients to find misplaced entities in the system architecture and to determine where they should be located. For example, a method is expected to be located in a class whose other methods are structurally similar to it. However, the coefficients available in the literature were not designed for the structural analysis of software systems, so they may not provide satisfactory precision. This master's dissertation therefore proposes three new coefficients (PT MC, PT MM, and PT EM) to improve the precision of identifying Move Class, Move Method, and Extract Method refactoring opportunities, respectively. The main objectives are: (i) to propose more effective similarity coefficients for object-oriented systems, in order to locate more accurately the entities that are improperly positioned in a system architecture; and (ii) to increase the precision of tools that identify refactoring opportunities based on structural similarity by applying the proposed coefficients. First, we investigated the precision of 18 similarity coefficients on 10 systems from the Qualitas.class Corpus (training set) to select the most appropriate coefficient to adapt. Then, we adapted the selected coefficient through an empirical experiment based on a treatment combination with replication over genetic algorithms, which generated the proposed coefficients. Finally, we implemented AIRP, a tool that relies on the proposed coefficients to identify refactoring opportunities.
To evaluate the proposed coefficients, we compared them with the other 18 coefficients on another 101 systems from the Qualitas.class Corpus (test set). Relative to the best analyzed coefficient, the results indicate a statistical improvement of 5.23% to 6.81% for identifying Move Class refactoring opportunities, 12.33% to 14.79% for Move Method, and 0.25% to 0.40% for Extract Method.
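The idea that a method belongs in the class whose other methods are structurally most similar to it can be illustrated with a minimal sketch. It uses the Jaccard coefficient, a standard set-similarity measure, as a stand-in; it is not the PT MM coefficient proposed in the dissertation, whose formula this record does not give. The dependency sets, class names, and function names below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def class_similarity(method: str, class_methods: List[str],
                     deps: Dict[str, Set[str]]) -> float:
    """Average similarity between `method` and the other methods of a class."""
    others = [m for m in class_methods if m != method]
    if not others:
        return 0.0
    return sum(jaccard(deps[method], deps[m]) for m in others) / len(others)


def suggest_move_method(method: str, current_class: str,
                        classes: Dict[str, List[str]],
                        deps: Dict[str, Set[str]]) -> Optional[str]:
    """Return a class strictly more similar to `method` than its own, else None."""
    scores = {name: class_similarity(method, methods, deps)
              for name, methods in classes.items()}
    best = max(scores, key=scores.get)
    if best != current_class and scores[best] > scores[current_class]:
        return best
    return None


# Hypothetical structural dependencies (types each method refers to).
deps = {
    "Order.total": {"Item", "Money"},
    "Order.add_item": {"Item", "Money"},
    "Order.format_invoice": {"Printer", "Template"},
    "Report.render": {"Printer", "Template"},
    "Report.export": {"Printer", "Template", "File"},
}
classes = {
    "Order": ["Order.total", "Order.add_item", "Order.format_invoice"],
    "Report": ["Report.render", "Report.export"],
}

print(suggest_move_method("Order.format_invoice", "Order", classes, deps))
print(suggest_move_method("Order.total", "Order", classes, deps))
```

In this toy data, `Order.format_invoice` shares no dependencies with the other `Order` methods but overlaps heavily with the `Report` methods, so it is flagged as a Move Method candidate, while `Order.total` stays where it is. A tool in the spirit of the abstract would substitute a coefficient tuned for structural analysis in place of `jaccard`.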
id UFLA_94748cd5b22a6a104c34421dcbbcc23d
oai_identifier_str oai:localhost:1/29596
network_acronym_str UFLA
network_name_str Repositório Institucional da UFLA
repository_id_str
dc.title.none.fl_str_mv Empirically supported similarity coefficients for the identification of refactoring opportunities
Coeficientes de similaridade para a identificação de oportunidades de refatoração empiricamente com base empírica
title Empirically supported similarity coefficients for the identification of refactoring opportunities
author Pinto, Arthur Ferreira
author_facet Pinto, Arthur Ferreira
author_role author
dc.contributor.none.fl_str_mv Villela, Ricardo Terra Nunes Bueno
Valente, Marco Túlio de Oliveira
Resende, Antônio Maria Pereira de
dc.contributor.author.fl_str_mv Pinto, Arthur Ferreira
dc.subject.por.fl_str_mv Arquitetura de software
Similaridade estrutural
Refatoração de código
Move class
Move method
Extract method
Software architecture
Structural similarity
Code refactoring
Ciência da Computação
publishDate 2018
dc.date.none.fl_str_mv 2018-07-11T11:53:02Z
2018-07-11T11:53:02Z
2018-07-10
2018-05-07
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv PINTO, A. F. Empirically supported similarity coefficients for the identification of refactoring opportunities. 2018. 75 p. Dissertação (Mestrado em Ciência da Computação)-Universidade Federal de Lavras, Lavras, 2018.
http://repositorio.ufla.br/jspui/handle/1/29596
identifier_str_mv PINTO, A. F. Empirically supported similarity coefficients for the identification of refactoring opportunities. 2018. 75 p. Dissertação (Mestrado em Ciência da Computação)-Universidade Federal de Lavras, Lavras, 2018.
url http://repositorio.ufla.br/jspui/handle/1/29596
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Lavras
Programa de Pós-Graduação em Ciência da Computação
UFLA
brasil
Departamento de Ciência da Computação
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFLA
instname:Universidade Federal de Lavras (UFLA)
instacron:UFLA
instname_str Universidade Federal de Lavras (UFLA)
instacron_str UFLA
institution UFLA
reponame_str Repositório Institucional da UFLA
collection Repositório Institucional da UFLA
repository.name.fl_str_mv Repositório Institucional da UFLA - Universidade Federal de Lavras (UFLA)
repository.mail.fl_str_mv nivaldo@ufla.br || repositorio.biblioteca@ufla.br
_version_ 1807835217630593024