Crowdsmelling: A preliminary study on using collective knowledge in code smells detection

Reis, J.; Brito e Abreu, F.; Figueiredo Carneiro, G.

Bibliographic details
Main author:	Reis, J.
Publication date:	2022
Other authors:	Brito e Abreu, F., Figueiredo Carneiro, G.
Document type:	Article
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10071/25596
Abstract:	Code smells are seen as a major source of technical debt and, as such, should be detected and removed. However, researchers argue that the subjectiveness of the code smells detection process is a major hindrance to mitigating the problem of smells-infected code. This paper presents the results of a validation experiment for the Crowdsmelling approach proposed earlier. The approach is based on supervised machine learning techniques, where the wisdom of the crowd (of software developers) is used to collectively calibrate code smells detection algorithms, thereby lessening the subjectivity issue. Over three consecutive years of a Software Engineering course, a total "crowd" of around a hundred teams, with an average of three members each, classified the presence of three code smells (Long Method, God Class, and Feature Envy) in Java source code. These classifications were the basis of the oracles used for training six machine learning algorithms. Over one hundred models were generated and evaluated to determine which machine learning algorithms performed best in detecting each of these code smells. Good performance was obtained for God Class detection (ROC=0.896 for Naive Bayes) and Long Method detection (ROC=0.870 for AdaBoostM1), but much lower performance for Feature Envy (ROC=0.570 for Random Forest). The results suggest that Crowdsmelling is a feasible approach for the detection of code smells. Further validation experiments based on dynamic learning are required to achieve comprehensive coverage of code smells and to increase external validity.
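
The ROC figures quoted in the abstract are presumably areas under the ROC curve; the record itself does not spell this out, so treat that reading as an assumption. Under that assumption, the minimal Python sketch below shows how such a score could be computed from oracle labels and classifier confidences, using the rank-based (Mann-Whitney) identity. The function name `roc_auc` and the toy data are hypothetical and are not taken from the paper or its tooling.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 = smell present, 0 = smell absent (the oracle).
    scores: classifier confidence that the smell is present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs that are ranked correctly;
    # a tie between a positive and a negative counts as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): six code entities, three labelled smelly.
oracle = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(roc_auc(oracle, scores))
```

A value near 1.0 means the model almost always ranks smelly entities above clean ones, while a value near 0.5 is no better than chance. Read this way, the reported 0.570 for Feature Envy is close to random ranking.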

Item metadata

id	RCAP_c933096c4e36f45a059c311be38023b5
oai_identifier_str	oai:repositorio.iscte-iul.pt:10071/25596
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
dc.title.none.fl_str_mv	Crowdsmelling: A preliminary study on using collective knowledge in code smells detection
author	Reis, J.
author_facet	Reis, J. Brito e Abreu, F. Figueiredo Carneiro, G.
author_role	author
author2	Brito e Abreu, F. Figueiredo Carneiro, G.
author2_role	author author
dc.contributor.author.fl_str_mv	Reis, J. Brito e Abreu, F. Figueiredo Carneiro, G.
dc.subject.por.fl_str_mv	Crowdsmelling Code smells Code smells detection Software quality Software maintenance Collective knowledge Machine learning algorithms
topic	Crowdsmelling Code smells Code smells detection Software quality Software maintenance Collective knowledge Machine learning algorithms
publishDate	2022
dc.date.none.fl_str_mv	2022-01-01T00:00:00Z 2022 2022-06-03T15:20:54Z 2023-03-17T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10071/25596
url	http://hdl.handle.net/10071/25596
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	1382-3256 10.1007/s10664-021-10110-5
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Springer
publisher.none.fl_str_mv	Springer
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799134783394021376
