A machine learning approach to escaped defect analysis

Bibliographic details
Main author: NEPOMUCENO, Késsia Thais Cavalcanti
Publication date: 2022
Document type: Master's dissertation
Language: eng
Source title: Repositório Institucional da UFPE
Full text: https://repositorio.ufpe.br/handle/123456789/48529
Abstract: Defects in computer systems or applications directly impact the quality and performance of a final product, with consequences for both the user and the supplier. Therefore, identifying escaped defects, that is, defects that the tester did not detect at the proper stage and that were thus incorporated into the product, is one of the software industry’s primary activities. To mitigate or eliminate escaped defects, companies usually have a sector responsible for analyzing and evaluating the missed bugs, in order to understand the context in which they occur and correct the flaws. The aim is to avoid repetition and to improve product quality and test performance. The analysis of escaped defects also measures the performance of the testing team and of the launch of new products and services. However, despite being a crucial activity, it requires resources such as time, equipment and training, which makes its consistent and precise application unfeasible. Because of this, in partnership with Motorola Mobility, we built a machine learning system to automate the analysis of escaped defects and optimize the manual process, reducing the resources invested in the stages of analysis. For this, the company provided us with information about the process, such as historical data on the latest analyses performed manually by company employees. Thus, our model relies on real industry bug reports as historical data. From the Motorola Bug Report, we collected and processed the data on escaped and non-escaped defects, used it as input to our model, and applied Random Forest as the main classifier. As a result, we ranked the Bug Reports most likely to become an escaped defect. To measure the classifier’s performance, we used the ROC Curve and a new metric that we proposed, the cost-benefit curve. In both metrics, we obtained significant and promising results.
That said, the main contributions of this work are the escaped defect analysis system and the cost-benefit curve metric that we used to measure the performance of our system. Testers in the software industry will therefore be able to focus and direct their efforts on those Bug Reports that are more or less likely to become an escaped defect, optimizing work operation resources.
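The ranking and ROC evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the report identifiers, scores and labels below are hypothetical, and the scores stand in for the escape probabilities that a classifier such as Random Forest would output. The proposed cost-benefit curve is not sketched, since the abstract does not define it.

```python
from typing import Dict, List, Tuple


def rank_reports(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order bug reports from most to least likely to be an escaped defect."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (escaped, non-escaped) pairs ranked correctly; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one escaped and one non-escaped report")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs (probability of being an escaped defect)
# and hypothetical ground-truth labels (1 = escaped, 0 = not escaped).
predicted = {"BR-1": 0.91, "BR-2": 0.15, "BR-3": 0.64, "BR-4": 0.40}
escaped = {"BR-1": 1, "BR-2": 0, "BR-3": 0, "BR-4": 1}

ranking = rank_reports(predicted)
ids = list(predicted)
auc = roc_auc([escaped[i] for i in ids], [predicted[i] for i in ids])

for report_id, score in ranking:
    print(f"{report_id}: {score:.2f}")
print(f"ROC AUC: {auc:.2f}")
```

An AUC of 0.5 corresponds to a random ordering and 1.0 to a ranking that places every escaped defect above every non-escaped one, which is why the metric suits a system whose output is a prioritized list rather than a hard decision.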
id UFPE_34abf5d0e45db478c6e2efc6272932b9
oai_identifier_str oai:repositorio.ufpe.br:123456789/48529
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str 2221
dc.title.pt_BR.fl_str_mv A machine learning approach to escaped defect analysis
title A machine learning approach to escaped defect analysis
spellingShingle A machine learning approach to escaped defect analysis
NEPOMUCENO, Késsia Thais Cavalcanti
Análise de defeitos escapados
Ranking
Automação
Aprendizagem de máquina
title_short A machine learning approach to escaped defect analysis
title_full A machine learning approach to escaped defect analysis
title_fullStr A machine learning approach to escaped defect analysis
title_full_unstemmed A machine learning approach to escaped defect analysis
title_sort A machine learning approach to escaped defect analysis
author NEPOMUCENO, Késsia Thais Cavalcanti
author_facet NEPOMUCENO, Késsia Thais Cavalcanti
author_role author
dc.contributor.authorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/1276337923168691
dc.contributor.advisorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/2984888073123287
dc.contributor.author.fl_str_mv NEPOMUCENO, Késsia Thais Cavalcanti
dc.contributor.advisor1.fl_str_mv PRUDÊNCIO, Ricardo Bastos Cavalcante
contributor_str_mv PRUDÊNCIO, Ricardo Bastos Cavalcante
dc.subject.por.fl_str_mv Análise de defeitos escapados
Ranking
Automação
Aprendizagem de máquina
topic Análise de defeitos escapados
Ranking
Automação
Aprendizagem de máquina
publishDate 2022
dc.date.issued.fl_str_mv 2022-08-11
dc.date.accessioned.fl_str_mv 2023-01-05T14:02:43Z
dc.date.available.fl_str_mv 2023-01-05T14:02:43Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2022.
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/48529
identifier_str_mv NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2022.
url https://repositorio.ufpe.br/handle/123456789/48529
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/embargoedAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv embargoedAccess
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv Programa de Pos Graduacao em Ciencia da Computacao
dc.publisher.initials.fl_str_mv UFPE
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
bitstream.url.fl_str_mv https://repositorio.ufpe.br/bitstream/123456789/48529/1/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf
https://repositorio.ufpe.br/bitstream/123456789/48529/2/license_rdf
https://repositorio.ufpe.br/bitstream/123456789/48529/3/license.txt
https://repositorio.ufpe.br/bitstream/123456789/48529/4/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf.txt
https://repositorio.ufpe.br/bitstream/123456789/48529/5/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf.jpg
bitstream.checksum.fl_str_mv f0ec1f5497927cfbc73c0bf22f50d649
e39d27027a6cc9cb039ad269a5db8e34
5e89a1613ddc8510c6576f4b23a78973
1793177d3c23c1388e78598a5a714f19
b7e0b0a4c85572eaca97ce8f65f580f9
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1802310894252720128