A machine learning approach to escaped defect analysis
Main author: | NEPOMUCENO, Késsia Thais Cavalcanti |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Institucional da UFPE |
Full text: | https://repositorio.ufpe.br/handle/123456789/48529 |
Abstract: | Defects in computer systems or applications directly affect the quality and performance of a final product, with consequences for both the user and the supplier. Identifying escaped defects, that is, defects that the tester did not detect at the proper stage and that were therefore incorporated into the product, is consequently one of the software industry's primary activities. To mitigate or eliminate escaped defects, companies usually have a sector responsible for analyzing and evaluating the missed bugs in order to understand the context in which they occurred and to correct the flaws. The aim is to avoid repetition and to improve product quality and test performance. The analysis of escaped defects also measures the performance of the testing team and of the launch of new products and services. However, despite being a crucial activity, it requires resources such as time, equipment and training, which makes its consistent and precise application unfeasible. For this reason, in partnership with Motorola Mobility, we built a machine learning system to automate the analysis of escaped defects and optimize the manual process, reducing the resources invested in the stages of analysis. The company provided us with information about the process, including historical data from the most recent analyses performed manually by its employees. Our model therefore relies on real industry bug reports as historical data. From the Motorola Bug Report, we collected and processed the data on escaped and non-escaped defects, used it as input to our model, and applied Random Forest as the main classifier. As a result, we ranked the Bug Reports most likely to become an escaped defect. To measure the classifier's performance, we used the ROC curve and a new metric that we propose, the cost-benefit curve. On both metrics, we obtained significant and promising results.
Our main contributions in this work are therefore the escaped defect analysis system and the cost-benefit curve metric that we used to measure the performance of the system. Testers in the software industry will be able to focus and direct their efforts on those Bug Reports that are more or less likely to become an escaped defect, optimizing work operation resources. |
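The ranking and ROC evaluation described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the report identifiers, scores and labels below are made-up values, and the two helper functions (`rank_reports`, `roc_auc`) are hypothetical names introduced only for this example. In the work itself the scores would come from the Random Forest classifier. The proposed cost-benefit curve is not defined in the abstract, so it is not sketched here.

```python
from typing import Dict, List


def rank_reports(scores: Dict[str, float]) -> List[str]:
    """Return report ids ordered from most to least likely to escape."""
    return sorted(scores, key=lambda rid: scores[rid], reverse=True)


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed as the probability that a
    randomly chosen escaped report (label 1) receives a higher score
    than a randomly chosen non-escaped report (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one escaped and one non-escaped report")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical classifier scores and ground-truth labels (1 = escaped).
    scores = {"BR-1": 0.90, "BR-2": 0.20, "BR-3": 0.65, "BR-4": 0.40}
    escaped = {"BR-1": 1, "BR-2": 0, "BR-3": 0, "BR-4": 1}

    ids = list(scores)
    print(rank_reports(scores))
    print(roc_auc([escaped[i] for i in ids], [scores[i] for i in ids]))
```

The pairwise formulation is quadratic in the number of reports, which is acceptable for a small illustration; a sort-based computation would be preferable for large report sets.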
id |
UFPE_34abf5d0e45db478c6e2efc6272932b9 |
---|---|
oai_identifier_str |
oai:repositorio.ufpe.br:123456789/48529 |
network_acronym_str |
UFPE |
network_name_str |
Repositório Institucional da UFPE |
repository_id_str |
2221 |
dc.title.pt_BR.fl_str_mv |
A machine learning approach to escaped defect analysis |
title |
A machine learning approach to escaped defect analysis |
author |
NEPOMUCENO, Késsia Thais Cavalcanti |
author_facet |
NEPOMUCENO, Késsia Thais Cavalcanti |
author_role |
author |
dc.contributor.authorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/1276337923168691 |
dc.contributor.advisorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/2984888073123287 |
dc.contributor.author.fl_str_mv |
NEPOMUCENO, Késsia Thais Cavalcanti |
dc.contributor.advisor1.fl_str_mv |
PRUDÊNCIO, Ricardo Bastos Cavalcante |
contributor_str_mv |
PRUDÊNCIO, Ricardo Bastos Cavalcante |
dc.subject.por.fl_str_mv |
Escaped defect analysis; Ranking; Automation; Machine learning |
topic |
Escaped defect analysis; Ranking; Automation; Machine learning |
publishDate |
2022 |
dc.date.issued.fl_str_mv |
2022-08-11 |
dc.date.accessioned.fl_str_mv |
2023-01-05T14:02:43Z |
dc.date.available.fl_str_mv |
2023-01-05T14:02:43Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. Dissertation (Master's in Computer Science) - Universidade Federal de Pernambuco, Recife, 2022. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufpe.br/handle/123456789/48529 |
identifier_str_mv |
NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. Dissertation (Master's in Computer Science) - Universidade Federal de Pernambuco, Recife, 2022. |
url |
https://repositorio.ufpe.br/handle/123456789/48529 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/embargoedAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
eu_rights_str_mv |
embargoedAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação |
dc.publisher.initials.fl_str_mv |
UFPE |
dc.publisher.country.fl_str_mv |
Brazil |
publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE |
instname_str |
Universidade Federal de Pernambuco (UFPE) |
instacron_str |
UFPE |
institution |
UFPE |
reponame_str |
Repositório Institucional da UFPE |
collection |
Repositório Institucional da UFPE |
bitstream.url.fl_str_mv |
https://repositorio.ufpe.br/bitstream/123456789/48529/1/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf https://repositorio.ufpe.br/bitstream/123456789/48529/2/license_rdf https://repositorio.ufpe.br/bitstream/123456789/48529/3/license.txt https://repositorio.ufpe.br/bitstream/123456789/48529/4/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf.txt https://repositorio.ufpe.br/bitstream/123456789/48529/5/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf.jpg |
bitstream.checksum.fl_str_mv |
f0ec1f5497927cfbc73c0bf22f50d649 e39d27027a6cc9cb039ad269a5db8e34 5e89a1613ddc8510c6576f4b23a78973 1793177d3c23c1388e78598a5a714f19 b7e0b0a4c85572eaca97ce8f65f580f9 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE) |
repository.mail.fl_str_mv |
attena@ufpe.br |
_version_ |
1802310894252720128 |