A machine learning approach to escaped defect analysis

NEPOMUCENO, Késsia Thais Cavalcanti


Bibliographic details
Main author:	NEPOMUCENO, Késsia Thais Cavalcanti
Publication date:	2022
Document type:	Master's thesis
Language:	eng
Source title:	Repositório Institucional da UFPE
Full text:	https://repositorio.ufpe.br/handle/123456789/48529
Abstract:	Defects in computer systems or applications directly affect the quality and performance of a final product, with consequences for both the user and the supplier. Identifying escaped defects, that is, defects the tester did not detect at the proper stage and that were therefore incorporated into the product, is thus one of the software industry's primary activities. To mitigate or eliminate escaped defects, companies usually have a team responsible for analyzing and evaluating them in order to understand the context in which they occur and to correct the flaws. The aim is to avoid recurrence and to improve product quality and test performance. Escaped defect analysis also measures the performance of the testing team and of new product and service launches. However, despite being crucial, it requires resources such as time, equipment, and training, among others, which makes applying it consistently and precisely unfeasible. For this reason, in partnership with Motorola Mobility, we built a machine learning system to automate escaped defect analysis and optimize the manual process, reducing the resources invested in the analysis stages. The company provided us with information about the process, including historical data from the most recent analyses performed manually by its employees. Our model therefore relies on real industry bug reports as historical data. We collected and processed data on escaped and non-escaped defects from Motorola's bug reports, used it as input to our model, and applied Random Forest as the main classifier. The result is a ranking of the bug reports most likely to become escaped defects. To measure the classifier's performance, we used the ROC curve and a new metric that we propose, the cost-benefit curve. On both metrics, we obtained significant and promising results. The main contributions of this work are the escaped defect analysis system and the cost-benefit curve metric used to measure its performance. Testers in the software industry will thus be able to focus their efforts on the bug reports that are more or less likely to become escaped defects, optimizing the use of work resources.
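The abstract mentions ranking bug reports by their likelihood of escaping and evaluating that ranking with the ROC curve. The record does not include the thesis's implementation, so the following is only a minimal sketch of the general idea, using the Python standard library. The report IDs, the predicted probabilities, and the labels are all made up for illustration, and the helper names `rank_reports` and `roc_auc` are hypothetical. The cost-benefit curve proposed in the thesis is not defined in this record and is not reproduced here.

```python
from typing import Dict, List, Tuple


def rank_reports(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort report IDs by predicted escape probability, highest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen escaped defect
    (label 1) gets a higher score than a randomly chosen non-escaped
    one (label 0); ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs: report ID -> predicted probability of escape.
predicted = {"BR-1": 0.91, "BR-2": 0.15, "BR-3": 0.64, "BR-4": 0.40, "BR-5": 0.08}
# Hypothetical labels from a past manual analysis (1 = escaped defect).
actual = {"BR-1": 1, "BR-2": 0, "BR-3": 0, "BR-4": 1, "BR-5": 0}

ranking = rank_reports(predicted)
ids = [report_id for report_id, _ in ranking]
auc = roc_auc([actual[i] for i in ids], [predicted[i] for i in ids])

print(ids)            # reports ordered from most to least likely to escape
print(round(auc, 3))  # 1.0 would mean every escaped defect outranks every non-escaped one
```

In practice the probabilities would come from a trained classifier such as a Random Forest, and an analyst would review the reports from the top of the ranking downward.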

Item metadata

id	UFPE_34abf5d0e45db478c6e2efc6272932b9
oai_identifier_str	oai:repositorio.ufpe.br:123456789/48529
network_acronym_str	UFPE
network_name_str	Repositório Institucional da UFPE
repository_id_str	2221
spelling	NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. PRUDÊNCIO, Ricardo Bastos Cavalcante. FACEPE. Universidade Federal de Pernambuco.
dc.title.pt_BR.fl_str_mv	A machine learning approach to escaped defect analysis
title	A machine learning approach to escaped defect analysis
title_short	A machine learning approach to escaped defect analysis
title_full	A machine learning approach to escaped defect analysis
title_fullStr	A machine learning approach to escaped defect analysis
title_full_unstemmed	A machine learning approach to escaped defect analysis
title_sort	A machine learning approach to escaped defect analysis
author	NEPOMUCENO, Késsia Thais Cavalcanti
author_facet	NEPOMUCENO, Késsia Thais Cavalcanti
author_role	author
dc.contributor.authorLattes.pt_BR.fl_str_mv	http://lattes.cnpq.br/1276337923168691
dc.contributor.advisorLattes.pt_BR.fl_str_mv	http://lattes.cnpq.br/2984888073123287
dc.contributor.author.fl_str_mv	NEPOMUCENO, Késsia Thais Cavalcanti
dc.contributor.advisor1.fl_str_mv	PRUDÊNCIO, Ricardo Bastos Cavalcante
contributor_str_mv	PRUDÊNCIO, Ricardo Bastos Cavalcante
dc.subject.por.fl_str_mv	Escaped defect analysis; Ranking; Automation; Machine learning
topic	Escaped defect analysis; Ranking; Automation; Machine learning
publishDate	2022
dc.date.issued.fl_str_mv	2022-08-11
dc.date.accessioned.fl_str_mv	2023-01-05T14:02:43Z
dc.date.available.fl_str_mv	2023-01-05T14:02:43Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2022.
dc.identifier.uri.fl_str_mv	https://repositorio.ufpe.br/handle/123456789/48529
identifier_str_mv	NEPOMUCENO, Késsia Thais Cavalcanti. A machine learning approach to escaped defect analysis. 2022. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2022.
url	https://repositorio.ufpe.br/handle/123456789/48529
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/embargoedAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv	embargoedAccess
dc.publisher.none.fl_str_mv	Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv	UFPE
dc.publisher.country.fl_str_mv	Brazil
publisher.none.fl_str_mv	Universidade Federal de Pernambuco
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE
instname_str	Universidade Federal de Pernambuco (UFPE)
instacron_str	UFPE
institution	UFPE
reponame_str	Repositório Institucional da UFPE
collection	Repositório Institucional da UFPE
bitstream.url.fl_str_mv	https://repositorio.ufpe.br/bitstream/123456789/48529/1/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf https://repositorio.ufpe.br/bitstream/123456789/48529/2/license_rdf https://repositorio.ufpe.br/bitstream/123456789/48529/3/license.txt https://repositorio.ufpe.br/bitstream/123456789/48529/4/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf.txt https://repositorio.ufpe.br/bitstream/123456789/48529/5/DISSERTA%c3%87%c3%83O%20K%c3%a9ssia%20Thais%20Cavalcanti%20Nepomuceno.pdf.jpg
bitstream.checksum.fl_str_mv	f0ec1f5497927cfbc73c0bf22f50d649 e39d27027a6cc9cb039ad269a5db8e34 5e89a1613ddc8510c6576f4b23a78973 1793177d3c23c1388e78598a5a714f19 b7e0b0a4c85572eaca97ce8f65f580f9
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv	attena@ufpe.br
_version_	1802310894252720128
