Assessment of data-driven Bayesian networks in software effort prediction

Bibliographic details
Main author: Tierno, Ivan Alexandre Paiz
Publication date: 2013
Document type: Master's thesis (Dissertação)
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da UFRGS
Full text: http://hdl.handle.net/10183/71952
Abstract: Software prediction is a difficult but important task that can support managers' decision making, potentially saving time and resources and improving software quality, among other benefits. One approach to this task is the application of machine learning techniques. One such technique is Bayesian Networks, which have been promoted for software project management because of their special features. However, the pre-processing procedures involved in applying them remain largely neglected in this field. In this context, this study assesses automatic Bayesian Networks (i.e., Bayesian Networks based solely on data) on three public data sets and discusses data pre-processing procedures and the validation approach. We compared automatic Bayesian Networks against mean and median baseline models and against ordinary least squares regression with a logarithmic transformation, which a recent comprehensive study found to be a top performer in terms of accuracy. The results, obtained through careful validation procedures, indicate that automatic Bayesian Networks can be competitive with other techniques but still need improvement to match the accuracy of linear regression models. Some current limitations of Bayesian Networks are highlighted and possible improvements are discussed. Furthermore, this study provides guidelines on data exploration that can be useful for any Bayesian Network that uses data for model learning. Finally, this study also confirms the potential benefits of feature selection in software effort prediction.
id URGS_5ead271d75c6d7759fe7524b3c7c0c50
oai_identifier_str oai:www.lume.ufrgs.br:10183/71952
network_acronym_str URGS
network_name_str Biblioteca Digital de Teses e Dissertações da UFRGS
repository_id_str 1853
spelling Tierno, Ivan Alexandre Paiz; advisor: Nunes, Daltro José. Universidade Federal do Rio Grande do Sul, Instituto de Informática, Programa de Pós-Graduação em Computação. Porto Alegre, BR-RS, 2013. Degree level: mestrado (master's). Files: 000881231.pdf (full text in English, application/pdf, size 1336939); 000881231.pdf.txt (extracted text, text/plain, size 146932); 000881231.pdf.jpg (generated thumbnail, image/jpeg, size 1006). Record updated 2021-05-26. Repository: https://lume.ufrgs.br/handle/10183/2; OAI endpoint: https://lume.ufrgs.br/oai/request; OpenDOAR: 1853.
dc.title.pt_BR.fl_str_mv Assessment of data-driven Bayesian networks in software effort prediction
author Tierno, Ivan Alexandre Paiz
author_facet Tierno, Ivan Alexandre Paiz
author_role author
dc.contributor.author.fl_str_mv Tierno, Ivan Alexandre Paiz
dc.contributor.advisor1.fl_str_mv Nunes, Daltro José
contributor_str_mv Nunes, Daltro José
dc.subject.por.fl_str_mv Redes bayesianas (Bayesian networks)
Aprendizagem : Maquina (Learning : Machine)
Redes : Computadores (Networks : Computers)
Engenharia : Software (Engineering : Software)
dc.subject.eng.fl_str_mv Software effort prediction
Bayesian networks
Machine learning
Data mining
publishDate 2013
dc.date.accessioned.fl_str_mv 2013-05-25T01:46:34Z
dc.date.issued.fl_str_mv 2013
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10183/71952
dc.identifier.nrb.pt_BR.fl_str_mv 000881231
url http://hdl.handle.net/10183/71952
identifier_str_mv 000881231
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da UFRGS
instname:Universidade Federal do Rio Grande do Sul (UFRGS)
instacron:UFRGS
instname_str Universidade Federal do Rio Grande do Sul (UFRGS)
instacron_str UFRGS
institution UFRGS
reponame_str Biblioteca Digital de Teses e Dissertações da UFRGS
collection Biblioteca Digital de Teses e Dissertações da UFRGS
bitstream.url.fl_str_mv http://www.lume.ufrgs.br/bitstream/10183/71952/1/000881231.pdf
http://www.lume.ufrgs.br/bitstream/10183/71952/2/000881231.pdf.txt
http://www.lume.ufrgs.br/bitstream/10183/71952/3/000881231.pdf.jpg
bitstream.checksum.fl_str_mv 110eb76cf55bccf42083405544d40032
366fe85d344431ee48fca35e9a1539a0
40bb1529edd4e098522c19067dd45062
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS)
repository.mail.fl_str_mv lume@ufrgs.br
_version_ 1810085256629321728