Assessment of data-driven bayesian networks in software effort prediction
Main author: | Tierno, Ivan Alexandre Paiz |
---|---|
Publication date: | 2013 |
Document type: | Dissertation |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da UFRGS |
Full text: | http://hdl.handle.net/10183/71952 |
Abstract: | Software prediction is a difficult but important task that can aid managers in decision making, potentially saving time and resources and yielding higher software quality, among other benefits. One approach to this task is the application of machine learning techniques. One such technique is Bayesian Networks, which have been promoted for software project management because of their special features. However, the pre-processing procedures related to their application remain mostly neglected in this field. In this context, this study presents an assessment of automatic Bayesian Networks (i.e., Bayesian Networks based solely on data) on three public data sets and discusses data pre-processing procedures and the validation approach. We compared automatic Bayesian Networks against mean and median baseline models and against ordinary least squares regression with a logarithmic transformation, which a recent comprehensive study deemed a top performer with regard to accuracy. The results, obtained through careful validation procedures, indicate that automatic Bayesian Networks can be competitive with other techniques but still need improvement to match the accuracy of linear regression models. Some current limitations of Bayesian Networks are highlighted and possible improvements are discussed. Furthermore, this study provides guidelines on the exploration of data, which can be useful for any Bayesian Network that uses data for model learning. Finally, this study also confirms the potential benefits of feature selection in software effort prediction. |
id |
URGS_5ead271d75c6d7759fe7524b3c7c0c50 |
---|---|
oai_identifier_str |
oai:www.lume.ufrgs.br:10183/71952 |
network_acronym_str |
URGS |
network_name_str |
Biblioteca Digital de Teses e Dissertações da UFRGS |
repository_id_str |
1853 |
spelling |
Tierno, Ivan Alexandre Paiz; Nunes, Daltro José. Universidade Federal do Rio Grande do Sul, Instituto de Informática, Programa de Pós-Graduação em Computação, Porto Alegre, BR-RS, 2013 (mestrado). Files: 000881231.pdf (Texto completo (inglês), application/pdf, 1336939); 000881231.pdf.txt (Extracted Text, text/plain, 146932); 000881231.pdf.jpg (Generated Thumbnail, image/jpeg, 1006). Record timestamp: 2021-05-26 04:27:34. |
dc.title.pt_BR.fl_str_mv |
Assessment of data-driven bayesian networks in software effort prediction |
title |
Assessment of data-driven bayesian networks in software effort prediction |
author |
Tierno, Ivan Alexandre Paiz |
dc.contributor.advisor1.fl_str_mv |
Nunes, Daltro José |
dc.subject.por.fl_str_mv |
Redes bayesianas; Aprendizagem : Maquina; Redes : Computadores; Engenharia : Software |
dc.subject.eng.fl_str_mv |
Software effort prediction; Bayesian networks; Machine learning; Data mining |
publishDate |
2013 |
dc.date.accessioned.fl_str_mv |
2013-05-25T01:46:34Z |
dc.date.issued.fl_str_mv |
2013 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10183/71952 |
dc.identifier.nrb.pt_BR.fl_str_mv |
000881231 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da UFRGS instname:Universidade Federal do Rio Grande do Sul (UFRGS) instacron:UFRGS |
instname_str |
Universidade Federal do Rio Grande do Sul (UFRGS) |
instacron_str |
UFRGS |
institution |
UFRGS |
reponame_str |
Biblioteca Digital de Teses e Dissertações da UFRGS |
collection |
Biblioteca Digital de Teses e Dissertações da UFRGS |
bitstream.url.fl_str_mv |
http://www.lume.ufrgs.br/bitstream/10183/71952/1/000881231.pdf http://www.lume.ufrgs.br/bitstream/10183/71952/2/000881231.pdf.txt http://www.lume.ufrgs.br/bitstream/10183/71952/3/000881231.pdf.jpg |
bitstream.checksum.fl_str_mv |
110eb76cf55bccf42083405544d40032 366fe85d344431ee48fca35e9a1539a0 40bb1529edd4e098522c19067dd45062 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da UFRGS - Universidade Federal do Rio Grande do Sul (UFRGS) |
repository.mail.fl_str_mv |
lume@ufrgs.br||lume@ufrgs.br |
_version_ |
1810085256629321728 |