Applying Data Mining Techniques to Improve Breast Cancer Diagnosis

Bibliographic details
Main author: Diz, Joana
Publication date: 2016
Other authors: Marreiros, Goreti; Freitas, Alberto
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10400.22/9380
Abstract: In breast cancer research, new computer-aided diagnosis systems are increasingly being developed with the aim of reducing false positives in diagnostic tests. In this work, we present a data mining based approach that might support oncologists in breast cancer classification and diagnosis. The study compares two breast cancer datasets and seeks the best methods for predicting benign/malignant lesions, classifying breast density, and identifying the type of finding (mass/microcalcification distinction). To carry out these tasks, two texture feature extraction matrices were implemented in Matlab, and the extracted features were classified with data mining algorithms in WEKA. Results revealed good accuracy for each class: 89.3 to 64.7 % for benign/malignant; 75.8 to 78.3 % for dense/fatty tissue; 71.0 to 83.1 % for finding identification. Among the classifiers tested, Naive Bayes was the best at identifying mass texture, and Random Forests was the first or second best classifier for the majority of tested groups.
id RCAP_9398d7296fec4be4cd65c30f1bb1348b
oai_identifier_str oai:recipp.ipp.pt:10400.22/9380
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Applying Data Mining Techniques to Improve Breast Cancer Diagnosis
author Diz, Joana
author_facet Diz, Joana
Marreiros, Goreti
Freitas, Alberto
author_role author
author2 Marreiros, Goreti
Freitas, Alberto
author2_role author
author
dc.contributor.none.fl_str_mv Repositório Científico do Instituto Politécnico do Porto
dc.contributor.author.fl_str_mv Diz, Joana
Marreiros, Goreti
Freitas, Alberto
dc.subject.por.fl_str_mv Breast cancer diagnosis
Features extraction
Data mining techniques
publishDate 2016
dc.date.none.fl_str_mv 2016-08
2016-08-01T00:00:00Z
2117-08-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.22/9380
url http://hdl.handle.net/10400.22/9380
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1007/s10916-016-0561-y
dc.rights.driver.fl_str_mv metadata only access
info:eu-repo/semantics/openAccess
rights_invalid_str_mv metadata only access
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Springer Verlag
publisher.none.fl_str_mv Springer Verlag
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1817552044453527552