Applying Data Mining Techniques to Improve Breast Cancer Diagnosis
Main author: | Diz, Joana |
---|---|
Publication date: | 2016 |
Other authors: | Marreiros, Goreti; Freitas, Alberto |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.22/9380 |
Abstract: | In breast cancer research, computer-aided diagnosis systems are increasingly being developed to reduce false positives in diagnostic tests. This work presents a data mining approach that might support oncologists in breast cancer classification and diagnosis. The study compares two breast cancer datasets and seeks the best methods for predicting benign/malignant lesions, classifying breast density, and identifying the type of finding (mass / microcalcification distinction). To carry out these tasks, two texture feature extraction matrices were implemented in Matlab, and the resulting features were classified with data mining algorithms in WEKA. Results showed good accuracy for each classification task: 89.3 to 64.7 % for benign/malignant; 75.8 to 78.3 % for dense/fatty tissue; 71.0 to 83.1 % for finding identification. Among the classifiers tested, Naive Bayes was the best at identifying mass texture, and Random Forests was the best or second-best classifier for the majority of tested groups. |
id |
RCAP_9398d7296fec4be4cd65c30f1bb1348b |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/9380 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Applying Data Mining Techniques to Improve Breast Cancer Diagnosis |
author |
Diz, Joana |
author_role |
author |
author2 |
Marreiros, Goreti; Freitas, Alberto |
author2_role |
author author |
dc.contributor.none.fl_str_mv |
Repositório Científico do Instituto Politécnico do Porto |
dc.subject.por.fl_str_mv |
Breast cancer diagnosis; Features extraction; Data mining techniques |
publishDate |
2016 |
dc.date.none.fl_str_mv |
2016-08 2016-08-01T00:00:00Z 2117-08-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/9380 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
10.1007/s10916-016-0561-y |
dc.rights.driver.fl_str_mv |
metadata only access info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
metadata only access |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Springer Verlag |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |