Wrappers feature selection in Alzheimer’s biomarkers using kNN and SMOTE oversampling

Bibliographic details
Main author: Rodrigues, Yuri Elias
Publication date: 2017
Other authors: Manica, Evandro; Zimmer, Eduardo Rigon; Pascoal, Tharick Ali; Mathotaarachchi, Sulantha Sanjeewa; Rosa Neto, Pedro
Document type: Article
Language: eng
Source title: Repositório Institucional da UFRGS
Full text: http://hdl.handle.net/10183/163334
Abstract: Biomarkers are characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention. Combining different biomarker modalities often allows an accurate diagnostic classification. In Alzheimer’s disease (AD), biomarkers are indispensable for identifying cognitively normal individuals destined to develop dementia symptoms. However, using combinations of canonical AD biomarkers, studies have repeatedly shown poor classification rates when differentiating between AD, mild cognitive impairment, and control individuals. Furthermore, designing classifiers to assess multiple biomarker combinations raises issues such as imbalanced classes and missing data. Because of the number of possible biomarker combinations, wrappers are used to avoid multiple comparisons. Here, we compare the ability of three wrapper feature selection methods to obtain biomarker combinations that maximize classification rates. As the selection criterion for the wrappers, we use the k-nearest neighbor classifier with balancing aids: random undersampling and SMOTE oversampling. Overall, our analyses show how biomarker combinations affect classifier precision and how the imbalance strategy improves it. We show that non-defining and non-cognitive biomarkers have less precision than cognitive measures when classifying AD. On average, our approach surpasses the support vector machine and the weighted k-nearest neighbor classifiers and reaches a precision of 94.34 ± 3.91% in reproducing class definitions.
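The pipeline the abstract describes (oversample the minority class, then let a wrapper search for the feature subset that a k-nearest neighbor classifier scores best) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not name its three wrappers, so greedy sequential forward selection stands in for them; the score is plain held-out accuracy rather than the precision measure reported in the article; the data are synthetic; and the function names (`smote`, `knn_predict`, `forward_selection`) are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b, feats):
    """Squared Euclidean distance restricted to the feature indices in feats."""
    return sum((a[f] - b[f]) ** 2 for f in feats)


def knn_predict(train_X, train_y, x, feats, k=3):
    """Majority vote among the k training rows nearest to x."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: sq_dist(pair[0], x, feats))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def smote(X, y, minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: each synthetic row is a random point on the
    segment between a minority row and one of its k nearest minority rows."""
    rng = random.Random(seed)
    all_feats = range(len(X[0]))
    pts = [row for row, label in zip(X, y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(pts)
        others = [p for p in pts if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p, all_feats))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return X + synthetic, y + [minority] * n_new


def accuracy(train_X, train_y, test_X, test_y, feats, k=3):
    """Fraction of held-out rows that kNN labels correctly using feats."""
    hits = sum(knn_predict(train_X, train_y, x, feats, k) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_y)


def forward_selection(train_X, train_y, test_X, test_y, k=3):
    """Wrapper: greedily add the feature that most improves the kNN score,
    stopping as soon as no candidate improves it."""
    remaining = set(range(len(train_X[0])))
    chosen, best = [], 0.0
    while remaining:
        # (score, -index) so that ties go to the lowest feature index.
        score, neg_f = max(
            (accuracy(train_X, train_y, test_X, test_y, chosen + [f], k), -f)
            for f in remaining
        )
        if score <= best:
            break
        chosen.append(-neg_f)
        remaining.remove(-neg_f)
        best = score
    return chosen, best


# Synthetic toy data (an assumption): feature 0 separates the classes,
# features 1 and 2 are pure noise; class 1 is the minority (8 rows vs 40).
rng = random.Random(42)


def sample(label, n):
    centre = 5.0 * label
    return [[rng.gauss(centre, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)]
            for _ in range(n)]


train_X = sample(0, 40) + sample(1, 8)
train_y = [0] * 40 + [1] * 8
test_X = sample(0, 10) + sample(1, 10)
test_y = [0] * 10 + [1] * 10

# Oversample only the training split, then run the wrapper.
X_bal, y_bal = smote(train_X, train_y, minority=1, n_new=32)
chosen, best = forward_selection(X_bal, y_bal, test_X, test_y)
print("selected features:", chosen, "held-out accuracy:", best)
```

Oversampling is applied to the training split only, so the held-out rows used to score each candidate subset never include synthetic points. A real study would replace the single held-out split with cross-validation and would also need to handle missing data, which this sketch ignores.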
Subjects: Mathematical modeling; k-nearest neighbor; SMOTE; Feature selection; Alzheimer’s biomarkers; Alzheimer’s disease classification
Date accessioned: 2017-06-22T02:42:59Z
ISSN: 1677-1966
NRB: 001022889
Published in: TEMA : tendências em matemática aplicada e computacional. São Carlos. Vol. 18, no. 1 (2017), p. 15-34
Status: published version
Rights: open access
Format: application/pdf
Institution: Universidade Federal do Rio Grande do Sul (UFRGS)
Files (MD5 checksum):
http://www.lume.ufrgs.br/bitstream/10183/163334/1/001022889.pdf (9b4805bd98bf3f171bc11b9d902f536d)
http://www.lume.ufrgs.br/bitstream/10183/163334/2/001022889.pdf.txt (dea6bf77a2b9ac802e7c101abc6f0c1e)
http://www.lume.ufrgs.br/bitstream/10183/163334/3/001022889.pdf.jpg (c8751d8282c17eb89141463c7a47a113)
Repository contact: lume@ufrgs.br