Wrappers Feature Selection in Alzheimer’s Biomarkers Using kNN and SMOTE Oversampling

Bibliographic details
Main author: RODRIGUES,Y.E.
Publication date: 2017
Other authors: MANICA,E., ZIMMER,E.R., PASCOAL,T.A., MATHOTAARACHCHI,S.S., ROSA-NETO,P.
Document type: Article
Language: eng
Source title: TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
Full text: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015
Abstract: Biomarkers are characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathogenic processes or pharmacological responses to a therapeutic intervention. Combining different biomarker modalities often allows an accurate diagnostic classification. In Alzheimer’s disease (AD), biomarkers are indispensable for identifying cognitively normal individuals destined to develop dementia symptoms. However, studies using combinations of canonical AD biomarkers have repeatedly shown poor classification rates when differentiating between AD, mild cognitive impairment and control individuals. Furthermore, designing classifiers to assess multiple biomarker combinations raises issues such as imbalanced classes and missing data. Because of the number of possible biomarker combinations, wrappers are used to avoid multiple comparisons. Here, we compare the ability of three wrapper feature selection methods to obtain biomarker combinations that maximize classification rates. As the selection criterion for the wrappers, we use the k-nearest neighbor classifier with balancing aids: random undersampling and SMOTE oversampling. Overall, our analyses show how biomarker combinations affect classifier precision and how imbalance strategies improve it. We show that non-defining and non-cognitive biomarkers yield lower precision than cognitive measures when classifying AD. On average, our approach surpasses the support vector machine and the weighted k-nearest neighbor classifiers, and it reaches a precision of 94.34 ± 3.91% in reproducing the class definitions.
id SBMAC-1_54588509286cda39bb93928271774800
oai_identifier_str oai:scielo:S2179-84512017000100015
network_acronym_str SBMAC-1
network_name_str TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
issn 2179-8451
1677-1966
dc.contributor.author.fl_str_mv RODRIGUES,Y.E.
MANICA,E.
ZIMMER,E.R.
PASCOAL,T.A.
MATHOTAARACHCHI,S.S.
ROSA-NETO,P.
dc.subject.por.fl_str_mv k-nearest neighbor
SMOTE
feature selection
Alzheimer’s biomarkers
Alzheimer’s disease classification
dc.date.none.fl_str_mv 2017-04-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.identifier.uri.fl_str_mv http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv 10.5540/tema.2017.018.01.0015
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv text/html
dc.publisher.none.fl_str_mv Sociedade Brasileira de Matemática Aplicada e Computacional
dc.source.none.fl_str_mv TEMA (São Carlos) v.18 n.1 2017
reponame:TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
instname:Sociedade Brasileira de Matemática Aplicada e Computacional
instacron:SBMAC
repository.name.fl_str_mv TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) - Sociedade Brasileira de Matemática Aplicada e Computacional
repository.mail.fl_str_mv castelo@icmc.usp.br