Wrappers Feature Selection in Alzheimer’s Biomarkers Using kNN and SMOTE Oversampling
Main author: | RODRIGUES,Y.E. |
---|---|
Publication date: | 2017 |
Other authors: | MANICA,E.; ZIMMER,E.R.; PASCOAL,T.A.; MATHOTAARACHCHI,S.S.; ROSA-NETO,P. |
Document type: | Article |
Language: | eng |
Source title: | TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) |
Full text: | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015 |
Abstract: | Biomarkers are characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathogenic processes or pharmacological responses to a therapeutic intervention. The combination of different biomarker modalities often allows an accurate diagnostic classification. In Alzheimer’s disease (AD), biomarkers are indispensable for identifying cognitively normal individuals destined to develop dementia symptoms. However, studies using combinations of canonical AD biomarkers have repeatedly shown poor classification rates when differentiating between AD, mild cognitive impairment and control individuals. Furthermore, the design of classifiers to assess multiple biomarker combinations involves issues such as imbalanced classes and missing data. Because of the large number of biomarker combinations, wrappers are used to avoid multiple comparisons. Here, we compare the ability of three wrapper feature selection methods to obtain biomarker combinations that maximize classification rates. As the criterion for the wrapper feature selection, we use the k-nearest neighbor classifier with balancing aids: random undersampling and SMOTE oversampling. Overall, our analyses show how biomarker combinations affect classifier precision and how imbalance strategies improve it. We show that non-defining and non-cognitive biomarkers have lower precision than cognitive measures when classifying AD. Our approach surpasses, on average, the support vector machine and the weighted k-nearest neighbor classifiers and reaches 94.34 ± 3.91% precision in reproducing class definitions. |
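The abstract describes wrapper feature selection scored by a k-nearest neighbor classifier, with SMOTE oversampling used to balance the classes. The following is a minimal sketch of that idea, not the authors' implementation. It assumes sequential forward selection as the wrapper (the abstract does not name the three wrappers compared), pooled cross-validated accuracy as the criterion, and synthetic data in place of real biomarkers. All function names (`knn_predict`, `smote`, `cv_score`, `forward_selection`) are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, feats, k=3):
    """Majority vote among the k nearest training points, using only `feats`."""
    def dist(a):
        return sum((a[f] - x[f]) ** 2 for f in feats)
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def smote(X, y, k=3, rng=None):
    """SMOTE-style oversampling: interpolate between a minority sample and one
    of its k nearest same-class neighbours until the class reaches the size of
    the largest class. Classes with fewer than two samples are left unchanged."""
    rng = rng or random.Random(0)
    counts = Counter(y)
    target = max(counts.values())
    new_X, new_y = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # nothing to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(m, base)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            new_X.append([a + gap * (b - a) for a, b in zip(base, other)])
            new_y.append(label)
    return new_X, new_y


def cv_score(X, y, feats, folds=5, k=3):
    """Pooled cross-validated accuracy; only the training folds are oversampled."""
    correct = 0
    for f in range(folds):
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        tr_X, tr_y = smote([X[i] for i in train_idx], [y[i] for i in train_idx], k)
        correct += sum(knn_predict(tr_X, tr_y, X[i], feats, k) == y[i] for i in test_idx)
    return correct / len(X)


def forward_selection(X, y, n_features):
    """Greedy wrapper: keep adding the feature that most improves the score."""
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(cv_score(X, y, selected + [f]), f) for f in candidates]
        score, feat = max(scored, key=lambda s: s[0])
        if score <= best:
            break  # no candidate improves the criterion
        selected.append(feat)
        best = score
    return selected, best


# Synthetic, imbalanced data (an assumption for illustration only):
# feature 0 separates the classes, features 1 and 2 are noise.
rng = random.Random(1)
X = [[rng.gauss(0, 0.5), rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
X += [[rng.gauss(5, 0.5), rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(10)]
y = [0] * 40 + [1] * 10

selected, best = forward_selection(X, y, n_features=3)
print("selected features:", selected, "score:", best)
```

Oversampling inside each training fold, rather than before splitting, keeps synthetic points derived from test samples out of the training set. In practice, libraries such as scikit-learn and imbalanced-learn provide tested versions of these components.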
id | SBMAC-1_54588509286cda39bb93928271774800 |
---|---|
oai_identifier_str | oai:scielo:S2179-84512017000100015 |
network_acronym_str | SBMAC-1 |
network_name_str | TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) |
repository_id_str | |
author | RODRIGUES,Y.E. |
author_facet | RODRIGUES,Y.E. MANICA,E. ZIMMER,E.R. PASCOAL,T.A. MATHOTAARACHCHI,S.S. ROSA-NETO,P. |
author_role | author |
author2 | MANICA,E. ZIMMER,E.R. PASCOAL,T.A. MATHOTAARACHCHI,S.S. ROSA-NETO,P. |
author2_role | author author author author author |
dc.contributor.author.fl_str_mv | RODRIGUES,Y.E. MANICA,E. ZIMMER,E.R. PASCOAL,T.A. MATHOTAARACHCHI,S.S. ROSA-NETO,P. |
dc.subject.por.fl_str_mv | k-nearest neighbor; SMOTE; feature selection; Alzheimer’s biomarkers; Alzheimer’s disease classification |
topic | k-nearest neighbor; SMOTE; feature selection; Alzheimer’s biomarkers; Alzheimer’s disease classification |
publishDate | 2017 |
dc.date.none.fl_str_mv | 2017-04-01 |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015 |
url | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | 10.5540/tema.2017.018.01.0015 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | text/html |
dc.publisher.none.fl_str_mv | Sociedade Brasileira de Matemática Aplicada e Computacional |
publisher.none.fl_str_mv | Sociedade Brasileira de Matemática Aplicada e Computacional |
dc.source.none.fl_str_mv | TEMA (São Carlos) v.18 n.1 2017 reponame:TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) instname:Sociedade Brasileira de Matemática Aplicada e Computacional instacron:SBMAC |
instname_str | Sociedade Brasileira de Matemática Aplicada e Computacional |
instacron_str | SBMAC |
institution | SBMAC |
reponame_str | TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) |
collection | TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) |
repository.name.fl_str_mv | TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) - Sociedade Brasileira de Matemática Aplicada e Computacional |
repository.mail.fl_str_mv | castelo@icmc.usp.br |
_version_ | 1752122220197445632 |