Wrappers Feature Selection in Alzheimer’s Biomarkers Using kNN and SMOTE Oversampling

RODRIGUES,Y.E.; MANICA,E.; ZIMMER,E.R.; PASCOAL,T.A.; MATHOTAARACHCHI,S.S.; ROSA-NETO,P.

Bibliographic details
Main author:	RODRIGUES,Y.E.
Publication date:	2017
Other authors:	MANICA,E., ZIMMER,E.R., PASCOAL,T.A., MATHOTAARACHCHI,S.S., ROSA-NETO,P.
Document type:	Article
Language:	eng
Source title:	TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
Full text:	http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015
Abstract:	Biomarkers are characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention. Combining different biomarker modalities often allows accurate diagnostic classification. In Alzheimer’s disease (AD), biomarkers are indispensable for identifying cognitively normal individuals destined to develop dementia symptoms. However, studies using combinations of canonical AD biomarkers have repeatedly shown poor classification rates when differentiating between AD, mild cognitive impairment, and control individuals. Furthermore, designing classifiers to assess multiple biomarker combinations raises issues such as imbalanced classes and missing data. Because of the number of biomarker combinations, wrappers are used to avoid multiple comparisons. Here, we compare the ability of three wrapper feature selection methods to obtain biomarker combinations that maximize classification rates. As the criterion for wrapper feature selection, we use the k-nearest neighbor classifier with two balancing aids: random undersampling and SMOTE oversampling. Overall, our analyses show how biomarker combinations affect classifier precision and how the imbalance strategy improves it. We show that non-defining and non-cognitive biomarkers have less precision than cognitive measures when classifying AD. Our approach surpasses, on average, the support vector machine and weighted k-nearest neighbor classifiers and reaches a precision of 94.34 ± 3.91% in reproducing class definitions.

Item metadata

id	SBMAC-1_54588509286cda39bb93928271774800
oai_identifier_str	oai:scielo:S2179-84512017000100015
network_acronym_str	SBMAC-1
network_name_str	TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
repository_id_str
spelling	Wrappers Feature Selection in Alzheimer’s Biomarkers Using kNN and SMOTE Oversampling; k-nearest neighbor; SMOTE; feature selection; Alzheimer’s biomarkers; Alzheimer’s disease classification; Sociedade Brasileira de Matemática Aplicada e Computacional; 2017-04-01; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; text/html; http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015; TEMA (São Carlos) v.18 n.1 2017; reponame:TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online); instname:Sociedade Brasileira de Matemática Aplicada e Computacional; instacron:SBMAC; 10.5540/tema.2017.018.01.0015; info:eu-repo/semantics/openAccess; RODRIGUES,Y.E.; MANICA,E.; ZIMMER,E.R.; PASCOAL,T.A.; MATHOTAARACHCHI,S.S.; ROSA-NETO,P.; eng; 2017-06-12T00:00:00Z; oai:scielo:S2179-84512017000100015; Revista; http://www.scielo.br/tema; PUB; https://old.scielo.br/oai/scielo-oai.php; castelo@icmc.usp.br; 2179-8451; 1677-1966; opendoar:2017-06-12T00:00; TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) - Sociedade Brasileira de Matemática Aplicada e Computacional; false
dc.title.none.fl_str_mv	Wrappers Feature Selection in Alzheimer’s Biomarkers Using kNN and SMOTE Oversampling
author	RODRIGUES,Y.E.
author_facet	RODRIGUES,Y.E. MANICA,E. ZIMMER,E.R. PASCOAL,T.A. MATHOTAARACHCHI,S.S. ROSA-NETO,P.
author_role	author
author2	MANICA,E. ZIMMER,E.R. PASCOAL,T.A. MATHOTAARACHCHI,S.S. ROSA-NETO,P.
author2_role	author author author author author
dc.contributor.author.fl_str_mv	RODRIGUES,Y.E. MANICA,E. ZIMMER,E.R. PASCOAL,T.A. MATHOTAARACHCHI,S.S. ROSA-NETO,P.
dc.subject.por.fl_str_mv	k-nearest neighbor; SMOTE; feature selection; Alzheimer’s biomarkers; Alzheimer’s disease classification
topic	k-nearest neighbor; SMOTE; feature selection; Alzheimer’s biomarkers; Alzheimer’s disease classification
publishDate	2017
dc.date.none.fl_str_mv	2017-04-01
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015
url	http://old.scielo.br/scielo.php?script=sci_arttext&pid=S2179-84512017000100015
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	10.5540/tema.2017.018.01.0015
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	text/html
dc.publisher.none.fl_str_mv	Sociedade Brasileira de Matemática Aplicada e Computacional
publisher.none.fl_str_mv	Sociedade Brasileira de Matemática Aplicada e Computacional
dc.source.none.fl_str_mv	TEMA (São Carlos) v.18 n.1 2017 reponame:TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) instname:Sociedade Brasileira de Matemática Aplicada e Computacional instacron:SBMAC
instname_str	Sociedade Brasileira de Matemática Aplicada e Computacional
instacron_str	SBMAC
institution	SBMAC
reponame_str	TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
collection	TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online)
repository.name.fl_str_mv	TEMA (Sociedade Brasileira de Matemática Aplicada e Computacional. Online) - Sociedade Brasileira de Matemática Aplicada e Computacional
repository.mail.fl_str_mv	castelo@icmc.usp.br
_version_	1752122220197445632
