Feature Selection For Genomic Data By Combining Filter And Wrapper Approaches
Main author: | Akadi, Ali El |
---|---|
Publication date: | 2009 |
Other authors: | Amine, Aouatif; El Ouardighi, Abdeljalil; Aboutajdine, Driss |
Document type: | Article |
Language: | eng |
Source title: | INFOCOMP: Jornal de Ciência da Computação |
Full text: | https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/279 |
Abstract: | Gene expression data usually contain a large number of genes but only a small number of samples. Feature selection for gene expression data aims to find a set of genes that best discriminates between biological samples of different types. In this paper, we propose a two-stage selection algorithm for genomic data that combines MRMR (Minimum Redundancy Maximum Relevance) and a GA (Genetic Algorithm). In the first stage, MRMR filters out noisy and redundant genes in high-dimensional microarray data. In the second stage, the GA uses classifier accuracy as its fitness function to select the most discriminating genes. The proposed method is tested on five open datasets (NCI, Lymphoma, Lung, Leukemia, and Colon) using Support Vector Machine and Naïve Bayes classifiers. A comparison of MRMR-GA with the MRMR filter and the GA wrapper alone shows that our method finds the smallest gene subset that gives the highest classification accuracy in leave-one-out cross-validation (LOOCV). |
id |
UFLA-5_043fa709260ceb2d88ff7a12679a5e9d |
---|---|
oai_identifier_str |
oai:infocomp.dcc.ufla.br:article/279 |
network_acronym_str |
UFLA-5 |
network_name_str |
INFOCOMP: Jornal de Ciência da Computação |
repository_id_str |
|
dc.title.none.fl_str_mv |
Feature Selection For Genomic Data By Combining Filter And Wrapper Approaches |
author |
Akadi, Ali El |
author_facet |
Akadi, Ali El; Amine, Aouatif; El Ouardighi, Abdeljalil; Aboutajdine, Driss |
author_role |
author |
author2 |
Amine, Aouatif; El Ouardighi, Abdeljalil; Aboutajdine, Driss |
author2_role |
author; author; author |
dc.contributor.author.fl_str_mv |
Akadi, Ali El; Amine, Aouatif; El Ouardighi, Abdeljalil; Aboutajdine, Driss |
dc.subject.por.fl_str_mv |
Feature selection; Genetic algorithm; MRMR; Support Vector Machine; Naïve Bayes classifier; LOOCV |
topic |
Feature selection; Genetic algorithm; MRMR; Support Vector Machine; Naïve Bayes classifier; LOOCV |
publishDate |
2009 |
dc.date.none.fl_str_mv |
2009-12-01 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/279 |
url |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/279 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://infocomp.dcc.ufla.br/index.php/infocomp/article/view/279/264 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2016 INFOCOMP Journal of Computer Science; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2016 INFOCOMP Journal of Computer Science |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Editora da UFLA |
publisher.none.fl_str_mv |
Editora da UFLA |
dc.source.none.fl_str_mv |
INFOCOMP Journal of Computer Science; Vol. 8 No. 4 (2009): December, 2009; 28-36; 1982-3363; 1807-4545; reponame:INFOCOMP: Jornal de Ciência da Computação; instname:Universidade Federal de Lavras (UFLA); instacron:UFLA |
instname_str |
Universidade Federal de Lavras (UFLA) |
instacron_str |
UFLA |
institution |
UFLA |
reponame_str |
INFOCOMP: Jornal de Ciência da Computação |
collection |
INFOCOMP: Jornal de Ciência da Computação |
repository.name.fl_str_mv |
INFOCOMP: Jornal de Ciência da Computação - Universidade Federal de Lavras (UFLA) |
repository.mail.fl_str_mv |
infocomp@dcc.ufla.br; apfreire@dcc.ufla.br |
_version_ |
1799874740909768704 |
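
The two-stage procedure summarized in this record's abstract (an MRMR filter followed by a GA wrapper scored by LOOCV accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: genes are binarized at their median before computing mutual information, MRMR uses the difference form (relevance minus mean redundancy), a nearest-centroid classifier stands in for the Support Vector Machine and Naïve Bayes classifiers, ties in accuracy are broken in favor of fewer genes, and all function names and GA parameters (`mrmr`, `ga_select`, `pop_size`, `p_mut`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())

def discretize(column):
    """Binarize one gene at its median (an illustrative assumption)."""
    median = sorted(column)[len(column) // 2]
    return [int(v >= median) for v in column]

def mrmr(X, y, k):
    """Stage 1: greedily pick k genes by relevance minus mean redundancy (k <= number of genes)."""
    cols = [discretize([row[j] for row in X]) for j in range(len(X[0]))]
    relevance = [mutual_information(c, y) for c in cols]
    selected = [max(range(len(cols)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            redundancy = sum(mutual_information(cols[j], cols[s]) for s in selected)
            return relevance[j] - redundancy / len(selected)
        candidates = [j for j in range(len(cols)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected

def loocv_accuracy(X, y, genes):
    """LOOCV accuracy of a nearest-centroid classifier restricted to `genes`."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, Counter()
        for r, (row, label) in enumerate(zip(X, y)):
            if r == i:  # hold out sample i
                continue
            acc = sums.setdefault(label, [0.0] * len(genes))
            for p, g in enumerate(genes):
                acc[p] += row[g]
            counts[label] += 1
        def dist(label):
            return sum((X[i][g] - sums[label][p] / counts[label]) ** 2
                       for p, g in enumerate(genes))
        correct += min(sums, key=dist) == y[i]
    return correct / len(X)

def ga_select(X, y, pool, generations=20, pop_size=20, p_mut=0.1, seed=0):
    """Stage 2: GA over bit masks of `pool` (needs len(pool) >= 2, pop_size >= 4)."""
    rng = random.Random(seed)
    def fitness(mask):
        genes = [g for g, bit in zip(pool, mask) if bit]
        # Accuracy first; fewer genes as a tie-breaker (an assumption of this sketch).
        return (loocv_accuracy(X, y, genes), -len(genes))
    population = [[rng.randint(0, 1) for _ in pool] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(pool))  # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        population = parents + children
    best = max(population, key=fitness)
    return [g for g, bit in zip(pool, best) if bit]
```

A typical call would first reduce the full gene set with `pool = mrmr(X, y, k)` and then refine it with `ga_select(X, y, pool)`, where `X` is a list of samples (each a list of expression values) and `y` is the list of class labels. Running the GA only on the MRMR-filtered pool keeps the wrapper's search space small, which is the motivation for combining the two stages.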