A biobjective feature selection algorithm for large omics datasets
Main author: | Cavique, Luís |
---|---|
Publication date: | 2018 |
Other authors: | Mendes, Armando B.; Martiniano, Hugo F.M.C.; Correia, Luís |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10400.18/6335 |
Abstract: | Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency‐based methods are a significant category of feature selection research that substantially improves the comprehensibility of the result using the parsimony principle. In this work, the biobjective version of the algorithm logical analysis of inconsistent data is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, heuristic decomposition uses parallel processing to solve a set covering problem and a cross‐validation technique. The biobjective solutions contain the number of reduced features and the accuracy. The algorithm is applied to omics datasets with genome‐like characteristics of patients with rare diseases. |
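
The abstract describes consistency-based feature selection solved as a set covering problem. As a rough illustration only, the sketch below shows one common way to phrase that idea: every pair of observations with different class labels must differ in at least one selected feature, and a greedy heuristic picks features until all such pairs are covered. The function name `greedy_consistent_features`, the toy data, and the choice to skip inconsistent pairs are assumptions made for this example; this is not the authors' implementation, and it omits the decomposition, parallel processing, and cross-validation mentioned in the abstract.

```python
from itertools import combinations


def greedy_consistent_features(rows, labels):
    """Greedy set cover over pairs of observations with different labels.

    Hypothetical helper for illustration: returns a sorted list of feature
    indices such that every distinguishable pair of differently labelled
    observations differs in at least one selected feature.
    """
    n_features = len(rows[0])

    # One constraint per pair of observations from different classes:
    # the set of features on which the two observations disagree.
    constraints = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = {f for f in range(n_features) if rows[i][f] != rows[j][f]}
            if diff:
                constraints.append(diff)
            # An empty diff means identical rows with different labels
            # (inconsistent data); this sketch simply skips such pairs.

    selected = []
    while constraints:
        # Pick the feature that covers the most remaining constraints
        # (ties go to the lowest feature index).
        best = max(range(n_features),
                   key=lambda f: sum(f in c for c in constraints))
        selected.append(best)
        constraints = [c for c in constraints if best not in c]
    return sorted(selected)


# Toy data (assumed): feature 0 alone separates the two classes.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [0, 0, 1, 1]
print(greedy_consistent_features(rows, labels))  # → [0]
```

The size of the returned subset corresponds to the "number of reduced features" objective; the second objective, accuracy, would have to be estimated separately, for example by cross-validating a classifier on the selected features.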
id | RCAP_6136567c2b5be2a114343d5c6f06116a |
---|---|
oai_identifier_str | oai:repositorio.insa.pt:10400.18/6335 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str | 7160 |
spelling | A biobjective feature selection algorithm for large omics datasets Biobjective Optimization Feature Selection Heuristic Decomposition Logical Analysis of Data Rare Diseases. Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency‐based methods are a significant category of feature selection research that substantially improves the comprehensibility of the result using the parsimony principle. In this work, the biobjective version of the algorithm logical analysis of inconsistent data is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, heuristic decomposition uses parallel processing to solve a set covering problem and a cross‐validation technique. The biobjective solutions contain the number of reduced features and the accuracy. The algorithm is applied to omics datasets with genome‐like characteristics of patients with rare diseases. This work used the EGI, European Grid Infrastructure, with the support of the IBERGRID, Iberian Grid Infrastructure, and INCD (Portugal); NCG‐INGRID‐PT; FCT, Grant/Award Number: UID/Multi/04046/2013 Expert Systems Repositório Científico do Instituto Nacional de Saúde Cavique, Luís Mendes, Armando B. Martiniano, Hugo F.M.C. Correia, Luís 2019-03-28T15:58:23Z 2018-06-19 2018-06-19T00:00:00Z info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article application/pdf http://hdl.handle.net/10400.18/6335 eng Expert Systems. 2018;35(4):e12301.doi:10.1111/exsy.12301 0266-4720 10.1111/exsy.12301 info:eu-repo/semantics/embargoedAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2023-07-20T15:41:21Z oai:repositorio.insa.pt:10400.18/6335 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T18:40:59.165290 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv | A biobjective feature selection algorithm for large omics datasets |
title | A biobjective feature selection algorithm for large omics datasets |
spellingShingle | A biobjective feature selection algorithm for large omics datasets; Cavique, Luís; Biobjective Optimization; Feature Selection; Heuristic Decomposition; Logical Analysis of Data; Rare Diseases |
title_short | A biobjective feature selection algorithm for large omics datasets |
title_full | A biobjective feature selection algorithm for large omics datasets |
title_fullStr | A biobjective feature selection algorithm for large omics datasets |
title_full_unstemmed | A biobjective feature selection algorithm for large omics datasets |
title_sort | A biobjective feature selection algorithm for large omics datasets |
author | Cavique, Luís |
author_facet | Cavique, Luís; Mendes, Armando B.; Martiniano, Hugo F.M.C.; Correia, Luís |
author_role | author |
author2 | Mendes, Armando B.; Martiniano, Hugo F.M.C.; Correia, Luís |
author2_role | author; author; author |
dc.contributor.none.fl_str_mv | Repositório Científico do Instituto Nacional de Saúde |
dc.contributor.author.fl_str_mv | Cavique, Luís; Mendes, Armando B.; Martiniano, Hugo F.M.C.; Correia, Luís |
dc.subject.por.fl_str_mv | Biobjective Optimization; Feature Selection; Heuristic Decomposition; Logical Analysis of Data; Rare Diseases |
topic | Biobjective Optimization; Feature Selection; Heuristic Decomposition; Logical Analysis of Data; Rare Diseases |
description | Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency‐based methods are a significant category of feature selection research that substantially improves the comprehensibility of the result using the parsimony principle. In this work, the biobjective version of the algorithm logical analysis of inconsistent data is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, heuristic decomposition uses parallel processing to solve a set covering problem and a cross‐validation technique. The biobjective solutions contain the number of reduced features and the accuracy. The algorithm is applied to omics datasets with genome‐like characteristics of patients with rare diseases. |
publishDate | 2018 |
dc.date.none.fl_str_mv | 2018-06-19 2018-06-19T00:00:00Z 2019-03-28T15:58:23Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://hdl.handle.net/10400.18/6335 |
url | http://hdl.handle.net/10400.18/6335 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | Expert Systems. 2018;35(4):e12301.doi:10.1111/exsy.12301 0266-4720 10.1111/exsy.12301 |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/embargoedAccess |
eu_rights_str_mv | embargoedAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | Expert Systems |
publisher.none.fl_str_mv | Expert Systems |
dc.source.none.fl_str_mv | reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
institution | RCAAP |
reponame_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799132153058951168 |