Feature Engineering: Techniques and Applications
Main author: | Teixeira, Mariana |
---|---|
Publication date: | 2023 |
Other authors: | Cavique, Luís |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://doi.org/10.34627/rcc.v18i0.295 |
Abstract: | Machine Learning is increasingly prominent in today's society. Over the past decade, ML-based systems have become part of people's daily routines, and their use has spread across diverse sectors. This evolution is supported by the exponential increase in data created worldwide. Feature Engineering is a critical process that transforms data into suitable inputs for Machine Learning algorithms. This work explores the Feature Engineering process and develops a baseline for its implementation. To that end, it proposes a pipeline of Feature Engineering techniques and their taxonomy, along with a set of R scripts that implement them. The validity of the code is then demonstrated by applying it to a real-world dataset. |
id |
RCAP_43a8ec602f071f5561fd7164e1f7da3e |
---|---|
oai_identifier_str |
oai:ojs2.journals.uab.pt:article/295 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Feature Engineering: Techniques and Applications Feature Engineering: Técnicas e Aplicações |
author |
Teixeira, Mariana |
author_facet |
Teixeira, Mariana Cavique, Luís |
author_role |
author |
author2 |
Cavique, Luís |
author2_role |
author |
dc.date.none.fl_str_mv |
2023-12-18 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
dc.identifier.uri.fl_str_mv |
https://doi.org/10.34627/rcc.v18i0.295 |
dc.language.iso.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
https://journals.uab.pt/index.php/rcc/article/view/295 https://journals.uab.pt/index.php/rcc/article/view/295/251 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2023 Universidade Aberta http://creativecommons.org/licenses/by/4.0 info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Universidade Aberta |
dc.source.none.fl_str_mv |
Revista de Ciências da Computação; v. 18 (2023); 43-54; 2182-1801; 1646-6330; 10.34627/rcc.v18i0; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |