Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing
Lead author: | Matos, Luís Miguel |
---|---|
Publication date: | 2022 |
Other authors: | Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://hdl.handle.net/1822/81434 |
Abstract: | Categorical Attribute traNsformation Environment (CANE) is a simple but powerful Python package for categorical data preprocessing. The package is valuable because many Machine Learning (ML) algorithms can only be trained on numerical data (e.g., Deep Learning, Support Vector Machines), while several real-world ML applications involve categorical data attributes. Currently, CANE offers three categorical-to-numeric transformation methods: Percentage Categorical Pruned (PCP), Inverse Document Frequency (IDF) and a simpler One-Hot-Encoding method. Additionally, the CANE module is well documented, with several code examples that can help its adoption by non-expert users. |
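The three transformations named in the abstract can be illustrated with a minimal pandas sketch. This is not CANE's own API: the function names `pcp_prune`, `idf_encode` and `one_hot_encode`, the `keep_fraction` parameter and the `"Others"` label are hypothetical, and the sketch assumes the usual definitions (PCP keeps the most frequent levels up to a coverage threshold and merges the rest into one level; IDF maps a level with frequency f in n rows to ln(n / f)).

```python
import math

import pandas as pd


def pcp_prune(column: pd.Series, keep_fraction: float = 0.9,
              other: str = "Others") -> pd.Series:
    """Keep the most frequent levels until `keep_fraction` of the rows is
    covered and merge every remaining level into `other` (assumed PCP rule)."""
    shares = column.value_counts(normalize=True)  # sorted, most frequent first
    covered_before = shares.cumsum() - shares     # coverage reached before each level
    kept = shares.index[covered_before < keep_fraction]
    # Rows whose level is not kept (missing values included) become `other`.
    return column.where(column.isin(kept), other)


def idf_encode(column: pd.Series) -> pd.Series:
    """Map each level to ln(n / f_level), so rarer levels get larger values."""
    counts = column.value_counts()
    n = len(column)
    return column.map(lambda level: math.log(n / counts[level]))


def one_hot_encode(column: pd.Series) -> pd.DataFrame:
    """One binary column per level, named <column name>_<level>."""
    return pd.get_dummies(column, prefix=column.name)


# Toy data: 6 x "red", 3 x "blue", 1 x "green".
colors = pd.Series(["red"] * 6 + ["blue"] * 3 + ["green"], name="color")

pruned = pcp_prune(colors, keep_fraction=0.8)  # "green" is merged into "Others"
idf = idf_encode(colors)                       # a single numeric column
onehot = one_hot_encode(colors)                # three binary columns
```

The sketch also shows the trade-off between the methods: one-hot encoding adds one column per level, IDF always produces a single numeric column, and PCP reduces the number of levels before any further encoding is applied.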
id |
RCAP_698376a86f20dbf9cf3131a8035a2d32 |
---|---|
oai_identifier_str |
oai:repositorium.sdum.uminho.pt:1822/81434 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology Categorical Attribute traNsformation Environment (CANE) is a simple but powerful Python package for categorical data preprocessing. The package is valuable because many Machine Learning (ML) algorithms can only be trained on numerical data (e.g., Deep Learning, Support Vector Machines), while several real-world ML applications involve categorical data attributes. Currently, CANE offers three categorical-to-numeric transformation methods: Percentage Categorical Pruned (PCP), Inverse Document Frequency (IDF) and a simpler One-Hot-Encoding method. Additionally, the CANE module is well documented, with several code examples that can help its adoption by non-expert users. The authors are grateful for project NORTE-01-0247-FEDER-017497, supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). This work was also supported by FCT - Fundação para a Ciência e Tecnologia, Portugal, within the Project Scope: UID/CEC/00319/2019. The authors are also grateful to all the contributors that assisted in making CANE more intuitive. Elsevier Universidade do Minho Matos, Luís Miguel; Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui 2022-08-01 2022-08-01T00:00:00Z info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article application/pdf https://hdl.handle.net/1822/81434 eng Matos, L. M., Azevedo, J., Matta, A., Pilastri, A., Cortez, P., & Mendes, R. (2022, August). Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing. Software Impacts. Elsevier BV. http://doi.org/10.1016/j.simpa.2022.100359 2665-9638 10.1016/j.simpa.2022.100359 https://www.sciencedirect.com/science/article/pii/S2665963822000720 info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2023-07-21T12:48:43Z oai:repositorium.sdum.uminho.pt:1822/81434 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T19:47:01.871126 Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing |
author |
Matos, Luís Miguel |
author_facet |
Matos, Luís Miguel; Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
author_role |
author |
author2 |
Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
author2_role |
author author author author author |
dc.contributor.none.fl_str_mv |
Universidade do Minho |
dc.contributor.author.fl_str_mv |
Matos, Luís Miguel; Azevedo, João; Matta, Arthur; Pilastri, André; Cortez, Paulo; Mendes, Rui |
dc.subject.por.fl_str_mv |
CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology |
topic |
CANE; Data preprocessing; Machine learning; Python programming language; Science & Technology |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-08-01 2022-08-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1822/81434 |
url |
https://hdl.handle.net/1822/81434 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Matos, L. M., Azevedo, J., Matta, A., Pilastri, A., Cortez, P., & Mendes, R. (2022, August). Categorical Attribute traNsformation Environment (CANE): A python module for categorical to numeric data preprocessing. Software Impacts. Elsevier BV. http://doi.org/10.1016/j.simpa.2022.100359 2665-9638 10.1016/j.simpa.2022.100359 https://www.sciencedirect.com/science/article/pii/S2665963822000720 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799133042258739200 |