The NoiseFiltersR Package: Label Noise Preprocessing in R

Bibliographic details
Main author: Morales, Pablo
Publication date: 2017
Other authors: Luengo, Julian; Garcia, Luis P. F.; Lorena, Ana C. [UNIFESP]; de Carvalho, Andre C. P. L. F.; Herrera, Francisco
Document type: Article
Language: English
Source title: Repositório Institucional da UNIFESP
Full text: https://journal.r-project.org/archive/2017/RJ-2017-027/index.html
https://repositorio.unifesp.br/handle/11600/53705
Abstract: In Data Mining, the value of extracted knowledge is directly related to the quality of the data used. This makes data preprocessing one of the most important steps in the knowledge discovery process. A common problem affecting data quality is the presence of noise. A training set with label noise can reduce the predictive performance of classification learning techniques and increase the overfitting of classification models. In this work we present the NoiseFiltersR package. It contains the first extensive R implementation of classical and state-of-the-art label noise filters, which are the most common techniques for preprocessing label noise. The algorithms used for the implementation of the label noise filters are appropriately documented and referenced. They can be called in an R-user-friendly manner, and their results are unified by means of the "filter" class, which also benefits from adapted print and summary methods.
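Illustrative sketch: a label noise filter, as described in the abstract, preprocesses a training set by flagging instances whose class label looks wrong so they can be removed before a classifier is trained. The Python snippet below is a minimal sketch of one classical idea, a nearest-neighbour filter that flags an instance when the majority label among its k nearest neighbours disagrees with its own label. It is not the NoiseFiltersR implementation or its API; the function name `enn_filter`, the parameter `k`, and the toy data are assumptions made only for illustration.

```python
from collections import Counter
from math import dist


def enn_filter(points, labels, k=3):
    """Return indices of instances whose label disagrees with the
    majority label of their k nearest neighbours (hypothetical helper)."""
    flagged = []
    for i, (p, y) in enumerate(zip(points, labels)):
        # Distances from instance i to every other instance, nearest first.
        others = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbour_labels = [labels[j] for _, j in others[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority != y:
            flagged.append(i)
    return flagged


# Toy data: two well-separated clusters; instance 3 sits in cluster "a"
# but carries label "b", i.e. it simulates label noise.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
labels = ["a", "a", "a", "b", "b", "b", "b"]

noisy = enn_filter(points, labels, k=3)   # noisy == [3]
clean = [(p, y) for i, (p, y) in enumerate(zip(points, labels))
         if i not in noisy]
```

The sketch returns both the flagged indices and the cleaned data, which loosely mirrors the idea of a unified filter result mentioned in the abstract. The package's actual "filter" class and its print and summary methods are documented in the article itself.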
Funding: Spanish Research Project (TIN2014-57251-P); Andalusian Research Plan (P11-TIC-7765); CeMEAI-FAPESP (2013/07375-0); FAPESP (2012/22608-8, 2011/14602-7)
Affiliations: Univ Granada, Dept Comp Sci & Artificial Intelligence, E-18071 Granada, Spain; Univ Sao Paulo, Inst Ciencias Matemat & Comp, Trabalhador Sao Carlense Av 400, BR-13560970 Sao Carlos, SP, Brazil; Univ Fed Sao Paulo, Inst Ciencia & Tecnol, Talim St 330, BR-12231280 Sao Jose Dos Campos, SP, Brazil
Source: Web of Science
Date accessioned: 2020-06-26T16:30:42Z
Date available: 2020-06-26T16:30:42Z
Status: published version
Citation: R Journal. Wien, v. 9, n. 1, p. 219-228, 2017.
ISSN: 2073-4859
File: WOS000404756200015.pdf
Web of Science ID: WOS:000404756200015
Journal: R Journal
Pages: 219-228
Place of publication: Wien
Publisher: R Foundation Statistical Computing
Access rights: open access
Institution: Universidade Federal de São Paulo (UNIFESP)