Evaluation of Multiclass Novelty Detection Algorithms for Data Streams
Main author: | de Faria,ER |
---|---|
Publication date: | 2015 |
Other authors: | Goncalves,IR; João Gama; de Leon Ferreira Carvalho,ACPDF |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://repositorio.inesctec.pt/handle/123456789/5312 http://dx.doi.org/10.1109/tkde.2015.2441713 |
Abstract: | Data stream mining is an emergent research area that investigates knowledge extraction from large amounts of continuously generated data produced by non-stationary distributions. Novelty detection, the ability to identify new or previously unknown situations, is useful for learning systems, especially when dealing with data streams, where concepts may appear, disappear, or evolve over time. Several studies currently investigate the application of novelty detection techniques to data streams. However, there is no consensus on how to evaluate the performance of these techniques. In this study, we propose a new evaluation methodology for multiclass novelty detection in data streams that is able to deal with: i) unsupervised learning, which generates novelty patterns without an association with the true classes, where one class may be composed of a novelty set, ii) a confusion matrix that grows over time, iii) a confusion matrix with a column representing unknown examples, i.e., those not explained by the model, and iv) the representation of the evaluation measures over time. We also propose a new methodology to associate the novelty patterns detected by the algorithm, in an unsupervised fashion, with the true classes. Finally, we evaluate the performance of the proposed methodology using known novelty detection algorithms on artificial and real data sets. |
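
The four requirements listed in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `StreamConfusionMatrix`, the `"unknown"` label, the majority-overlap rule for associating a novelty pattern with a true class, and the two summary measures are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

UNKNOWN = "unknown"  # hypothetical label for examples the model does not explain


class StreamConfusionMatrix:
    """Confusion matrix that grows as new classes and novelty patterns appear."""

    def __init__(self):
        # counts[true_class][predicted_label] -> number of examples seen so far
        self.counts = defaultdict(Counter)

    def update(self, true_class, predicted_label):
        # New rows and columns are created on first use, so the matrix grows over time.
        self.counts[true_class][predicted_label] += 1

    def associate_novelty_patterns(self, known_classes):
        """Map each novelty pattern to the true class it overlaps most (assumed rule)."""
        per_pattern = defaultdict(Counter)
        for true_class, row in self.counts.items():
            for label, n in row.items():
                if label != UNKNOWN and label not in known_classes:
                    per_pattern[label][true_class] += n
        return {p: c.most_common(1)[0][0] for p, c in per_pattern.items()}

    def summary(self, known_classes):
        """Return the unknown rate and the error rate over explained examples."""
        mapping = self.associate_novelty_patterns(known_classes)
        total = unknown = errors = 0
        for true_class, row in self.counts.items():
            for label, n in row.items():
                total += n
                if label == UNKNOWN:
                    unknown += n
                elif mapping.get(label, label) != true_class:
                    errors += n
        explained = total - unknown
        return {
            "unknown_rate": unknown / total if total else 0.0,
            "error_rate": errors / explained if explained else 0.0,
        }


# Toy stream of (true class, model output); "NP1" is a hypothetical novelty pattern.
known = {"A", "B"}
stream = [("A", "A"), ("A", "B"), ("B", "B"), ("C", UNKNOWN),
          ("C", "NP1"), ("C", "NP1"), ("A", "NP1")]

cm = StreamConfusionMatrix()
history = []  # one summary per time step, i.e. the measures represented over time
for true_class, predicted in stream:
    cm.update(true_class, predicted)
    history.append(cm.summary(known))
```

Keeping the unknown examples in their own column, rather than counting them as errors, lets the share of unexplained examples and the error on explained examples be tracked separately at each time step.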
id |
RCAP_e8154b1afdfb286afe3cb13d6d55d0b1 |
---|---|
oai_identifier_str |
oai:repositorio.inesctec.pt:123456789/5312 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
2018-01-03T10:35:14Z; 2015-01-01T00:00:00Z; 2015; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; http://repositorio.inesctec.pt/handle/123456789/5312; http://dx.doi.org/10.1109/tkde.2015.2441713; eng; de Faria,ER; Goncalves,IR; João Gama; de Leon Ferreira Carvalho,ACPDF; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-05-15T10:20:45Z; oai:repositorio.inesctec.pt:123456789/5312; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T17:53:34.732033; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
title |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
spellingShingle |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams de Faria,ER |
title_short |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
title_full |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
title_fullStr |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
title_full_unstemmed |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
title_sort |
Evaluation of Multiclass Novelty Detection Algorithms for Data Streams |
author |
de Faria,ER |
author_facet |
de Faria,ER Goncalves,IR João Gama de Leon Ferreira Carvalho,ACPDF |
author_role |
author |
author2 |
Goncalves,IR João Gama de Leon Ferreira Carvalho,ACPDF |
author2_role |
author author author |
dc.contributor.author.fl_str_mv |
de Faria,ER Goncalves,IR João Gama de Leon Ferreira Carvalho,ACPDF |
description |
Data stream mining is an emergent research area that investigates knowledge extraction from large amounts of continuously generated data produced by non-stationary distributions. Novelty detection, the ability to identify new or previously unknown situations, is useful for learning systems, especially when dealing with data streams, where concepts may appear, disappear, or evolve over time. Several studies currently investigate the application of novelty detection techniques to data streams. However, there is no consensus on how to evaluate the performance of these techniques. In this study, we propose a new evaluation methodology for multiclass novelty detection in data streams that is able to deal with: i) unsupervised learning, which generates novelty patterns without an association with the true classes, where one class may be composed of a novelty set, ii) a confusion matrix that grows over time, iii) a confusion matrix with a column representing unknown examples, i.e., those not explained by the model, and iv) the representation of the evaluation measures over time. We also propose a new methodology to associate the novelty patterns detected by the algorithm, in an unsupervised fashion, with the true classes. Finally, we evaluate the performance of the proposed methodology using known novelty detection algorithms on artificial and real data sets. |
publishDate |
2015 |
dc.date.none.fl_str_mv |
2015-01-01T00:00:00Z 2015 2018-01-03T10:35:14Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://repositorio.inesctec.pt/handle/123456789/5312 http://dx.doi.org/10.1109/tkde.2015.2441713 |
url |
http://repositorio.inesctec.pt/handle/123456789/5312 http://dx.doi.org/10.1109/tkde.2015.2441713 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799131609791725568 |