An incremental local outlier detection method in the Data Stream

Bibliographic details
Main author: Yao, H.
Publication date: 2018
Other authors: Fu, X., Yang, Y., Postolache, O.
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10071/17185
Abstract: Outlier detection has attracted a wide range of attention for its broad applications, such as fault diagnosis and intrusion detection, among which the outlier analysis in data streams with high uncertainty and infinity is more challenging. Recent major work of outlier detection has focused on principle research of the local outlier factor, and there are few studies on incremental updating strategies, which are vital to outlier detection in data streams. In this paper, a novel incremental local outlier detection approach is introduced to dynamically evaluate the local outlier in the data stream. An extended local neighborhood consisting of k nearest neighbors, reverse nearest neighbors and shared nearest neighbors is estimated for each data. The theoretical evidence of algorithm complexity for the insertion of new data and deletion of old data in the composite neighborhood shows that the amount of affected data in the incremental calculation is finite. Finally, experiments performed on both synthetic and real datasets verify its scalability and outlier detection accuracy. All results show that the proposed approach has comparable performance with state-of-the-art k nearest neighbor-based methods.
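The extended neighborhood named in the abstract can be illustrated with a small brute-force sketch. This is not the authors' implementation: the record does not give the paper's formal definitions, so the code assumes that a reverse nearest neighbor of p is any point that lists p among its own k nearest neighbors, and that a shared nearest neighbor of p is any other point whose k nearest neighbors overlap with those of p. The function names `knn` and `composite_neighborhoods` are hypothetical, and ties in distance are broken by index purely for reproducibility.

```python
from math import dist


def knn(points, i, k):
    """Indices of the k points nearest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    # Sort by distance; break ties by index so the result is deterministic.
    others.sort(key=lambda j: (dist(points[i], points[j]), j))
    return set(others[:k])


def composite_neighborhoods(points, k):
    """Return (nn, rnn, snn, composite), each a list of index sets.

    Assumed definitions (not taken from the paper):
      nn[p]   k nearest neighbors of p
      rnn[p]  points q with p in nn[q]
      snn[p]  points q != p with nn[p] and nn[q] overlapping
    """
    n = len(points)
    nn = [knn(points, p, k) for p in range(n)]

    # Reverse neighbors: invert the k-nearest-neighbor relation.
    rnn = [set() for _ in range(n)]
    for q in range(n):
        for p in nn[q]:
            rnn[p].add(q)

    # Shared neighbors: any other point with at least one common neighbor.
    snn = [
        {q for q in range(n) if q != p and nn[p] & nn[q]}
        for p in range(n)
    ]

    composite = [nn[p] | rnn[p] | snn[p] for p in range(n)]
    return nn, rnn, snn, composite


if __name__ == "__main__":
    # Three clustered points and one distant point on a line.
    pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
    nn, rnn, snn, comp = composite_neighborhoods(pts, k=1)
    for p in range(len(pts)):
        print(p, sorted(nn[p]), sorted(rnn[p]), sorted(snn[p]), sorted(comp[p]))
```

In this toy input the distant point at index 3 is nobody's nearest neighbor, so its reverse neighbor set is empty. A purely forward k-nearest-neighbor view does not expose that asymmetry. The sketch recomputes everything from scratch in quadratic time; the incremental insertion and deletion updates mentioned in the abstract are not shown.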
id RCAP_624299e1e5ee3abef505a3619418b2be
oai_identifier_str oai:repositorio.iscte-iul.pt:10071/17185
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv An incremental local outlier detection method in the Data Stream
author Yao, H.
author_facet Yao, H.
Fu, X.
Yang, Y.
Postolache, O.
author_role author
author2 Fu, X.
Yang, Y.
Postolache, O.
author2_role author
author
author
dc.contributor.author.fl_str_mv Yao, H.
Fu, X.
Yang, Y.
Postolache, O.
dc.subject.por.fl_str_mv Incremental algorithm
K nearest neighbor
Local outlier factor
Outlier detection
publishDate 2018
dc.date.none.fl_str_mv 2018-01-01T00:00:00Z
2018
2019-02-07T14:47:51Z
2019-02-07T14:33:10Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10071/17185
url http://hdl.handle.net/10071/17185
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 2076-3417
10.3390/app8081248
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv MDPI
publisher.none.fl_str_mv MDPI
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799134690190295040