Towards a general cellular frustration approach to anomaly detection

Bibliographic details
Main author: Gonçalves, Rodrigo Faria
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/34080
Abstract: Cellular Frustrated Systems (CFSs) model the interactions between presenter and detector agents, with the goal of detecting anomalies in data sets. When presenter agents display a normal sample, the two types of agents follow a frustrated (i.e., unstable) dynamic in which they continuously change the agent of the other type they are paired with. For a CFS to make a detection, presenters must display a sample with one or more abnormal features, which leads agents to stay paired for longer; these stable pairs signal a detection. This work improves on previous versions of the model by allowing detectors to see two regions of feature space as abnormal, with a normal region in between. K-means clustering is also used to cluster the data, so that detectors can partition the feature space and each be assigned a region that they see as normal, with the rest seen as abnormal. It is shown that there is no need to train separate populations of detectors to make detections in data sets that have abnormal samples in between normal ones. The current version of the model is compared with previous versions of the Cellular Frustration model and with two well-known anomaly detection methods, One-Class Support Vector Machines and Isolation Forest. The results show that its performance is comparable to that of the competing methods, and that it matches the results of the previous versions while being more robust and applicable to more situations with less effort. Finally, some ideas for future work are discussed to further improve the model, which still has issues when certain unfavorable conditions arise in data sets.
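The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation and does not model the frustrated pairing dynamics; it only assumes, for illustration, a single scalar feature, a plain Lloyd's k-means, and one "normal" interval per cluster, so that values below, above, or in between the normal regions are flagged. All function names here (`kmeans_1d`, `assign`, `normal_intervals`, `is_abnormal`) are hypothetical.

```python
import random


def assign(values, centroids):
    """Group each value with its nearest centroid."""
    groups = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
        groups[nearest].append(v)
    return groups


def kmeans_1d(values, k, iters=100, seed=0):
    """Lloyd's k-means on scalar values; returns the k centroids, sorted."""
    rng = random.Random(seed)
    # Start from k distinct observed values (assumes at least k distinct values).
    centroids = rng.sample(sorted(set(values)), k)
    for _ in range(iters):
        groups = assign(values, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def normal_intervals(values, k):
    """One (low, high) interval per non-empty cluster of normal training values."""
    groups = assign(values, kmeans_1d(values, k))
    return sorted((min(g), max(g)) for g in groups if g)


def is_abnormal(x, intervals):
    """True when x lies outside every normal interval."""
    return not any(low <= x <= high for low, high in intervals)


# Illustrative training data: two well-separated groups of normal values.
train = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
intervals = normal_intervals(train, k=2)
for sample in (1.05, 3.0, 9.0):
    print(sample, "abnormal" if is_abnormal(sample, intervals) else "normal")
```

In this toy setting a value such as 3.0, which sits between the two normal groups, is flagged without fitting a separate model per group; the hard interval test stands in for the detector behaviour only as a simplification.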
id RCAP_0ee9670cf98637a1606739303bea345d
oai_identifier_str oai:ria.ua.pt:10773/34080
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
author Gonçalves, Rodrigo Faria
author_facet Gonçalves, Rodrigo Faria
author_role author
dc.contributor.author.fl_str_mv Gonçalves, Rodrigo Faria
dc.subject.por.fl_str_mv Cellular frustration
Machine learning
Unsupervised learning
Anomaly detection
One-class support vector machines
Isolation forest
k-means
publishDate 2021
dc.date.none.fl_str_mv 2021-12-10T00:00:00Z
2021-12-10
2022-06-29T09:44:09Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/34080
url http://hdl.handle.net/10773/34080
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP