Trace clustering approach for detection and localization of concept drift in business processes
Main author: | Sousa, Rafael Gaspar de |
---|---|
Publication date: | 2021 |
Document type: | Master's thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/100/100131/tde-02122021-144602/ |
Abstract: | Business processes are constantly subject to change over time because they must adapt and remain flexible in the complex environments in which they operate, responding to factors such as new client demands, competition, or legislation. Process models are among the fundamental tools for understanding process behavior, which is key to business success. However, these models are usually not documented or updated to reflect changes in process behavior over time, which leads to misconceptions about the actual process. Although process mining aims to provide techniques that automatically discover, analyze, and enhance processes based on event logs, most techniques assume that the process is stationary, which is often not the case. Handling processes that change over time, a problem known as concept drift, requires detecting a drift as soon as possible and localizing the entities involved in it; doing so provides a much better comprehension of process behavior, which can be a competitive advantage for businesses. Most work on concept drift in the process mining literature focuses on frameworks that can detect drifts but that are generally unable to simultaneously localize the change within the process behavior and expose information about the entities involved. Applying clustering techniques to event log data, known as trace clustering, supports the identification of patterns in process behavior; it simplifies the behavior and separates similar behaviors, producing a model of the process behavior as clusters. However, although common in process mining in general, trace clustering has not been widely explored in the context of concept drift. This research presents a method that performs concept drift detection and localization simultaneously, based on the same clusters obtained by online trace clustering. The clusters reflect changes in complex process behavior in a simplified manner and serve as a platform for effective online drift detection and localization with no additional data structures. Experiments with synthetic and real-world event logs containing different types of control-flow changes have shown that, although our method did not outperform the baseline for drift detection in all cases, it correctly detected drifts in most cases, depending on the parameter configuration, while also providing information about the entities involved in the drift from the business process perspective. |
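
The abstract describes the method only at a high level. As a rough illustration of the general idea, and not the thesis's actual algorithm, the Python sketch below clusters traces as they arrive and reuses the same clusters both to detect a drift and to localize it. Everything specific here is an assumption made for illustration: the function names (`detect_drifts`, `assign`), the leader-style clustering, the Jaccard distance over directly-follows pairs, the fixed window, and the thresholds `max_dist` and `drift_threshold`.

```python
from collections import Counter


def transitions(trace):
    """Directly-follows pairs of a trace: ('a','b','c') -> {('a','b'), ('b','c')}."""
    return frozenset(zip(trace, trace[1:]))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def assign(trace, leaders, max_dist):
    """Leader clustering (illustrative choice): join the nearest cluster leader,
    or start a new cluster when every leader is farther than max_dist."""
    feats = transitions(trace)
    best = min(range(len(leaders)),
               key=lambda i: jaccard_distance(feats, leaders[i]),
               default=None)
    if best is None or jaccard_distance(feats, leaders[best]) > max_dist:
        leaders.append(feats)
        return len(leaders) - 1
    return best


def detect_drifts(stream, window=4, max_dist=0.4, drift_threshold=0.5):
    """Compare cluster shares between consecutive windows of traces.

    A drift is reported when the total variation distance between the two
    share distributions reaches drift_threshold. Localization lists the
    directly-follows pairs of clusters that grew versus clusters that shrank.
    """
    leaders = []      # one representative transition set per cluster
    previous = None   # cluster shares of the preceding window
    reports = []
    for start in range(0, len(stream) - window + 1, window):
        labels = [assign(t, leaders, max_dist) for t in stream[start:start + window]]
        shares = {c: n / window for c, n in Counter(labels).items()}
        if previous is not None:
            clusters = set(shares) | set(previous)
            change = {c: shares.get(c, 0.0) - previous.get(c, 0.0) for c in clusters}
            if 0.5 * sum(abs(d) for d in change.values()) >= drift_threshold:
                grown = [c for c in clusters if change[c] > 0]
                shrunk = [c for c in clusters if change[c] < 0]
                new_behavior = set().union(*(leaders[c] for c in grown))
                old_behavior = set().union(*(leaders[c] for c in shrunk))
                reports.append({
                    "position": start,  # index of the first trace in the drifted window
                    "added": sorted(new_behavior - old_behavior),
                    "removed": sorted(old_behavior - new_behavior),
                })
        previous = shares
    return reports


# Hypothetical log: activities b and c swap order after the eighth trace.
log = [("a", "b", "c")] * 8 + [("a", "c", "b")] * 8
print(detect_drifts(log))
```

In this toy log the swap is reported once, at trace index 8, together with the directly-follows pairs that appeared and disappeared. A real online setting would consume an unbounded stream and would likely age or merge clusters, which this sketch omits.
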
id |
USP_9d2e88b9d2087e589f2ad10b8a5979e1 |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-02122021-144602 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
author |
Sousa, Rafael Gaspar de |
dc.contributor.none.fl_str_mv |
Peres, Sarajane Marques; Reijers, Hajo Alexander |
dc.subject.por.fl_str_mv |
Clustering; Trace clustering; Concept drift; Data mining; Data stream; Process mining |
dc.date.none.fl_str_mv |
2021-10-28 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br; atendimento@aguia.usp.br |
_version_ |
1815256482910830592 |