Software failure prediction based on patterns of multiple-event failures

Detalhes bibliográficos
Autor(a) principal: Santos, Caio Augusto Rodrigues dos
Data de Publicação: 2021
Tipo de documento: Tese
Idioma: eng
Título da fonte: Repositório Institucional da UFU
Texto Completo: https://repositorio.ufu.br/handle/123456789/31537
http://doi.org/10.14393/ufu.te.2021.202
Resumo: A fundamental need for software reliability engineering is to comprehend how software systems fail, which means understanding the dynamics that govern different types of failure manifestation. In this research, I present an exploratory study on multiple-event failures, which is a failure manifestation characterized by sequences of failure events, varying in terms of length, duration, and combination of failure types. This study aims to (i) improve the understanding of multiple-event failures in real software systems, investigating their occurrences, associations, and causes; (ii) propose analysis protocols that take into account multiple-event failure manifestations; (iii) take advantage of the sequential nature of this type of software failure to perform predictions. The failures analyzed in this research were observed empirically. In total, I analyzed 42,209 real software failures from 644 computers used in different workplaces. The major contributions of this study are a protocol developed to investigate the existence of patterns of failure associations; a protocol to discover patterns of failure sequences; and a prediction approach whose main concept is to calculate the probability of a certain failure event to occur within a time interval upon the occurrence of a particular pattern of preceding failures. I used three methods to tackle the prediction problem; Multinomial Logistic Regression (w/ and w/o Ridge regularization), Decision Tree, and Random Forest. These methods were chosen due to the nature of the failure data, in which the failure types must be handled as categorical variables. Initially, I performed a failure association discovery analysis which only included failures from a widely used commercial off-the-shelf Operating System (OS). As a result, I discovered 45 OS failure association patterns with 153,511 occurrences, which were composed of the same or different failure types and occurring within well-established time intervals, systematically. The observed associations suggest the existence of underlying mechanisms governing these failure occurrences, which motivated the improvement of the previous method by creating a protocol to discover patterns of failure sequences using flexible time thresholds and a failure prediction approach. To have a comprehensive view of how different software failures may affect each other, both methods were applied to three different samples — the first sample contained only OS failures, the second contained only User Application failures, and the third encompassed both OS and User Application failures altogether. As a result, I found 165, 480, and 640 different failure sequences with thousands of occurrences, respectively. Finally, the proposed approach was able to predict failures with good to high accuracy (86% to 93%).
id UFU_17e1a923d60a39e2023ebd05244ebf80
oai_identifier_str oai:repositorio.ufu.br:123456789/31537
network_acronym_str UFU
network_name_str Repositório Institucional da UFU
repository_id_str
spelling Software failure prediction based on patterns of multiple-event failuresSoftware failure prediction based on patterns of multiple-event failuresFalhas de softwareAssociações de falhaSequências de falhaFalhas de múltiplos eventosPadrõesPrediçãoSoftware failuresFailure associationsFailure sequencesMultiple-event failuresPatternsPredictionCNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAOA fundamental need for software reliability engineering is to comprehend how software systems fail, which means understanding the dynamics that govern different types of failure manifestation. In this research, I present an exploratory study on multiple-event failures, which is a failure manifestation characterized by sequences of failure events, varying in terms of length, duration, and combination of failure types. This study aims to (i) improve the understanding of multiple-event failures in real software systems, investigating their occurrences, associations, and causes; (ii) propose analysis protocols that take into account multiple-event failure manifestations; (iii) take advantage of the sequential nature of this type of software failure to perform predictions. The failures analyzed in this research were observed empirically. In total, I analyzed 42,209 real software failures from 644 computers used in different workplaces. The major contributions of this study are a protocol developed to investigate the existence of patterns of failure associations; a protocol to discover patterns of failure sequences; and a prediction approach whose main concept is to calculate the probability of a certain failure event to occur within a time interval upon the occurrence of a particular pattern of preceding failures. I used three methods to tackle the prediction problem; Multinomial Logistic Regression (w/ and w/o Ridge regularization), Decision Tree, and Random Forest. These methods were chosen due to the nature of the failure data, in which the failure types must be handled as categorical variables. Initially, I performed a failure association discovery analysis which only included failures from a widely used commercial off-the-shelf Operating System (OS). As a result, I discovered 45 OS failure association patterns with 153,511 occurrences, which were composed of the same or different failure types and occurring within well-established time intervals, systematically. The observed associations suggest the existence of underlying mechanisms governing these failure occurrences, which motivated the improvement of the previous method by creating a protocol to discover patterns of failure sequences using flexible time thresholds and a failure prediction approach. To have a comprehensive view of how different software failures may affect each other, both methods were applied to three different samples — the first sample contained only OS failures, the second contained only User Application failures, and the third encompassed both OS and User Application failures altogether. As a result, I found 165, 480, and 640 different failure sequences with thousands of occurrences, respectively. Finally, the proposed approach was able to predict failures with good to high accuracy (86% to 93%).CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível SuperiorTese (Doutorado)Uma necessidade fundamental para a engenharia de confiabilidade de software é compreender como os sistemas de software falham, que significa entender a dinâmica que governa os diferentes tipos de manifestação de falha. Esta pesquisa apresenta um estudo exploratório sobre falhas de múltiplos eventos, que é uma manifestação de falha caracterizada por sequências de eventos de falha que variam em comprimento, duração e combinação de tipos de falha. Este estudo visa (i) melhorar a compreensão das falhas de múltiplos eventos em sistemas de software reais, investigando suas ocorrências, associações e causas; (ii) propor protocolos de análise que levem em consideração as manifestações de falha de múltiplos eventos; (iii) aproveitar a natureza sequencial desse tipo de falha de software para realizar previsões. As falhas analisadas nesta pesquisa foram observadas empiricamente. No total, foram analisadas 42.209 falhas reais de software de 644 computadores de diferentes locais de trabalho. As principais contribuições deste estudo são um protocolo desenvolvido para investigar a existência de padrões de associações de falha; um protocolo para descobrir padrões de sequências de falha; e uma abordagem de previsão cuja principal ideia é calcular a probabilidade de um determinado evento de falha ocorrer dentro de um intervalo de tempo após a ocorrência de um padrão particular de falhas anteriores. Três métodos foram utilizados para resolver o problema de previsão; Regressão Logística Multinomial (com ou sem regularização Ridge), Decision Tree e Random Forest. Tais métodos foram escolhidos devido à natureza dos dados de falha, nos quais os tipos de falha devem ser tratados como variáveis categóricas. Inicialmente, foi realizada uma análise de descoberta de associação de falhas que considerou apenas falhas de um sistema operacional (SO) comercial amplamente utilizado. Como resultado, foram descobertos 45 padrões de associação de falhas de sistema operacional com 153.511 ocorrências, compostos dos mesmos ou diferentes tipos de falha e ocorrendo, sistematicamente, em intervalos de tempo bem estabelecidos. As associações observadas sugerem a existência de mecanismos subjacentes que regem essas ocorrências de falha, o que motivou o aprimoramento do método anterior, com a criação de um protocolo para descobrir padrões de sequências de falhas usando limites de tempo flexíveis e uma abordagem de previsão de falha. Para ter uma visão abrangente de como as diferentes falhas de software podem afetar umas às outras, os dois métodos foram aplicados a três amostras diferentes — a primeira amostra contém apenas falhas do Sistema Operacional, a segunda contém apenas falhas de Aplicativos do Usuário e a terceira engloba falhas do Sistema Operacional e de Aplicativos de Usuário. Como resultado, foram encontradas 165, 480 e 640 sequências de falha diferentes com milhares de ocorrências, respectivamente. Por fim, a abordagem proposta foi capaz de prever falhas com boa até alta precisão (86% a 93%).Universidade Federal de UberlândiaBrasilPrograma de Pós-graduação em Ciência da ComputaçãoMatias Júnior, Rivalinohttp://lattes.cnpq.br/3034950214458518Trivedi, Kishor S.Andrzejak, ArturSilva, Dima daAlbertini, Marcelo KeeseSantos, Caio Augusto Rodrigues dos2021-04-07T10:55:09Z2021-04-07T10:55:09Z2021-03-29info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisapplication/pdfSANTOS, Caio Augusto Rodrigues dos. Software Failure Prediction Based on Patterns of Multiple-Event Failures. 2021. 139 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal de Uberlândia, Uberlândia, 2021. DOI http://doi.org/10.14393/ufu.te.2021.202.https://repositorio.ufu.br/handle/123456789/31537http://doi.org/10.14393/ufu.te.2021.202enginfo:eu-repo/semantics/openAccessreponame:Repositório Institucional da UFUinstname:Universidade Federal de Uberlândia (UFU)instacron:UFU2021-04-08T06:22:07Zoai:repositorio.ufu.br:123456789/31537Repositório InstitucionalONGhttp://repositorio.ufu.br/oai/requestdiinf@dirbi.ufu.bropendoar:2021-04-08T06:22:07Repositório Institucional da UFU - Universidade Federal de Uberlândia (UFU)false
dc.title.none.fl_str_mv Software failure prediction based on patterns of multiple-event failures
Software failure prediction based on patterns of multiple-event failures
title Software failure prediction based on patterns of multiple-event failures
spellingShingle Software failure prediction based on patterns of multiple-event failures
Santos, Caio Augusto Rodrigues dos
Falhas de software
Associações de falha
Sequências de falha
Falhas de múltiplos eventos
Padrões
Predição
Software failures
Failure associations
Failure sequences
Multiple-event failures
Patterns
Prediction
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
title_short Software failure prediction based on patterns of multiple-event failures
title_full Software failure prediction based on patterns of multiple-event failures
title_fullStr Software failure prediction based on patterns of multiple-event failures
title_full_unstemmed Software failure prediction based on patterns of multiple-event failures
title_sort Software failure prediction based on patterns of multiple-event failures
author Santos, Caio Augusto Rodrigues dos
author_facet Santos, Caio Augusto Rodrigues dos
author_role author
dc.contributor.none.fl_str_mv Matias Júnior, Rivalino
http://lattes.cnpq.br/3034950214458518
Trivedi, Kishor S.
Andrzejak, Artur
Silva, Dima da
Albertini, Marcelo Keese
dc.contributor.author.fl_str_mv Santos, Caio Augusto Rodrigues dos
dc.subject.por.fl_str_mv Falhas de software
Associações de falha
Sequências de falha
Falhas de múltiplos eventos
Padrões
Predição
Software failures
Failure associations
Failure sequences
Multiple-event failures
Patterns
Prediction
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
topic Falhas de software
Associações de falha
Sequências de falha
Falhas de múltiplos eventos
Padrões
Predição
Software failures
Failure associations
Failure sequences
Multiple-event failures
Patterns
Prediction
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
description A fundamental need for software reliability engineering is to comprehend how software systems fail, which means understanding the dynamics that govern different types of failure manifestation. In this research, I present an exploratory study on multiple-event failures, which is a failure manifestation characterized by sequences of failure events, varying in terms of length, duration, and combination of failure types. This study aims to (i) improve the understanding of multiple-event failures in real software systems, investigating their occurrences, associations, and causes; (ii) propose analysis protocols that take into account multiple-event failure manifestations; (iii) take advantage of the sequential nature of this type of software failure to perform predictions. The failures analyzed in this research were observed empirically. In total, I analyzed 42,209 real software failures from 644 computers used in different workplaces. The major contributions of this study are a protocol developed to investigate the existence of patterns of failure associations; a protocol to discover patterns of failure sequences; and a prediction approach whose main concept is to calculate the probability of a certain failure event to occur within a time interval upon the occurrence of a particular pattern of preceding failures. I used three methods to tackle the prediction problem; Multinomial Logistic Regression (w/ and w/o Ridge regularization), Decision Tree, and Random Forest. These methods were chosen due to the nature of the failure data, in which the failure types must be handled as categorical variables. Initially, I performed a failure association discovery analysis which only included failures from a widely used commercial off-the-shelf Operating System (OS). As a result, I discovered 45 OS failure association patterns with 153,511 occurrences, which were composed of the same or different failure types and occurring within well-established time intervals, systematically. The observed associations suggest the existence of underlying mechanisms governing these failure occurrences, which motivated the improvement of the previous method by creating a protocol to discover patterns of failure sequences using flexible time thresholds and a failure prediction approach. To have a comprehensive view of how different software failures may affect each other, both methods were applied to three different samples — the first sample contained only OS failures, the second contained only User Application failures, and the third encompassed both OS and User Application failures altogether. As a result, I found 165, 480, and 640 different failure sequences with thousands of occurrences, respectively. Finally, the proposed approach was able to predict failures with good to high accuracy (86% to 93%).
publishDate 2021
dc.date.none.fl_str_mv 2021-04-07T10:55:09Z
2021-04-07T10:55:09Z
2021-03-29
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv SANTOS, Caio Augusto Rodrigues dos. Software Failure Prediction Based on Patterns of Multiple-Event Failures. 2021. 139 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal de Uberlândia, Uberlândia, 2021. DOI http://doi.org/10.14393/ufu.te.2021.202.
https://repositorio.ufu.br/handle/123456789/31537
http://doi.org/10.14393/ufu.te.2021.202
identifier_str_mv SANTOS, Caio Augusto Rodrigues dos. Software Failure Prediction Based on Patterns of Multiple-Event Failures. 2021. 139 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal de Uberlândia, Uberlândia, 2021. DOI http://doi.org/10.14393/ufu.te.2021.202.
url https://repositorio.ufu.br/handle/123456789/31537
http://doi.org/10.14393/ufu.te.2021.202
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Uberlândia
Brasil
Programa de Pós-graduação em Ciência da Computação
publisher.none.fl_str_mv Universidade Federal de Uberlândia
Brasil
Programa de Pós-graduação em Ciência da Computação
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFU
instname:Universidade Federal de Uberlândia (UFU)
instacron:UFU
instname_str Universidade Federal de Uberlândia (UFU)
instacron_str UFU
institution UFU
reponame_str Repositório Institucional da UFU
collection Repositório Institucional da UFU
repository.name.fl_str_mv Repositório Institucional da UFU - Universidade Federal de Uberlândia (UFU)
repository.mail.fl_str_mv diinf@dirbi.ufu.br
_version_ 1813711435299028992