A predictive model for employee attrition

Bibliographic details
Main author: Gomes, Adriano de Oliveira
Publication date: 2022
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10773/35942
Abstract: Employee attrition is currently a major concern for companies, as losing highly qualified personnel has a tremendous impact on several aspects of their daily operations. Being able to predict or anticipate attritors is highly valuable in the era of Industry 4.0, as it can avoid needless costs. This dissertation proposes an approach to building a database and a predictive model for attrition that includes data from multiple companies in different sectors. Several standards and data definitions are proposed to ease data collection and fusion from different sources, resulting in a database on which a predictive model can be trained. The proposal treats attrition as a 3-class problem (voluntary attritors, involuntary attritors and non-attritors), and several machine learning models were tested to solve it. Boosting models were found to perform best. An XGBoost model evaluated in a 20-run experiment achieved an overall mean accuracy of 78.5%, corresponding to 52.6% of the voluntary attritors, 78.9% of the involuntary attritors and 81.6% of the non-attritors, showing that voluntary attritors are harder to discriminate. For these results, contract type, area of work in the company and salary rate were shown to be the most important factors contributing to attrition.
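The relationship between an overall mean accuracy and per-class rates over repeated runs, as reported in the abstract, can be illustrated with a short sketch. It assumes the per-class percentages are per-class recall (the fraction of each true class that is predicted correctly), which the record does not state explicitly. The confusion matrices below are synthetic values made up for illustration; they are not the dissertation's data, and the function names are hypothetical.

```python
from statistics import mean

# Class order used for rows (true class) and columns (predicted class).
CLASSES = ("voluntary", "involuntary", "non-attritor")


def per_class_recall(confusion):
    """Fraction of each true class predicted correctly.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


def accuracy(confusion):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def summarize(runs):
    """Average the accuracy and the per-class recall over several runs."""
    mean_acc = mean(accuracy(c) for c in runs)
    recalls = [per_class_recall(c) for c in runs]
    mean_recall = [mean(r[k] for r in recalls) for k in range(len(CLASSES))]
    return mean_acc, dict(zip(CLASSES, mean_recall))


# Two synthetic runs (assumption: invented numbers, NOT results from the thesis).
runs = [
    [[50, 20, 30], [10, 80, 10], [40, 60, 700]],
    [[54, 16, 30], [12, 78, 10], [50, 50, 700]],
]

mean_acc, mean_recall = summarize(runs)
print(f"mean accuracy: {mean_acc:.3f}")
for name, value in mean_recall.items():
    print(f"mean recall ({name}): {value:.3f}")
```

Within a single run, overall accuracy equals the average of the per-class recalls weighted by each class's share of the samples. This is why a majority class with high recall can keep overall accuracy high even when a minority class, such as the voluntary attritors here, is recognized far less often.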
id RCAP_2d7f51c106d1e7881d56f2c6fc55f885
oai_identifier_str oai:ria.ua.pt:10773/35942
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv A predictive model for employee attrition
title A predictive model for employee attrition
author Gomes, Adriano de Oliveira
author_facet Gomes, Adriano de Oliveira
author_role author
dc.contributor.author.fl_str_mv Gomes, Adriano de Oliveira
dc.subject.por.fl_str_mv Employee attrition
Machine learning
Real data
Human resources management
Industry 4.0
topic Employee attrition
Machine learning
Real data
Human resources management
Industry 4.0
publishDate 2022
dc.date.none.fl_str_mv 2022-12-12T00:00:00Z
2022-12-12
2024-12-20T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10773/35942
url http://hdl.handle.net/10773/35942
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/embargoedAccess
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799137724913942528