Text Mining Research Project: Internship at Ageas Portugal

Bibliographic details
Main author: Teixeira, Daniel Rocha
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/128809
Abstract: Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
id RCAP_a33fa14096336c20b4a2dd88cc6a1093
oai_identifier_str oai:run.unl.pt:10362/128809
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Text Mining Research Project: Internship at Ageas Portugal
Text mining
Text analytics
Natural language processing
Sentiment analysis
Topic classification
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
As an insurance company, Ageas Portugal holds a large amount of data about its customers. Most of the data companies use (apart from the few that already apply advanced machine learning and artificial intelligence techniques) is structured data, that is, formatted datasets and tables with customer information. With advances in technology, however, more companies are starting to use their unstructured data, which can help them find insights and achieve their goals. Of the natural-language data sources available at the company, such as emails, customer surveys and medical transcriptions, we agreed that an email database would be the best option for developing the project. This type of data requires very thorough preparation, because emails contain irrelevant parts, such as signatures and disclaimers, that should be excluded. By analyzing customers' interactions with the company, we could find insights into how to increase sales and reduce the churn rate. We applied two Text Mining techniques (Sentiment Analysis and Topic Classification) and conducted a proof of concept. It showed that clients who send or are mentioned in emails tend to cancel their policies at a higher rate than those without emails, even when the email's topic is not related to cancellation. It also showed that the effect of sentiment on cancellation behavior appears to be mixed and requires further analysis. The full project was developed in Python, but it was also compared with other market solutions, namely Amazon Web Services, SAS, Google Cloud and Microsoft Azure, in order to find the Text Mining tool that best fits the company. As expected, Python was chosen as the best option.
Pinheiro, Flávio Luís Portas
RUN
Teixeira, Daniel Rocha
2021-12-07T16:58:16Z
2021-11-26
2021-11-26T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/128809
TID:202809668
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2024-03-11T05:08:12Z
oai:run.unl.pt:10362/128809
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-20T03:46:24.516877
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
dc.title.none.fl_str_mv Text Mining Research Project: Internship at Ageas Portugal
title Text Mining Research Project: Internship at Ageas Portugal
spellingShingle Text Mining Research Project: Internship at Ageas Portugal
Teixeira, Daniel Rocha
Text mining
Text analytics
Natural language processing
Sentiment analysis
Topic classification
title_short Text Mining Research Project: Internship at Ageas Portugal
title_full Text Mining Research Project: Internship at Ageas Portugal
title_fullStr Text Mining Research Project: Internship at Ageas Portugal
title_full_unstemmed Text Mining Research Project: Internship at Ageas Portugal
title_sort Text Mining Research Project: Internship at Ageas Portugal
author Teixeira, Daniel Rocha
author_facet Teixeira, Daniel Rocha
author_role author
dc.contributor.none.fl_str_mv Pinheiro, Flávio Luís Portas
RUN
dc.contributor.author.fl_str_mv Teixeira, Daniel Rocha
dc.subject.por.fl_str_mv Text mining
Text analytics
Natural language processing
Sentiment analysis
Topic classification
topic Text mining
Text analytics
Natural language processing
Sentiment analysis
Topic classification
description Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
publishDate 2021
dc.date.none.fl_str_mv 2021-12-07T16:58:16Z
2021-11-26
2021-11-26T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/128809
TID:202809668
url http://hdl.handle.net/10362/128809
identifier_str_mv TID:202809668
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799138067745865728