Text Mining Research Project: Internship at Ageas Portugal
Main author: | Teixeira, Daniel Rocha |
---|---|
Publication date: | 2021 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/128809 |
Abstract: | Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
id |
RCAP_a33fa14096336c20b4a2dd88cc6a1093 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/128809 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Text Mining Research Project: Internship at Ageas Portugal
Text mining; Text analytics; Natural language processing; Sentiment analysis; Topic classification
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
As an insurance company, Ageas Portugal holds a large amount of data about its customers. Most of the data that companies use (apart from the few that already apply advanced machine learning and artificial intelligence techniques) is structured data, that is, formatted datasets and tables with customer information. With advances in technology, however, more companies are starting to use their unstructured data, which can help them find insights and achieve their goals. Of the company's data sources in human-language form, such as emails, customer surveys, and medical transcriptions, we agreed that an email database would be the best option for developing the project. This type of data requires very thorough preparation, because emails contain irrelevant parts, such as signatures and disclaimers, that should be excluded. By analyzing customers' interactions with the company, we could find insights into how to increase sales and reduce the churn rate. We applied two Text Mining techniques (Sentiment Analysis and Topic Classification) and conducted a proof of concept. It showed that clients who send or are mentioned in emails tend to cancel their policies at a higher rate than those without emails, even when the email's topic is not related to cancellation. It also showed that the effect of sentiment on cancellation behavior appears to be mixed and requires further analysis. The full project was developed in Python, but it was also compared with other market solutions, such as Amazon Web Services, SAS, Google Cloud, and Microsoft Azure, to find the Text Mining tool that best fits the company. As expected, Python was chosen as the best option.
Pinheiro, Flávio Luís Portas; RUN; Teixeira, Daniel Rocha; 2021-12-07T16:58:16Z; 2021-11-26; 2021-11-26T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/128809; TID:202809668; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-03-11T05:08:12Z; oai:run.unl.pt:10362/128809; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-20T03:46:24.516877; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Text Mining Research Project: Internship at Ageas Portugal |
title |
Text Mining Research Project: Internship at Ageas Portugal |
spellingShingle |
Text Mining Research Project: Internship at Ageas Portugal Teixeira, Daniel Rocha Text mining Text analytics Natural language processing Sentiment analysis Topic classification |
author |
Teixeira, Daniel Rocha |
author_facet |
Teixeira, Daniel Rocha |
author_role |
author |
dc.contributor.none.fl_str_mv |
Pinheiro, Flávio Luís Portas RUN |
dc.contributor.author.fl_str_mv |
Teixeira, Daniel Rocha |
dc.subject.por.fl_str_mv |
Text mining Text analytics Natural language processing Sentiment analysis Topic classification |
topic |
Text mining Text analytics Natural language processing Sentiment analysis Topic classification |
description |
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-12-07T16:58:16Z 2021-11-26 2021-11-26T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/128809 TID:202809668 |
url |
http://hdl.handle.net/10362/128809 |
identifier_str_mv |
TID:202809668 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799138067745865728 |