Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
Main author: | Jardim, João Bruno Morais de Sousa |
---|---|
Publication date: | 2021 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/119831 |
Summary: | Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence |
id |
RCAP_19b0d3b162f67d6c597d2d4ba63f8581 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/119831 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones. Natural Language Processing; Machine Learning; Email Zoning; Text Segmentation; Customer Service; Multilingual. Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The segmentation of emails into functional zones (also dubbed email zoning) is a relevant preprocessing step for most NLP tasks that deal with emails. In this research, we analyze the email zoning literature in depth and develop a business case around CLEVERLY AI, a company from the Customer Service sector. We design a new email zoning classification schema and collect a multilingual corpus of emails from CLEVERLY AI clients. We develop five neural network-based email zoning systems; among them, we introduce OKAPI, the first multilingual email zoning model based on a language-agnostic sentence encoder. Besides outperforming our other systems when tested on CLEVERLY’s emails, OKAPI shows competitive performance on current English public benchmarks and reaches new state-of-the-art results for English domain adaptation tasks. Moreover, we release a new multilingual benchmark, composed of 625 emails in Portuguese, Spanish and French, and demonstrate that OKAPI can effectively generalize what it has learned to unseen languages.
Almeida, Mariana Sá Correia Leite de; Rei, Ricardo Costa Dias; RUN; Jardim, João Bruno Morais de Sousa; 2021-06-23T14:02:11Z; 2021-06-07; 2021-06-07T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/119831; TID:202775305; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-05-22T17:54:07Z; oai:run.unl.pt:10362/119831; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; mluisa.alvim@gmail.com; opendoar:7160; 2024-05-22T17:54:07; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
title |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
spellingShingle |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones; Jardim, João Bruno Morais de Sousa; Natural Language Processing; Machine Learning; Email Zoning; Text Segmentation; Customer Service; Multilingual |
title_short |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
title_full |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
title_fullStr |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
title_full_unstemmed |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
title_sort |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones |
author |
Jardim, João Bruno Morais de Sousa |
author_facet |
Jardim, João Bruno Morais de Sousa |
author_role |
author |
dc.contributor.none.fl_str_mv |
Almeida, Mariana Sá Correia Leite de; Rei, Ricardo Costa Dias; RUN |
dc.contributor.author.fl_str_mv |
Jardim, João Bruno Morais de Sousa |
dc.subject.por.fl_str_mv |
Natural Language Processing; Machine Learning; Email Zoning; Text Segmentation; Customer Service; Multilingual |
topic |
Natural Language Processing; Machine Learning; Email Zoning; Text Segmentation; Customer Service; Multilingual |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence |
publishDate |
2021 |
dc.date.none.fl_str_mv |
2021-06-23T14:02:11Z; 2021-06-07; 2021-06-07T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/119831; TID:202775305 |
url |
http://hdl.handle.net/10362/119831 |
identifier_str_mv |
TID:202775305 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
mluisa.alvim@gmail.com |
_version_ |
1817545806251556864 |