Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones

Bibliographic details
Main author: Jardim, João Bruno Morais de Sousa
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/119831
Abstract: Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence
id RCAP_19b0d3b162f67d6c597d2d4ba63f8581
oai_identifier_str oai:run.unl.pt:10362/119831
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
Natural Language Processing;
Machine Learning;
Email Zoning;
Text Segmentation;
Customer Service
Multilingual;
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence
The segmentation of emails into functional zones (also dubbed email zoning) is a relevant preprocessing step for most NLP tasks that deal with emails. In this research, we analyze the email zoning literature in depth and develop a business case around CLEVERLY AI, a company from the Customer Service sector. We design a new email zoning classification schema and collect a multilingual corpus of emails from CLEVERLY AI clients. We develop five neural network-based email zoning systems. Among them, we introduce OKAPI, the first multilingual email zoning model based on a language-agnostic sentence encoder. Besides outperforming our other systems when tested on CLEVERLY's emails, OKAPI shows competitive performance on current public English benchmarks and reaches new state-of-the-art results for English domain adaptation tasks. Moreover, we release a new multilingual benchmark, composed of 625 emails in Portuguese, Spanish and French, and demonstrate that OKAPI can effectively generalize what it has learned to unseen languages.
Almeida, Mariana Sá Correia Leite de
Rei, Ricardo Costa Dias
RUN
Jardim, João Bruno Morais de Sousa
2021-06-23T14:02:11Z
2021-06-07
2021-06-07T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/119831
TID:202775305
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2024-03-11T05:02:25Z
oai:run.unl.pt:10362/119831
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-20T03:44:12.395153
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
dc.title.none.fl_str_mv Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
title Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
spellingShingle Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
Jardim, João Bruno Morais de Sousa
Natural Language Processing;
Machine Learning;
Email Zoning;
Text Segmentation;
Customer Service
Multilingual;
title_short Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
title_full Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
title_fullStr Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
title_full_unstemmed Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
title_sort Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
author Jardim, João Bruno Morais de Sousa
author_facet Jardim, João Bruno Morais de Sousa
author_role author
dc.contributor.none.fl_str_mv Almeida, Mariana Sá Correia Leite de
Rei, Ricardo Costa Dias
RUN
dc.contributor.author.fl_str_mv Jardim, João Bruno Morais de Sousa
dc.subject.por.fl_str_mv Natural Language Processing;
Machine Learning;
Email Zoning;
Text Segmentation;
Customer Service
Multilingual;
topic Natural Language Processing;
Machine Learning;
Email Zoning;
Text Segmentation;
Customer Service
Multilingual;
description Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence
publishDate 2021
dc.date.none.fl_str_mv 2021-06-23T14:02:11Z
2021-06-07
2021-06-07T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/119831
TID:202775305
url http://hdl.handle.net/10362/119831
identifier_str_mv TID:202775305
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138049976696832