Document Clustering as an approach to template extraction
Main author: | Rodrigues, André Miguel Fernandes |
---|---|
Publication date: | 2022 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/135877 |
Summary: | Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence |
id |
RCAP_339452f9787bbad0d4ed4878b4e41a3e |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/135877 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
|
spelling |
Document Clustering as an approach to template extraction. Document Clustering; Similarity Measures; Text Representation; Template; Natural Language Processing. Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. A great part of customer support is done via the exchange of emails. As the number of emails exchanged daily is constantly increasing, companies need to find approaches to keep this process efficient. One common strategy is to answer with template emails. These answer templates are usually found by a human agent through repeated use of the same answer. In this work, we use a clustering approach to find these answer templates. Several clustering algorithms are researched in this work, with a focus on the k-means methodology, as well as other clustering components such as similarity measures and pre-processing steps. As we are dealing with text data, several text representation methods are also compared. Due to the peculiarity of the provided data, we are able to design methodologies to ensure the feasibility of this task and develop strategies to extract the answer templates from the clustering results. Almeida, Mariana Sá Correia Leite de; Rei, Ricardo Costa Dias; RUN; Rodrigues, André Miguel Fernandes. 2022-04-05T15:46:03Z; 2022-04-01; 2022-04-01T00:00:00Z. info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/135877; TID:202988228; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-07-10T16:06:20Z; Aggregator Portal; NGO |
dc.title.none.fl_str_mv |
Document Clustering as an approach to template extraction |
spellingShingle |
Document Clustering as an approach to template extraction; Rodrigues, André Miguel Fernandes; Document Clustering; Similarity Measures; Text Representation; Template; Natural Language Processing |
author |
Rodrigues, André Miguel Fernandes |
author_facet |
Rodrigues, André Miguel Fernandes |
author_role |
author |
dc.contributor.none.fl_str_mv |
Almeida, Mariana Sá Correia Leite de; Rei, Ricardo Costa Dias; RUN |
dc.contributor.author.fl_str_mv |
Rodrigues, André Miguel Fernandes |
dc.subject.por.fl_str_mv |
Document Clustering; Similarity Measures; Text Representation; Template; Natural Language Processing |
topic |
Document Clustering; Similarity Measures; Text Representation; Template; Natural Language Processing |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-04-05T15:46:03Z; 2022-04-01; 2022-04-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/135877; TID:202988228 |
url |
http://hdl.handle.net/10362/135877 |
identifier_str_mv |
TID:202988228 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
|
repository.mail.fl_str_mv |
|
_version_ |
1777303062818324480 |