Text Mining - A Toolbox for Text Classification

Bibliographic details
Main author: Paulo Sérgio Vieira da Costa
Publication date: 2020
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://hdl.handle.net/10216/127189
Abstract: This thesis explores in depth the process of text mining and subsequent document classification. The main focus is the development of a platform capable of performing data extraction, natural language processing, data classification, and evaluation of the constructed models, starting from a corpus of labeled documents. The platform is integrated with a sentiment analysis dataset whose documents are polarity based, classified as positive or negative. The accuracy of the processing algorithms is evaluated, and the different algorithms used throughout this process are compared in depth. The aim was to produce a user-friendly application that provides the user with text mining and predictive analysis tools, integrated with a polarity dataset.
id RCAP_5cbda507853ac6ff56a312daee7d3df1
oai_identifier_str oai:repositorio-aberto.up.pt:10216/127189
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Text Mining - A Toolbox for Text Classification
author Paulo Sérgio Vieira da Costa
author_facet Paulo Sérgio Vieira da Costa
author_role author
dc.contributor.author.fl_str_mv Paulo Sérgio Vieira da Costa
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
topic Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2020
dc.date.none.fl_str_mv 2020-02-19
2020-02-19T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/127189
TID:202824403
url https://hdl.handle.net/10216/127189
identifier_str_mv TID:202824403
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799135873882652673