Ferramenta para Text Mining em Textos completos

Bibliographic details
Main author: Hugo José Freixo Rodrigues
Publication date: 2016
Document type: Dissertation
Language: por
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://hdl.handle.net/10216/85394
Abstract: We live in a world in constant change, where more and more people have unlimited access to the internet and can post their ideas there. As a result, more and more texts are available on the Web. These texts range from simple Facebook posts to papers in various fields. With such a large volume of text, new techniques are needed to read and classify it quickly and effectively. Text Mining (TM) addresses this problem: it makes it possible to interpret and classify a huge number of texts in order to obtain useful information. Current TM approaches do not take advantage of the structure of the texts. A text is seen as a bag of words, a set of unrelated words. This makes TM algorithms computationally heavy and leaves the quality of the information they obtain open to substantial improvement. There are different TM approaches and several preprocessing steps that can be applied to full-text classification, but which methods give better results? Which steps must be applied so that the final result is more complete? These are some of the questions that we address in this thesis.
id RCAP_ebdcaa8498d2b7082606e34cf4e98a6d
oai_identifier_str oai:repositorio-aberto.up.pt:10216/85394
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv Ferramenta para Text Mining em Textos completos
author Hugo José Freixo Rodrigues
author_facet Hugo José Freixo Rodrigues
author_role author
dc.contributor.author.fl_str_mv Hugo José Freixo Rodrigues
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
topic Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2016
dc.date.none.fl_str_mv 2016-07-27
2016-07-27T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/85394
TID:201305011
url https://hdl.handle.net/10216/85394
identifier_str_mv TID:201305011
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135542894395392