Ferramenta para Text Mining em Textos completos
Main author: | Hugo José Freixo Rodrigues |
---|---|
Publication date: | 2016 |
Document type: | Dissertation |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | https://hdl.handle.net/10216/85394 |
Abstract: | We live in a constantly changing world in which more and more people have unlimited access to the internet and can post their ideas there. As a result, more and more texts are available on the Web, ranging from simple Facebook posts to papers in various fields. With so many texts, new techniques are needed to read and classify them quickly and effectively. Text Mining (TM) addresses these problems: it makes it possible to interpret and classify a huge number of texts in order to obtain useful information. Current TM approaches do not take advantage of the structure of the texts. A text is treated as a bag of words, that is, a set of unrelated words. This makes TM algorithms computationally heavy, and the quality of the information they obtain could be substantially improved. There are different TM approaches and several preprocessing steps that can be applied to full-text classification, but which methods give better results? Which steps must be applied so that the final result is more complete? These are some of the questions that we address in this thesis. |
id | RCAP_ebdcaa8498d2b7082606e34cf4e98a6d |
---|---|
oai_identifier_str | oai:repositorio-aberto.up.pt:10216/85394 |
network_acronym_str | RCAP |
network_name_str | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str | 7160 |
author | Hugo José Freixo Rodrigues |
author_role | author |
topic | Engenharia electrotécnica, electrónica e informática; Electrical engineering, Electronic engineering, Information engineering |
publishDate | 2016 |
dc.date.none.fl_str_mv | 2016-07-27; 2016-07-27T00:00:00Z |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
dc.identifier.uri.fl_str_mv | https://hdl.handle.net/10216/85394; TID:201305011 |
dc.language.iso.fl_str_mv | por |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv | application/pdf |
instname_str | Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str | RCAAP |
repository.name.fl_str_mv | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv | |
_version_ | 1799135542894395392 |