Transforming texts to maps : geovisualizing topics in texts

Bibliographic details
Main author: Thapa, Mahesh
Publication date: 2018
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/33944
Abstract: Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies
id RCAP_8be590c627b7f11d663a69fe818a6493
oai_identifier_str oai:run.unl.pt:10362/33944
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str
spelling Transforming texts to maps : geovisualizing topics in texts
Text Mining
Topic Modelling
Geoparsing
Natural Language Processing
Geovisualization
Spatial Context
Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies
Unstructured textual data is one of the most dominant forms of communication. Since the adoption of Web 2.0 in particular, the rate at which unstructured textual data is generated has surged. While a large amount of information is intuitively better for sound decision-making, it also becomes virtually impossible to manually process, discover and extract useful information from such data. Several supervised and unsupervised text mining techniques have been developed to classify, cluster and extract information from texts. While text data mining provides insight into the contents of texts, these techniques do not provide insight into their location component. In simple terms, text data mining addresses “What is the text about?” but fails to answer “Where is the text about?” Since textual data have a large amount of geographic content (estimated at about 80%), it can reasonably be argued that answering “Where is the text about?” adds significant insight about the texts. In this study, a collection of news articles from the year 2017 was analyzed using topic modelling, an unsupervised text mining technique. Topics were discovered from the text collection using Latent Dirichlet Allocation, a popular topic modelling method. A topic is a probability distribution over words that corresponds to one of the concepts covered in the text. Spatial locations were extracted from the text documents by geoparsing them. Topics were geovisualized as interactive maps according to the probability with which each spatial location word contributed to the corresponding topic. This is analogous to thematic mapping in a Geographic Information System. Coordinates obtained from the geoparsed words provide the basis for georeferencing the topics, while the probabilities of those location words under a particular topic provide the attribute values for thematic mapping. An interactive geovisualization of choropleth maps at country level was constructed using the Leaflet visualization library. A comparative analysis between the maps and the corresponding topics was made to see whether the maps provided spatial context to the topics.
Ramos Romero, José Francisco
Fernández, Oscar Belmonte
Henriques, Roberto André Pereira
RUN
Thapa, Mahesh
2018-04-06T09:29:11Z
2018-03-02
2018-03-02T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/33944
TID:201894726
eng
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2023-07-10T15:43:22Z
Portal Agregador
ONG
dc.title.none.fl_str_mv Transforming texts to maps : geovisualizing topics in texts
title Transforming texts to maps : geovisualizing topics in texts
spellingShingle Transforming texts to maps : geovisualizing topics in texts
Thapa, Mahesh
Text Mining
Topic Modelling
Geoparsing
Natural Language Processing
Geovisualization
Spatial Context
title_short Transforming texts to maps : geovisualizing topics in texts
title_full Transforming texts to maps : geovisualizing topics in texts
title_fullStr Transforming texts to maps : geovisualizing topics in texts
title_full_unstemmed Transforming texts to maps : geovisualizing topics in texts
title_sort Transforming texts to maps : geovisualizing topics in texts
author Thapa, Mahesh
author_facet Thapa, Mahesh
author_role author
dc.contributor.none.fl_str_mv Ramos Romero, José Francisco
Fernández, Oscar Belmonte
Henriques, Roberto André Pereira
RUN
dc.contributor.author.fl_str_mv Thapa, Mahesh
dc.subject.por.fl_str_mv Text Mining
Topic Modelling
Geoparsing
Natural Language Processing
Geovisualization
Spatial Context
topic Text Mining
Topic Modelling
Geoparsing
Natural Language Processing
Geovisualization
Spatial Context
description Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies
publishDate 2018
dc.date.none.fl_str_mv 2018-04-06T09:29:11Z
2018-03-02
2018-03-02T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/33944
TID:201894726
url http://hdl.handle.net/10362/33944
identifier_str_mv TID:201894726
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv
repository.mail.fl_str_mv
_version_ 1777302961739792385
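
The abstract describes mapping each topic by the probability of the location words that contribute to it. The following is a minimal sketch of that aggregation step only, not the dissertation's implementation. The `GAZETTEER` lookup is a hypothetical stand-in for a real geoparser, the topic distribution is made-up illustrative data, and summing word probabilities per country is an assumption about how several place names in one country might be combined into a single choropleth value.

```python
from collections import defaultdict

# Hypothetical stand-in for a geoparser: place word -> ISO 3166-1 alpha-3 code.
GAZETTEER = {
    "paris": "FRA",
    "france": "FRA",
    "berlin": "DEU",
    "kathmandu": "NPL",
}


def country_weights(topic_word_probs, gazetteer=GAZETTEER):
    """Sum the topic probability of every location word, grouped by country.

    Assumption: probabilities of place words in the same country are added
    together to give one attribute value per country for the map.
    """
    weights = defaultdict(float)
    for word, prob in topic_word_probs.items():
        code = gazetteer.get(word.lower())
        if code is not None:  # words that are not places are skipped
            weights[code] += prob
    return dict(weights)


# Illustrative (made-up) word distribution for a single topic.
topic = {
    "election": 0.30,
    "vote": 0.40,
    "paris": 0.10,
    "france": 0.15,
    "berlin": 0.05,
}

if __name__ == "__main__":
    for code, weight in sorted(country_weights(topic).items()):
        print(code, round(weight, 3))
```

The resulting country-to-weight dictionary is the kind of attribute table that could be joined to country polygons and styled as a choropleth layer, for example in Leaflet.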