Data warehouse design to support social media analysis in a big data environment
Main author: | Valêncio, Carlos Roberto [UNESP] |
---|---|
Publication date: | 2020 |
Other authors: | Silva, Luis Marcello Moraes [UNESP]; Tenório, William [UNESP]; Zafalon, Geraldo Francisco Donegá [UNESP]; Colombini, Angelo Cesar; Fortes, Márcio Zamboti |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.3844/JCSSP.2020.126.136 http://hdl.handle.net/11449/201899 |
Abstract: | The volume of data generated and stored by social media has increased over the last decade. Analyzing and understanding this kind of data can therefore offer relevant information in different contexts and can assist researchers and companies in the decision-making process. However, the data are voluminous and scattered, come from different sources in different formats, and are created rapidly. These characteristics make knowledge extraction difficult, turning it into a complex and costly process. The scientific contribution of this paper is a social media data integration model based on a data warehouse, which reduces the computational costs of data analysis and supports the application of techniques for discovering useful knowledge. Unlike previous work in the literature, we cover both Facebook and Twitter. We also propose a model for data acquisition, transformation, and loading, which can enable the extraction of useful knowledge in a context where the human capacity for understanding is exceeded. The results showed that the proposed data warehouse improves the quality of data mining algorithms compared with related work while also reducing execution time. |
id | UNSP_98d49d872514e3b2bca5924e5ac6e2a2 |
---|---|
oai_identifier_str | oai:repositorio.unesp.br:11449/201899 |
network_acronym_str | UNSP |
network_name_str | Repositório Institucional da UNESP |
repository_id_str | 2946 |
spelling | Data warehouse design to support social media analysis in a big data environment Big data Data mining Data warehouse Social media The volume of data generated and stored by social media has increased over the last decade. Analyzing and understanding this kind of data can therefore offer relevant information in different contexts and can assist researchers and companies in the decision-making process. However, the data are voluminous and scattered, come from different sources in different formats, and are created rapidly. These characteristics make knowledge extraction difficult, turning it into a complex and costly process. The scientific contribution of this paper is a social media data integration model based on a data warehouse, which reduces the computational costs of data analysis and supports the application of techniques for discovering useful knowledge. Unlike previous work in the literature, we cover both Facebook and Twitter. We also propose a model for data acquisition, transformation, and loading, which can enable the extraction of useful knowledge in a context where the human capacity for understanding is exceeded. The results showed that the proposed data warehouse improves the quality of data mining algorithms compared with related work while also reducing execution time. Institute of Biosciences São Paulo State University (Unesp) Humanities and Exact Sciences (Ibilce), Campus São José do Rio Preto Fluminense Federal University (UFF) Institute of Biosciences São Paulo State University (Unesp) Humanities and Exact Sciences (Ibilce), Campus São José do Rio Preto Universidade Estadual Paulista (Unesp) Fluminense Federal University (UFF) Valêncio, Carlos Roberto [UNESP] Silva, Luis Marcello Moraes [UNESP] Tenório, William [UNESP] Zafalon, Geraldo Francisco Donegá [UNESP] Colombini, Angelo Cesar Fortes, Márcio Zamboti 2020-12-12T02:44:46Z 2020-12-12T02:44:46Z 2020-01-01 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article 126-136 http://dx.doi.org/10.3844/JCSSP.2020.126.136 Journal of Computer Science, v. 16, n. 2, p. 126-136, 2020. 1552-6607 1549-3636 http://hdl.handle.net/11449/201899 10.3844/JCSSP.2020.126.136 2-s2.0-85086862861 Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP eng Journal of Computer Science info:eu-repo/semantics/openAccess 2021-10-23T03:03:15Z oai:repositorio.unesp.br:11449/201899 Repositório Institucional PUB http://repositorio.unesp.br/oai/request opendoar:2946 2024-08-05T13:52:13.817220 Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) false |
dc.title.none.fl_str_mv | Data warehouse design to support social media analysis in a big data environment |
title | Data warehouse design to support social media analysis in a big data environment |
spellingShingle | Data warehouse design to support social media analysis in a big data environment Valêncio, Carlos Roberto [UNESP] Big data Data mining Data warehouse Social media |
title_short | Data warehouse design to support social media analysis in a big data environment |
title_full | Data warehouse design to support social media analysis in a big data environment |
title_fullStr | Data warehouse design to support social media analysis in a big data environment |
title_full_unstemmed | Data warehouse design to support social media analysis in a big data environment |
title_sort | Data warehouse design to support social media analysis in a big data environment |
author | Valêncio, Carlos Roberto [UNESP] |
author_facet | Valêncio, Carlos Roberto [UNESP] Silva, Luis Marcello Moraes [UNESP] Tenório, William [UNESP] Zafalon, Geraldo Francisco Donegá [UNESP] Colombini, Angelo Cesar Fortes, Márcio Zamboti |
author_role | author |
author2 | Silva, Luis Marcello Moraes [UNESP] Tenório, William [UNESP] Zafalon, Geraldo Francisco Donegá [UNESP] Colombini, Angelo Cesar Fortes, Márcio Zamboti |
author2_role | author author author author author |
dc.contributor.none.fl_str_mv | Universidade Estadual Paulista (Unesp) Fluminense Federal University (UFF) |
dc.contributor.author.fl_str_mv | Valêncio, Carlos Roberto [UNESP] Silva, Luis Marcello Moraes [UNESP] Tenório, William [UNESP] Zafalon, Geraldo Francisco Donegá [UNESP] Colombini, Angelo Cesar Fortes, Márcio Zamboti |
dc.subject.por.fl_str_mv | Big data Data mining Data warehouse Social media |
topic | Big data Data mining Data warehouse Social media |
description | The volume of data generated and stored by social media has increased over the last decade. Analyzing and understanding this kind of data can therefore offer relevant information in different contexts and can assist researchers and companies in the decision-making process. However, the data are voluminous and scattered, come from different sources in different formats, and are created rapidly. These characteristics make knowledge extraction difficult, turning it into a complex and costly process. The scientific contribution of this paper is a social media data integration model based on a data warehouse, which reduces the computational costs of data analysis and supports the application of techniques for discovering useful knowledge. Unlike previous work in the literature, we cover both Facebook and Twitter. We also propose a model for data acquisition, transformation, and loading, which can enable the extraction of useful knowledge in a context where the human capacity for understanding is exceeded. The results showed that the proposed data warehouse improves the quality of data mining algorithms compared with related work while also reducing execution time. |
publishDate | 2020 |
dc.date.none.fl_str_mv | 2020-12-12T02:44:46Z 2020-12-12T02:44:46Z 2020-01-01 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://dx.doi.org/10.3844/JCSSP.2020.126.136 Journal of Computer Science, v. 16, n. 2, p. 126-136, 2020. 1552-6607 1549-3636 http://hdl.handle.net/11449/201899 10.3844/JCSSP.2020.126.136 2-s2.0-85086862861 |
url | http://dx.doi.org/10.3844/JCSSP.2020.126.136 http://hdl.handle.net/11449/201899 |
identifier_str_mv | Journal of Computer Science, v. 16, n. 2, p. 126-136, 2020. 1552-6607 1549-3636 10.3844/JCSSP.2020.126.136 2-s2.0-85086862861 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | Journal of Computer Science |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | 126-136 |
dc.source.none.fl_str_mv | Scopus reponame:Repositório Institucional da UNESP instname:Universidade Estadual Paulista (UNESP) instacron:UNESP |
instname_str | Universidade Estadual Paulista (UNESP) |
instacron_str | UNESP |
institution | UNESP |
reponame_str | Repositório Institucional da UNESP |
collection | Repositório Institucional da UNESP |
repository.name.fl_str_mv | Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv | |
_version_ | 1808128285116203008 |