Data Integration Solution in an Heterogeneous Environment

Bibliographic details
Main author: Pedro Manuel dos Santos Rocha
Publication date: 2017
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://hdl.handle.net/10216/106487
Abstract: In recent years, both data collection and knowledge extraction have received growing attention. Recent developments in data storage, distributed systems, and parallelization have made the analysis of vast amounts of data more straightforward. However, although processing large quantities of information has become simpler, some problems remain. One of them is data clean-up: transforming the collected information into a more useful format from which knowledge can be extracted. This problem is usually addressed with case-by-case solutions that do not generalize. Such solutions work well when the data is well known and has a fixed structure, but any change to the initial or final structure of the information requires adjusting the solution. This adds complexity that can make an application increasingly difficult to maintain and extend with new features. The solution analyzed in this dissertation is an application in which a user can combine and transform information originating from different sources. This is done through user-defined configuration documents, so that when the system changes, the impact on the end user is minimized. To better test its suitability, the solution is developed against a real-world scenario based on an existing application that collects information from a variety of sources and needs to transform it into a more useful structure.
id RCAP_b89fd5ac5347f5cc3a381e71a203f05d
oai_identifier_str oai:repositorio-aberto.up.pt:10216/106487
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
title Data Integration Solution in an Heterogeneous Environment
author Pedro Manuel dos Santos Rocha
author_facet Pedro Manuel dos Santos Rocha
author_role author
dc.contributor.author.fl_str_mv Pedro Manuel dos Santos Rocha
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2017
dc.date.none.fl_str_mv 2017-07-17
2017-07-17T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/106487
TID:201802252
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799135687290650624