Preprocessing profiling model for visual analytics
Main author: | Milani, Alessandra Maciel Paz |
---|---|
Publication date: | 2019 |
Document type: | Master's dissertation |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da PUC_RS |
Full text: | http://tede2.pucrs.br/tede2/handle/tede/9007 |
Abstract: | In the information age, we have developed the ability to collect and store data, create sophisticated data mining methods, and generate rich visualizations to share the information resulting from the data analysis process. However, analyzing and managing raw data is still a challenging part of this process, mainly with regard to data preprocessing, which aims to transform raw data into an appropriate format for subsequent analysis. Although some studies propose design implications or recommendations for future visualization solutions in the data analysis scope, they do not focus on the challenges of the preprocessing phase or on how visualization can support it. Likewise, current visual analytics models do not treat preprocessing as a phase equal in importance to the others in their process, such as Data, Models, Visualization, and Knowledge. Thus, with this study, we aim to contribute to the discussion of how methods of visualization and data mining can be used and combined to assist data analysts during preprocessing activities. To achieve that, we introduce the Preprocessing Profiling Model for Visual Analytics, which comprises a set of features intended to inspire the implementation of new solutions. These features were designed considering a list of insights we obtained during an interview study with thirteen data analysts. Our study makes three main contributions: (a) the Preprocessing Profiling Model for Visual Analytics, as a solution to assist during the preprocessing phase; (b) the list of ten insights, as a consolidated set of requirements for future visualization research applied to preprocessing and data mining; and (c) details on the profile of the data analysts, the main challenges they face, and the opportunities that arise while they are engaged in data mining projects in diverse organizational areas. |
id |
P_RS_3b3cb8fe612c38dd907ee1b0a3abcaf9 |
---|---|
oai_identifier_str |
oai:tede2.pucrs.br:tede/9007 |
network_acronym_str |
P_RS |
network_name_str |
Biblioteca Digital de Teses e Dissertações da PUC_RS |
repository_id_str |
|
spelling |
Submitted by PPG Ciência da Computação (ppgcc@pucrs.br) on 2019-11-01T13:03:06Z. No. of bitstreams: 1. ALESSANDRA MACIELPAZMILANI_DIS.pdf: 5008530 bytes, checksum: 4236876d844c46bb90b0d6d4794d082e (MD5). Approved for entry into archive by Sarajane Pan (sarajane.pan@pucrs.br) on 2019-11-06T13:33:54Z (GMT). Made available in DSpace on 2019-11-06T13:40:27Z (GMT). Previous issue date: 2019-08-29. The work has no restriction on publication. OAI request URL: https://tede2.pucrs.br/oai/request |
dc.title.por.fl_str_mv |
Preprocessing profiling model for visual analytics |
title |
Preprocessing profiling model for visual analytics |
author |
Milani, Alessandra Maciel Paz |
author_facet |
Milani, Alessandra Maciel Paz |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Manssour, Isabel Harb |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/4904489502853690 |
dc.contributor.advisor-co1.fl_str_mv |
Paulovich, Fernando Vieira |
dc.contributor.advisor-co1Lattes.fl_str_mv |
http://lattes.cnpq.br/4328003866597876 |
dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/5764437814022359 |
dc.contributor.author.fl_str_mv |
Milani, Alessandra Maciel Paz |
contributor_str_mv |
Manssour, Isabel Harb; Paulovich, Fernando Vieira |
dc.subject.eng.fl_str_mv |
Visual Analytics; Visualization Techniques; Data Mining; Preprocessing |
topic |
Visual Analytics; Visualization Techniques; Data Mining; Preprocessing; Análise Visual; Técnicas de Visualização; Mineração de Dados; Pré-Processamento; CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO |
dc.subject.por.fl_str_mv |
Análise Visual; Técnicas de Visualização; Mineração de Dados; Pré-Processamento |
dc.subject.cnpq.fl_str_mv |
CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO |
publishDate |
2019 |
dc.date.accessioned.fl_str_mv |
2019-11-06T13:40:27Z |
dc.date.issued.fl_str_mv |
2019-08-29 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://tede2.pucrs.br/tede2/handle/tede/9007 |
url |
http://tede2.pucrs.br/tede2/handle/tede/9007 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.program.fl_str_mv |
-4570527706994352458 |
dc.relation.confidence.fl_str_mv |
500 500 |
dc.relation.cnpq.fl_str_mv |
-862078257083325301 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Pontifícia Universidade Católica do Rio Grande do Sul |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Ciência da Computação |
dc.publisher.initials.fl_str_mv |
PUCRS |
dc.publisher.country.fl_str_mv |
Brasil |
dc.publisher.department.fl_str_mv |
Escola Politécnica |
publisher.none.fl_str_mv |
Pontifícia Universidade Católica do Rio Grande do Sul |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS; instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS); instacron:PUC_RS |
instname_str |
Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) |
instacron_str |
PUC_RS |
institution |
PUC_RS |
reponame_str |
Biblioteca Digital de Teses e Dissertações da PUC_RS |
collection |
Biblioteca Digital de Teses e Dissertações da PUC_RS |
bitstream.url.fl_str_mv |
http://tede2.pucrs.br/tede2/bitstream/tede/9007/5/DIS_ALESSANDRA_MACIEL_PAZ_MILANI_COMPLETO.pdf http://tede2.pucrs.br/tede2/bitstream/tede/9007/4/ALESSANDRA+MACIELPAZMILANI_DIS.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/9007/7/DIS_ALESSANDRA_MACIEL_PAZ_MILANI_COMPLETO.pdf.jpg http://tede2.pucrs.br/tede2/bitstream/tede/9007/3/ALESSANDRA+MACIELPAZMILANI_DIS.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/9007/6/DIS_ALESSANDRA_MACIEL_PAZ_MILANI_COMPLETO.pdf.txt http://tede2.pucrs.br/tede2/bitstream/tede/9007/1/license.txt |
bitstream.checksum.fl_str_mv |
050fba58727c861e732e245a707390c7 596076026fe6bc78e6f8d0f278778917 596076026fe6bc78e6f8d0f278778917 b8765a67439b834985235f410062dffc b8765a67439b834985235f410062dffc 220e11f2d3ba5354f917c7035aadef24 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) |
repository.mail.fl_str_mv |
biblioteca.central@pucrs.br |
_version_ |
1799765343419236352 |
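
The bitstream.url.fl_str_mv, bitstream.checksum.fl_str_mv and bitstream.checksumAlgorithm.fl_str_mv fields above each hold six whitespace-separated values. A minimal Python sketch of how a downloaded file might be checked against them follows. It assumes the three fields are parallel lists matched by position, which the record does not state explicitly. The function names `pair_bitstreams` and `matches_checksum` are hypothetical and are not part of any repository API.

```python
import hashlib


def pair_bitstreams(urls_field: str, checksums_field: str, algorithms_field: str):
    """Split three whitespace-separated fields and pair their values by position.

    Assumption: the n-th URL, checksum and algorithm describe the same file.
    """
    urls = urls_field.split()
    checksums = checksums_field.split()
    algorithms = algorithms_field.split()
    if not (len(urls) == len(checksums) == len(algorithms)):
        raise ValueError("parallel fields have different lengths")
    return list(zip(urls, checksums, algorithms))


def matches_checksum(data: bytes, expected_hex: str, algorithm: str = "MD5") -> bool:
    """Return True if the digest of `data` equals the hex checksum from the record."""
    digest = hashlib.new(algorithm.lower(), data).hexdigest()
    return digest == expected_hex.lower()
```

In use, the three field values would be passed to `pair_bitstreams`, and the bytes of each downloaded file to `matches_checksum` together with its paired checksum and algorithm. MD5 is enough to detect accidental corruption in transfer, but it is not a safeguard against deliberate tampering.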