Preprocessing profiling model for visual analytics

Bibliographic details
Main author: Milani, Alessandra Maciel Paz
Publication date: 2019
Document type: Master's thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da PUC_RS
Full text: http://tede2.pucrs.br/tede2/handle/tede/9007
Abstract: In the information age, we have developed the ability to collect and store data, create sophisticated data mining methods, and generate rich visualizations to share the information resulting from the data analysis process. However, analyzing and managing raw data is still a challenging part of this process, particularly with regard to data preprocessing, which aims to transform raw data into an appropriate format for subsequent analysis. Although some studies propose design implications or recommendations for future visualization solutions in the data analysis scope, they do not focus on the challenges of the Preprocessing phase or on how visualization can support it. Likewise, current Visual Analytics models do not treat preprocessing as a phase as important as the others in their process, such as Data, Models, Visualization, and Knowledge. With this study, we therefore aim to contribute to the discussion of how methods of visualization and data mining can be used and combined to assist data analysts during preprocessing activities. To that end, we introduce the Preprocessing Profiling Model for Visual Analytics, which comprises a set of features intended to inspire the implementation of new solutions. These features were designed from a list of insights obtained in an interview study with thirteen data analysts. The study makes three main contributions: (a) the Preprocessing Profiling Model for Visual Analytics, as a solution to assist during the Preprocessing phase; (b) the list of ten insights, as a consolidated set of requirements for future visualization research applied to preprocessing and data mining; and (c) details on the profile of the data analysts, the main challenges they face, and the opportunities that arise while they are engaged in data mining projects in diverse organizational areas.
id P_RS_3b3cb8fe612c38dd907ee1b0a3abcaf9
oai_identifier_str oai:tede2.pucrs.br:tede/9007
network_acronym_str P_RS
network_name_str Biblioteca Digital de Teses e Dissertações da PUC_RS
repository_id_str
spelling Submitted by PPG Ciência da Computação (ppgcc@pucrs.br) on 2019-11-01T13:03:06Z. No. of bitstreams: 1. ALESSANDRA MACIELPAZMILANI_DIS.pdf: 5008530 bytes, checksum: 4236876d844c46bb90b0d6d4794d082e (MD5).
Approved for entry into archive by Sarajane Pan (sarajane.pan@pucrs.br) on 2019-11-06T13:33:54Z (GMT).
Made available in DSpace on 2019-11-06T13:40:27Z (GMT). Previous issue date: 2019-08-29.
The work has no restrictions on publication.
dc.title.por.fl_str_mv Preprocessing profiling model for visual analytics
title Preprocessing profiling model for visual analytics
spellingShingle Preprocessing profiling model for visual analytics
Milani, Alessandra Maciel Paz
Visual Analytics
Visualization Techniques
Data Mining
Preprocessing
Análise Visual
Técnicas de Visualização
Mineração de Dados
Pré-Processamento
CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
title_short Preprocessing profiling model for visual analytics
title_full Preprocessing profiling model for visual analytics
title_fullStr Preprocessing profiling model for visual analytics
title_full_unstemmed Preprocessing profiling model for visual analytics
title_sort Preprocessing profiling model for visual analytics
author Milani, Alessandra Maciel Paz
author_facet Milani, Alessandra Maciel Paz
author_role author
dc.contributor.advisor1.fl_str_mv Manssour, Isabel Harb
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/4904489502853690
dc.contributor.advisor-co1.fl_str_mv Paulovich, Fernando Vieira
dc.contributor.advisor-co1Lattes.fl_str_mv http://lattes.cnpq.br/4328003866597876
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/5764437814022359
dc.contributor.author.fl_str_mv Milani, Alessandra Maciel Paz
contributor_str_mv Manssour, Isabel Harb
Paulovich, Fernando Vieira
dc.subject.eng.fl_str_mv Visual Analytics
Visualization Techniques
Data Mining
Preprocessing
topic Visual Analytics
Visualization Techniques
Data Mining
Preprocessing
Análise Visual
Técnicas de Visualização
Mineração de Dados
Pré-Processamento
CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
dc.subject.por.fl_str_mv Análise Visual
Técnicas de Visualização
Mineração de Dados
Pré-Processamento
dc.subject.cnpq.fl_str_mv CIENCIA DA COMPUTACAO::TEORIA DA COMPUTACAO
publishDate 2019
dc.date.accessioned.fl_str_mv 2019-11-06T13:40:27Z
dc.date.issued.fl_str_mv 2019-08-29
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://tede2.pucrs.br/tede2/handle/tede/9007
url http://tede2.pucrs.br/tede2/handle/tede/9007
dc.language.iso.fl_str_mv eng
language eng
dc.relation.program.fl_str_mv -4570527706994352458
dc.relation.confidence.fl_str_mv 500
500
dc.relation.cnpq.fl_str_mv -862078257083325301
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Pontifícia Universidade Católica do Rio Grande do Sul
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv PUCRS
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Escola Politécnica
publisher.none.fl_str_mv Pontifícia Universidade Católica do Rio Grande do Sul
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS
instname:Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron:PUC_RS
instname_str Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
instacron_str PUC_RS
institution PUC_RS
reponame_str Biblioteca Digital de Teses e Dissertações da PUC_RS
collection Biblioteca Digital de Teses e Dissertações da PUC_RS
bitstream.url.fl_str_mv http://tede2.pucrs.br/tede2/bitstream/tede/9007/5/DIS_ALESSANDRA_MACIEL_PAZ_MILANI_COMPLETO.pdf
http://tede2.pucrs.br/tede2/bitstream/tede/9007/4/ALESSANDRA+MACIELPAZMILANI_DIS.pdf.jpg
http://tede2.pucrs.br/tede2/bitstream/tede/9007/7/DIS_ALESSANDRA_MACIEL_PAZ_MILANI_COMPLETO.pdf.jpg
http://tede2.pucrs.br/tede2/bitstream/tede/9007/3/ALESSANDRA+MACIELPAZMILANI_DIS.pdf.txt
http://tede2.pucrs.br/tede2/bitstream/tede/9007/6/DIS_ALESSANDRA_MACIEL_PAZ_MILANI_COMPLETO.pdf.txt
http://tede2.pucrs.br/tede2/bitstream/tede/9007/1/license.txt
bitstream.checksum.fl_str_mv 050fba58727c861e732e245a707390c7
596076026fe6bc78e6f8d0f278778917
596076026fe6bc78e6f8d0f278778917
b8765a67439b834985235f410062dffc
b8765a67439b834985235f410062dffc
220e11f2d3ba5354f917c7035aadef24
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da PUC_RS - Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS)
repository.mail.fl_str_mv biblioteca.central@pucrs.br
_version_ 1799765343419236352