Novel visual approaches for attribute analysis, selection, and prediction

Bibliographic details
Main author: Júnior, Erasmo Artur da Silva
Publication date: 2020
Document type: Thesis
Language: eng
Source title: Biblioteca Digital de Teses e Dissertações da USP
Full text: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082020-175620/
Abstract: Data collection and storage capabilities are growing rapidly, but the general ability to process and analyze large amounts of data is increasing at a slower rate. This mismatch poses new challenges for methods that handle large data volumes, such as those in data mining, statistics, and machine learning. To help close this gap, visual approaches have been proposed that combine human capabilities with established solutions in interactive tools that allow a more in-depth investigation of the data. A substantial number of visual approaches have focused on item-based techniques, where the data items are the first-order objects. Nevertheless, valuable knowledge frequently emerges from observing relationships between the attributes of these data items, such as the relationships between numerical and categorical variables, which often encode relevant information. In this context, a visual analysis approach for attribute space exploration is paramount, both when hypothesized correlations must be confirmed and when such relationships are unknown or unforeseen. In this thesis, we propose an approach for attribute analysis based on the simultaneous presentation of multiple correlations through a point-based visualization that aims to build cognitive maps of these relationships for the end user. The analysis process also supports additional tasks such as feature selection and the development of prediction models based on a target outcome. We show the efficiency of the approaches through a series of case studies and usage scenarios involving real data sets in distinct contexts.
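The idea of showing many attribute correlations at once as points can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes purely numerical attributes, uses Pearson correlation, takes 1 - |r| as the dissimilarity between two attributes, and places the attributes with classical multidimensional scaling. The function name `attribute_map` and the synthetic data are hypothetical.

```python
import numpy as np

def attribute_map(X):
    """Place each attribute (column of X) at a 2-D point so that strongly
    correlated attributes land close together (classical MDS).
    Assumption: all attributes are numerical."""
    corr = np.corrcoef(X, rowvar=False)        # attribute-by-attribute Pearson r
    dist = 1.0 - np.abs(corr)                  # assumed dissimilarity measure
    n = dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (dist ** 2) @ J             # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)             # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:2]           # indices of the two largest
    coords = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
    return corr, coords

# Hypothetical data: b nearly duplicates a, c is independent of both.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.05 * rng.normal(size=500)
c = rng.normal(size=500)
X = np.column_stack([a, b, c])

corr, coords = attribute_map(X)
for name, (x, y) in zip("abc", coords):
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```

In this sketch the points for `a` and `b` should nearly coincide while `c` sits apart. Categorical attributes, which the abstract also mentions, would need a different association measure.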
id USP_d6bf7799d01344033468b93f23a91b1e
oai_identifier_str oai:teses.usp.br:tde-31082020-175620
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str 2721
spelling 2020-09-01T00:03:01Z; http://www.teses.usp.br/PUB; http://www.teses.usp.br/cgi-bin/mtd2br.pl; opendoar:2721
dc.title.none.fl_str_mv Novel visual approaches for attribute analysis, selection, and prediction
Novas abordagens visuais para análise, seleção e predição de atributos
title Novel visual approaches for attribute analysis, selection, and prediction
author Júnior, Erasmo Artur da Silva
author_facet Júnior, Erasmo Artur da Silva
author_role author
dc.contributor.none.fl_str_mv Minghim, Rosane
dc.contributor.author.fl_str_mv Júnior, Erasmo Artur da Silva
dc.subject.por.fl_str_mv Attribute space analysis
Data visualization
Feature selection
Predictive visual analytics
Visual analytics
publishDate 2020
dc.date.none.fl_str_mv 2020-06-26
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082020-175620/
url https://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082020-175620/
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br || atendimento@aguia.usp.br
_version_ 1815257517617315840