Novel visual approaches for attribute analysis, selection, and prediction
Main author: | Júnior, Erasmo Artur da Silva |
---|---|
Publication date: | 2020 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082020-175620/ |
Abstract: | While data collection and storage capabilities are growing rapidly, the general ability to process and analyze large amounts of data is increasing at a slower rate. This asynchrony introduces new challenges for methods that handle large amounts of data, such as those in data mining, statistics, and machine learning. To help address this gap, visual approaches have been proposed that combine human capabilities with consolidated solutions in interactive tools, allowing a more in-depth investigation of the data. A substantial number of visual approaches have focused on item-based techniques, where the data items are the first-order objects. Nevertheless, valuable knowledge frequently emerges from observing relationships between attributes of these data items, such as the relationships between numerical and categorical variables, which often encode relevant information. In this context, a visual analysis approach for attribute space exploration is paramount, both when there are hypothesized correlations that must be confirmed and when such relationships are unknown or unforeseen. In this thesis, we propose an approach for attribute analysis based on the simultaneous presentation of multiple correlations through a point-based visualization, aiming to build cognitive maps of these relationships for the end user. The analysis process also supports additional tasks such as feature selection and the development of prediction models based on a target outcome. We show the efficiency of the approaches through a series of case studies and usage scenarios involving real data sets in distinct contexts. |
id |
USP_d6bf7799d01344033468b93f23a91b1e |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-31082020-175620 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
dc.title.none.fl_str_mv |
Novel visual approaches for attribute analysis, selection, and prediction |
title |
Novel visual approaches for attribute analysis, selection, and prediction |
title_short |
Novel visual approaches for attribute analysis, selection, and prediction |
title_full |
Novel visual approaches for attribute analysis, selection, and prediction |
title_fullStr |
Novel visual approaches for attribute analysis, selection, and prediction |
title_full_unstemmed |
Novel visual approaches for attribute analysis, selection, and prediction |
title_sort |
Novel visual approaches for attribute analysis, selection, and prediction |
author |
Júnior, Erasmo Artur da Silva |
author_facet |
Júnior, Erasmo Artur da Silva |
author_role |
author |
dc.contributor.none.fl_str_mv |
Minghim, Rosane |
dc.contributor.author.fl_str_mv |
Júnior, Erasmo Artur da Silva |
dc.subject.por.fl_str_mv |
Attribute space analysis; Data visualization; Feature selection; Predictive visual analytics; Visual analytics |
topic |
Attribute space analysis; Data visualization; Feature selection; Predictive visual analytics; Visual analytics |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-06-26 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082020-175620/ |
url |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082020-175620/ |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
|
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Release the content for public access. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.coverage.none.fl_str_mv |
|
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br; atendimento@aguia.usp.br |
_version_ |
1815257517617315840 |