Semi-Supervised Self-Organizing Maps with Time-Varying Structures for Clustering and Classification

Bibliographic details
Main author: BRAGA, Pedro Henrique Magalhães
Publication date: 2019
Document type: Dissertation
Language: eng
Source title: Repositório Institucional da UFPE
Full text: https://repositorio.ufpe.br/handle/123456789/33484
Abstract: In recent years, advances in technology have produced datasets of increasing size, not only in the number of samples but also in the number of features. Despite these advances, creating a sufficiently large amount of properly labeled data with enough examples for each class is not an easy task. Organizing and labeling such data is challenging, expensive, and time-consuming. It is also usually done manually, and people label with different formats and styles, which introduces noise and errors into the dataset. Hence, there is growing interest in semi-supervised learning, since many learning tasks offer a plentiful supply of unlabeled data but too little labeled data. At the current stage of research, it is therefore important to put forward semi-supervised learning models that combine both types of data and benefit from the distinct information each provides, in order to obtain better performance in both clustering and classification tasks, which would expand the range of machine learning applications. It is also important to develop methods that are easy to parameterize, so that they become robust to the different characteristics of the data at hand. In this sense, Self-Organizing Maps (SOM) are a good option for addressing these objectives. A SOM is a biologically inspired neural model that uses unsupervised and incremental learning to produce prototypes of the input data. However, its unsupervised nature prevents a SOM from performing semi-supervised learning. This Dissertation therefore presents new SOM-based proposals that perform semi-supervised learning for both clustering and classification. This is done by introducing into SOM the standard concepts of Learning Vector Quantization (LVQ), which can be seen as its supervised counterpart, to build hybrid approaches. The proposed models can dynamically switch between the two types of learning at training time, according to the availability of labels, and automatically adjust themselves to the local variance observed in each data cluster. The experimental results show that the proposed models can surpass other traditional methods not only in classification but also in clustering quality. The work also broadens the range of possible applications of SOM- and LVQ-based models by combining them with recent and promising Deep Learning techniques to solve more complex problems commonly found in that field.
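The abstract does not state the update rules, so the following is only a minimal sketch of the general idea of switching between unsupervised and supervised updates according to label availability. It assumes a classic Kohonen SOM step on a 1-D chain of prototypes for unlabeled samples and an LVQ1 step for labeled ones. It is not the dissertation's model, and it leaves out the time-varying structure and the local-variance adaptation mentioned in the abstract. The names `hybrid_step`, `UNLABELED`, `lr`, and `sigma` are hypothetical.

```python
import numpy as np

UNLABELED = -1  # hypothetical marker meaning "no label available"


def hybrid_step(prototypes, proto_labels, x, y, lr=0.1, sigma=1.0):
    """Update `prototypes` (float array, shape [k, d]) in place for one sample.

    y == UNLABELED -> SOM step: the winner and its neighbors on a 1-D chain
                      move toward x, weighted by a Gaussian neighborhood.
    otherwise      -> LVQ1 step: the winner moves toward x if its class
                      matches y, and away from x if it does not.
    Returns the index of the winning (nearest) prototype.
    """
    distances = np.linalg.norm(prototypes - x, axis=1)
    winner = int(np.argmin(distances))

    if y == UNLABELED:
        # Distance of every prototype to the winner along the chain.
        chain_dist = np.arange(len(prototypes)) - winner
        h = np.exp(-(chain_dist ** 2) / (2.0 * sigma ** 2))
        prototypes += lr * h[:, None] * (x - prototypes)
    else:
        sign = 1.0 if proto_labels[winner] == y else -1.0
        prototypes[winner] += sign * lr * (x - prototypes[winner])

    return winner
```

In a training loop under these assumptions, one would call `hybrid_step` once per sample and pass `UNLABELED` as `y` whenever the sample has no label, so the same set of prototypes is shaped by both kinds of data.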
id UFPE_060a7cf75d1a99073350aade58c8a677
oai_identifier_str oai:repositorio.ufpe.br:123456789/33484
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str 2221
spelling BRAGA, Pedro Henrique Magalhães
BASSANI, Hansenclever de França
CNPq
mestrado
THUMBNAIL image/jpeg 1281
ORIGINAL application/pdf 3516461
CC-LICENSE application/rdf+xml; charset=utf-8 811
LICENSE text/plain; charset=utf-8 2310
TEXT text/plain 202038
2019-10-25 08:34:24.687
https://repositorio.ufpe.br/oai/request
opendoar:2221
dc.title.pt_BR.fl_str_mv Semi-Supervised Self-Organizing Maps with Time-Varying Structures for Clustering and Classification
author BRAGA, Pedro Henrique Magalhães
author_facet BRAGA, Pedro Henrique Magalhães
author_role author
dc.contributor.authorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/2868489638143233
dc.contributor.advisorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/1931667959910637
dc.contributor.author.fl_str_mv BRAGA, Pedro Henrique Magalhães
dc.contributor.advisor1.fl_str_mv BASSANI, Hansenclever de França
contributor_str_mv BASSANI, Hansenclever de França
dc.subject.por.fl_str_mv Inteligência Computacional
Mapas Auto-Organizáveis
Aprendizagem Semi-Supervisionada
topic Computational Intelligence
Self-Organizing Maps
Semi-Supervised Learning
publishDate 2019
dc.date.accessioned.fl_str_mv 2019-09-23T18:08:25Z
dc.date.available.fl_str_mv 2019-09-23T18:08:25Z
dc.date.issued.fl_str_mv 2019-02-26
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/33484
url https://repositorio.ufpe.br/handle/123456789/33484
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv UFPE
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
bitstream.url.fl_str_mv https://repositorio.ufpe.br/bitstream/123456789/33484/5/DISSERTA%c3%87%c3%83O%20Pedro%20Henrique%20Magalh%c3%a3es%20Braga.pdf.jpg
https://repositorio.ufpe.br/bitstream/123456789/33484/1/DISSERTA%c3%87%c3%83O%20Pedro%20Henrique%20Magalh%c3%a3es%20Braga.pdf
https://repositorio.ufpe.br/bitstream/123456789/33484/2/license_rdf
https://repositorio.ufpe.br/bitstream/123456789/33484/3/license.txt
https://repositorio.ufpe.br/bitstream/123456789/33484/4/DISSERTA%c3%87%c3%83O%20Pedro%20Henrique%20Magalh%c3%a3es%20Braga.pdf.txt
bitstream.checksum.fl_str_mv d1ccf97c52c9be1e7f7a5c4a01bdd306
53b4ec5b9247fc14aa7965377a927e38
e39d27027a6cc9cb039ad269a5db8e34
bd573a5ca8288eb7272482765f819534
48f54ce6e0a3d0ec2d7ae6bffbd23dc8
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1802310766310719488