Mapping the unseen: exploiting super-resolution for semantic segmentation in low-resolution images

Bibliographic details
Main author: Matheus Barros Pereira
Publication date: 2019
Document type: Master's thesis
Language: eng
Source title: Repositório Institucional da UFMG
Full text: http://hdl.handle.net/1843/36706
Abstract: High-resolution aerial images are desirable for most deep learning-based remote sensing applications. This type of data, however, is not always accessible or affordable. Coarse-resolution remote sensing images, such as LANDSAT and MODIS, are easily found in public open repositories and are therefore widely used in many studies. The problem is that the amount of spatial information compressed into a single pixel in a low-resolution representation can compromise pattern recognition algorithms. The use of coarse-resolution data for the automatic creation of thematic maps is therefore very restricted, since most deep learning-based semantic segmentation (a.k.a. dense labeling) approaches are only suitable for subdecimeter data. Super-resolution is a classic computer vision problem that aims to restore the quality of degraded, low-resolution images. In this work, we design two frameworks to evaluate the effectiveness of deep learning-based super-resolution in the semantic segmentation of low-resolution remote sensing images. Our objective is to evaluate how effective deep learning-based super-resolution is at different levels of degradation, how it compares to unsupervised bicubic interpolation, and whether it can reconstruct small objects and thereby improve semantic segmentation. The first framework uses super-resolution as a pre-processing step for the semantic segmentation task (two-stage framework). The second framework is an end-to-end approach that trains both networks at the same time while sharing their losses. We carried out an extensive set of experiments on remote sensing datasets of distinct nature and properties. For the agricultural coffee-mapping dataset, which contains only two labels (coffee and non-coffee), low-resolution images achieved only 50% normalized accuracy at an 8x up-scaling factor. Under the same condition, the two-stage framework with super-resolution increased this value to 72%, and the end-to-end framework further increased it to 77%, compared with 81% for high-resolution data. For the urban Vaihingen dataset, super-resolution in the two-stage framework increased the accuracy of car segmentation from 19% to 58% at an 8x up-scaling factor, while the end-to-end framework achieved 65%. With high-resolution data the accuracy was 69%, which is not far from the super-resolution result. Both cases show that super-resolution can recover important texture details (for coffee crops, for example) and can make small objects that are hard to see in a low-resolution representation (such as cars) more discernible. The results show that super-resolution is effective at improving semantic segmentation performance on low-resolution aerial imagery. It not only outperforms unsupervised interpolation but also achieves semantic segmentation results comparable to those obtained with high-resolution data. Even with little training data, the frameworks still achieved better results than bicubic interpolation. Using super-resolution thus proved more effective than feeding low-resolution images directly to a semantic segmentation network. This is especially true for high degradation factors, where super-resolution shows its largest gains over low-resolution data.
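The abstract reports "normalized accuracy" without defining it. The sketch below assumes it means the average of per-class recall (often called balanced accuracy); that definition, the function name normalized_accuracy, and the toy labels are illustrative assumptions, not taken from the thesis. Under that assumption, a two-class predictor that always answers "non-coffee" scores exactly 50%, which gives a reference point for reading the 50% figure reported for low-resolution input.

```python
from collections import defaultdict


def normalized_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed definition of 'normalized accuracy')."""
    total = defaultdict(int)    # number of pixels belonging to each true class
    correct = defaultdict(int)  # correctly labeled pixels for each true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Average the recall over the classes that occur in the ground truth.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy labels: 1 = coffee, 0 = non-coffee (imbalanced on purpose).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
all_non_coffee = [0] * len(y_true)

# Plain pixel accuracy rewards the majority class; the normalized score does not.
plain = sum(t == p for t, p in zip(y_true, all_non_coffee)) / len(y_true)
balanced = normalized_accuracy(y_true, all_non_coffee)
print(plain, balanced)  # 0.8 0.5
```

Averaging per class rather than per pixel keeps a dominant background class from hiding errors on a rare class, which may be why a normalized score is quoted for the two-label coffee dataset.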
id UFMG_abe5d33428e40c2c64eb4c3e903ee0f6
oai_identifier_str oai:repositorio.ufmg.br:1843/36706
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
spelling CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico
dissertacao_final_review2.pdf application/pdf 19505437
license.txt text/plain; charset=utf-8 2118 (non-exclusive distribution license of the Repositório Institucional da UFMG)
Repositório de Publicações PUB https://repositorio.ufmg.br/oai
dc.title.pt_BR.fl_str_mv Mapping the unseen: exploiting super-resolution for semantic segmentation in low-resolution images
dc.title.alternative.pt_BR.fl_str_mv Mapeando o invisível: explorando super-resolução para segmentação semântica em imagens de baixa resolução
dc.contributor.advisor1.fl_str_mv Jefersson Alex dos Santos
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/2171782600728348
dc.contributor.referee1.fl_str_mv George Luiz Medeiros Teodoro
dc.contributor.referee2.fl_str_mv André Vital Saúde
dc.contributor.referee3.fl_str_mv Wesley Nunes Gonçalves
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/5014941996905154
dc.contributor.author.fl_str_mv Matheus Barros Pereira
dc.subject.por.fl_str_mv Remote sensing
Super resolution
Semantic segmentation
dc.subject.other.pt_BR.fl_str_mv Computing – Theses
Computer vision – Theses
Super-resolution – Theses
Remote sensing – Theses
Semantic segmentation – Theses
publishDate 2019
dc.date.issued.fl_str_mv 2019-11-11
dc.date.accessioned.fl_str_mv 2021-07-09T15:03:16Z
dc.date.available.fl_str_mv 2021-07-09T15:03:16Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/36706
dc.language.iso.fl_str_mv eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brazil
dc.publisher.department.fl_str_mv ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/36706/3/dissertacao_final_review2.pdf
https://repositorio.ufmg.br/bitstream/1843/36706/4/license.txt
bitstream.checksum.fl_str_mv bac7e982f6e0a6c2f4c0bc4d60a1a6b1
cda590c95a0b51b4d15f60c9642ca272
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_ 1797971093795897344