Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines
Main author: | Oliveira, Artur Andre Almeida de Macedo |
---|---|
Publication date: | 2023 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/45/45134/tde-22032024-184659/ |
Abstract: | Urban image classification at street level poses significant challenges due to the presence of diverse elements, varying appearances, and complex poses. Factors such as occlusion, background clutter, environmental conditions, and camera viewpoints further complicate the classification process. In this study, we leverage the capabilities of state-of-the-art Deep Learning Networks (DLNs), including MobileNets, ResNets, DenseNets, and EfficientNets, to tackle these challenges head-on. We aim to evaluate the performance of these DLNs, identify limitations, and propose innovative techniques for overcoming them. Our research focuses on the specific task of classifying urban images with or without trees near overhead powerlines. Through an extensive exploration, we provide methods and insights that not only address this classification problem but also offer generalizable solutions applicable to a range of classification tasks. Two major contributions are introduced in our work. Firstly, we extend the INvestigate and Analyze a CITY (INACITY) platform by integrating a graph-oriented database, improving the performance and coverage of urban image collection from Google Street View. Secondly, we develop the Street-Level Image Labeler (SLIL) tool, which efficiently mitigates the manual labeling burden, facilitating dataset creation. With the help of INACITY and SLIL, we curate a comprehensive labeled dataset comprising 8,800 street-level urban images. Human evaluation of the dataset reveals the presence of challenging images that perplex even experienced classifiers. For example, distinguishing whether powerlines intersect or pass behind tree canopies can be difficult depending on the perspective. The comparison of state-of-the-art DLNs on this dataset reveals that the highest accuracy achieved by plain DLNs is 74.6%. However, by introducing a new class, distinct from positive or negative, and employing the noisy student training protocol and focal loss, we raise the recall for the positive class from 66.5% to 83.7% and for the negative class from 63.7% to 78.8%. This approach enables us to better identify and classify images that were previously prone to misclassification. |
id | USP_4fff26025b6445ea3c14110a2ab236ea |
---|---|
oai_identifier_str | oai:teses.usp.br:tde-22032024-184659 |
network_acronym_str | USP |
network_name_str | Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str | 2721 |
dc.title.none.fl_str_mv | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines / Superando imagens urbanas desafiantes: métodos de aprendizagem profunda e integração de dados para deteção de emaranhamentos entre árvores e fios elétricos |
title | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines |
spellingShingle | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines Oliveira, Artur Andre Almeida de Macedo Aprendizagem profunda Computer vision Deep learning Dificuldade de instância Imagens urbanas Instance hardness Urban images Visão computacional |
title_short | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines |
title_full | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines |
title_fullStr | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines |
title_full_unstemmed | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines |
title_sort | Overcoming challenging urban images: deep learning and data integration methods for detecting trees entangled with power lines |
author | Oliveira, Artur Andre Almeida de Macedo |
author_facet | Oliveira, Artur Andre Almeida de Macedo |
author_role | author |
dc.contributor.none.fl_str_mv | Hirata Junior, Roberto |
dc.contributor.author.fl_str_mv | Oliveira, Artur Andre Almeida de Macedo |
dc.subject.por.fl_str_mv | Aprendizagem profunda; Computer vision; Deep learning; Dificuldade de instância; Imagens urbanas; Instance hardness; Urban images; Visão computacional |
topic | Aprendizagem profunda; Computer vision; Deep learning; Dificuldade de instância; Imagens urbanas; Instance hardness; Urban images; Visão computacional |
publishDate | 2023 |
dc.date.none.fl_str_mv | 2023-08-11 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/doctoralThesis |
format | doctoralThesis |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | https://www.teses.usp.br/teses/disponiveis/45/45134/tde-22032024-184659/ |
url | https://www.teses.usp.br/teses/disponiveis/45/45134/tde-22032024-184659/ |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | |
dc.rights.driver.fl_str_mv | Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv | Release the content for public access. |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.coverage.none.fl_str_mv | |
dc.publisher.none.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP |
publisher.none.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv | reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str | Universidade de São Paulo (USP) |
instacron_str | USP |
institution | USP |
reponame_str | Biblioteca Digital de Teses e Dissertações da USP |
collection | Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv | virginia@if.usp.br; atendimento@aguia.usp.br |
_version_ | 1815257265990533120 |