Multi-layer analysis of convolutional neural networks for transfer learning applications
Main author: | Condori, Rayner Harold Montes |
---|---|
Publication date: | 2022 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/55/55134/tde-25072022-165116/ |
Abstract: | Deep learning has become a hot topic in artificial intelligence due to its ability to model complex concepts from simple ones. In this regard, the convolutional neural network (CNN) is one of the most popular kinds of neural networks currently used in computer vision and related areas. Two factors in particular contributed to its popularity. (i) With enough data, most CNNs can be trained from scratch and learn powerful representations that solve the task at hand. (ii) With a limited volume of data, it is still possible to learn powerful representations by adapting the knowledge of a pre-trained CNN model via a transfer learning strategy. As a result, CNNs have advanced the state of the art in many visual recognition tasks, leading to numerous applications in fields outside computer science, such as medicine and biology. Nevertheless, many of the best research efforts focus on improving the state of the art on a few datasets, such as ImageNet for image classification and COCO for object detection. Meanwhile, research progress in many other domains is reduced to blindly applying existing approaches or re-inventing everything from scratch, resulting in flawed methods in both cases. Therefore, this thesis focuses on understanding, through systematic experiments, why and when a pre-trained CNN model underperforms on a given task, in order to propose suitable solutions. In the first part of our study, we examine the task of texture recognition and find that previous studies tended to focus exclusively on category-based texture datasets, leading to the misconception that only the deepest layers hold the texture information needed to solve that task. We then show, by proposing multilayer transfer learning strategies, that the contribution of shallow layers is non-trivial and should be exploited in certain applications. In the second part of our study, we focus on challenging object detection tasks (pollen grain detection and stomata localization), where we observe a situation similar to that of texture recognition. In both cases, we therefore also apply multilayer analysis to propose fast single-stage detectors that can handle large images accurately and efficiently. |
id |
USP_3ed98e987887b0325ffc0050f802b07b |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-25072022-165116 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
dc.title.none.fl_str_mv |
Multi-layer analysis of convolutional neural networks for transfer learning applications Análise multicamada de redes neurais convolucionais para aplicações de transferência de conhecimento |
title |
Multi-layer analysis of convolutional neural networks for transfer learning applications |
author |
Condori, Rayner Harold Montes |
author_facet |
Condori, Rayner Harold Montes |
author_role |
author |
dc.contributor.none.fl_str_mv |
Bruno, Odemir Martinez |
dc.contributor.author.fl_str_mv |
Condori, Rayner Harold Montes |
dc.subject.por.fl_str_mv |
Activation maps; Classificação de imagens; Computer vision; Convolutional neural networks; Detecção de objetos; Image classification; Mapas de ativação; Object detection; Redes neurais convolucionais; Transfer learning; Transferência de conhecimento; Visão por computador |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-05-17 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-25072022-165116/ |
url |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-25072022-165116/ |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
|
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Release the content for public access. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.coverage.none.fl_str_mv |
|
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br || atendimento@aguia.usp.br |
_version_ |
1809091069884760064 |