A comparative study of data augmentation techniques for image classification: generative models vs. classical transformations

Bibliographic details
Main author: Gonçalves, Guilherme Marques
Publication date: 2020
Document type: Dissertation (master's thesis)
Language: eng
Source: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10773/30759
Access: Open access (application/pdf)
Keywords: Generative models; Adversarial networks; Deep learning; Data augmentation; Transfer learning; Image synthesis; Skin lesion classification
Abstract: Advances in deep convolutional neural networks and efficient parallel processing are showing great promise when applied to image classification, object detection, image restoration and image segmentation. However, deep models require large amounts of annotated training data, which are not always available. In this context, data augmentation has emerged as an effective technique by which the original dataset is expanded to cope with imbalanced datasets, avoid overfitting, and increase classification performance. This dissertation compares the effectiveness of data augmentation techniques applied to image classification problems, focusing on basic image manipulations and generative modelling. On the one hand, basic image manipulations include classical transformations of the original samples such as rotations, translations, flips and crops. On the other hand, generative adversarial networks (GANs) are used to synthesize artificial samples from the original dataset through adversarial training. This comparative study considers two distinct classification problems, handwritten digit recognition and melanoma skin cancer diagnosis, both addressed with convolutional neural network models. A baseline multiclass classifier was developed from scratch for handwritten digit recognition using the MNIST dataset. The binary melanoma classification uses pre-trained models, namely VGG16 and DenseNet201, on the ISIC2019 dataset. For generating handwritten digits, GAN-based data augmentation is supported by Deep Convolutional GANs (DCGANs) and Conditional GANs (cGANs). More advanced architectures, such as Progressive GANs (PGANs) and StyleGANs, are used for synthesizing melanoma dermoscopic images. The results demonstrate that basic image manipulations perform remarkably well in classification tasks. Furthermore, GAN-based data augmentation does not yet compete with classical techniques, especially in problems that require high-quality, realistic images, as is the case in medical applications. Nevertheless, it is shown that StyleGAN2-ADA improves the balanced accuracy by 2.1% when compared with the CNN model without any kind of augmentation. The combination of classical and synthetic augmentations may be the best option in the near future.
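To illustrate the classical transformations mentioned in the abstract (rotations, translations, flips and crops), the short sketch below builds an augmentation pipeline with Keras preprocessing layers. The specific parameter values (rotation range, translation fraction, crop size) are illustrative assumptions, not the settings used in the dissertation.

```python
# Illustrative sketch of classical data augmentation, not the dissertation's
# exact pipeline. All parameter values are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

classical_augmentation = tf.keras.Sequential([
    layers.RandomRotation(factor=0.05),                  # rotations of up to ~±18 degrees
    layers.RandomTranslation(height_factor=0.1,
                             width_factor=0.1),          # shifts of up to 10% of the image size
    layers.RandomFlip(mode="horizontal"),                # horizontal flips
    layers.RandomCrop(height=224, width=224),            # random crops (assumes larger inputs)
])

# During training, each batch passes through these layers, so the network sees
# a slightly different version of every image at each epoch, which effectively
# enlarges the dataset without collecting new samples.
```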
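GAN-based augmentation, by contrast, first trains a generator on the original data and then draws synthetic samples from it. The sketch below shows only the sampling step, assuming `generator` is an already trained Keras generator (e.g. a DCGAN generator) that maps a latent vector to an image; the function name and the 100-dimensional latent size are hypothetical choices for illustration.

```python
# Sampling step of GAN-based augmentation. `generator` is assumed to be a
# trained Keras model mapping a latent vector of size `latent_dim` to an image;
# both names are illustrative, not taken from the dissertation's code.
import numpy as np
import tensorflow as tf

def synthesize_samples(generator, n_samples, latent_dim=100):
    z = tf.random.normal(shape=(n_samples, latent_dim))  # random latent codes
    fake_images = generator(z, training=False)            # synthetic images
    return fake_images.numpy()

# The synthetic images are then appended to the real training set, e.g.:
# x_train_aug = np.concatenate([x_train, synthesize_samples(generator, 5000)])
```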
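For the melanoma classifier, the abstract mentions transfer learning from VGG16 and DenseNet201. A minimal sketch of that setup with a frozen VGG16 backbone and a new binary head is given below; the head architecture, input size and optimizer are assumptions for illustration and may differ from the models trained in the dissertation.

```python
# Minimal transfer-learning sketch: frozen VGG16 backbone plus a binary head.
# Head layout, input size and optimizer are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

backbone = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
backbone.trainable = False  # keep ImageNet features fixed; only the head is trained

model = models.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # melanoma vs. non-melanoma
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

The same pattern applies to DenseNet201 by swapping the backbone; a balanced-accuracy metric, as reported in the abstract, would be computed separately on the validation predictions.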