Combining simulated and real images in deep learning

Bibliographic details
Main author: Pedro Xavier Tavares Monteiro Correia de Pinho
Publication date: 2021
Document type: Dissertation
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: https://hdl.handle.net/10216/135459
Abstract: Training a deep learning (DL) model that generalizes successfully to unseen cases requires considerable amounts of data. Such data is often labeled manually, which makes annotation costly and time-consuming. We propose using simulated data, obtained from simulators, to meet the growing need for annotated data. Although simulated environments offer an unlimited and cost-effective supply of automatically annotated data, that data is still synthetic and therefore differs from real-world data in representation and distribution. The field that addresses the problem of merging the useful features of each domain is called domain adaptation (DA), a branch of transfer learning. Several advances have been made in this field, from fine-tuning existing networks to sample-reconstruction approaches. Adversarial DA methods, which make use of Generative Adversarial Networks (GANs), are state-of-the-art and the most widely used. In previous approaches, training data was sourced from existing datasets, and the use of simulators to obtain new observations was not fully explored. We aim to survey DA techniques and apply them to training DL models on simulated data. A proof of concept will be developed, stemming from a previous project that aimed to automate quality control at the end of a vehicle production line. In that project, a DL model that identified vehicle parts was trained using only data obtained from a simulator. By using DA techniques to combine simulated and real images, a new model will be trained so that it can be applied to the real world more effectively. The model's performance when trained on both types of data will be compared with its performance when trained exclusively on one of the two. We believe this approach can be extended to areas where, until now, the use of DL was not feasible because of the constraints imposed by data collection.
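The three-way comparison described in the abstract (simulated only, real only, both combined) can be illustrated with a minimal data-mixing sketch. This is not the thesis's method and it is not a domain adaptation technique; it only shows one simple way to control the share of real images in each training batch. The function name `mixed_batch`, the `(path, label)` sample representation, and the 0.5 ratio for the combined setting are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical sample representation: (image path, class label).
Sample = Tuple[str, str]


def mixed_batch(
    simulated: Sequence[Sample],
    real: Sequence[Sample],
    batch_size: int,
    real_fraction: float,
    rng: random.Random,
) -> List[Sample]:
    """Draw one batch containing a fixed share of real samples.

    real_fraction = 0.0 gives a simulated-only batch, 1.0 a real-only
    batch, and anything in between a combined batch.
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must be between 0 and 1")
    n_real = round(batch_size * real_fraction)
    n_sim = batch_size - n_real
    if (n_real and not real) or (n_sim and not simulated):
        raise ValueError("a required source dataset is empty")
    # Sample with replacement so a small real dataset can still fill its share.
    batch = rng.choices(real, k=n_real) if n_real else []
    batch += rng.choices(simulated, k=n_sim) if n_sim else []
    rng.shuffle(batch)
    return batch


# Assumed experimental settings mirroring the comparison in the abstract.
SETTINGS = {"simulated_only": 0.0, "real_only": 1.0, "combined": 0.5}
```

Sampling with replacement is a deliberate choice here: real annotated images are usually the scarce resource, so the real share of a batch may need to reuse samples while the simulated share rarely does.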
id RCAP_fbe28a6dd4d35f4fa0d367c1ad6b450c
oai_identifier_str oai:repositorio-aberto.up.pt:10216/135459
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.subject.por.fl_str_mv Engenharia electrotécnica, electrónica e informática
Electrical engineering, Electronic engineering, Information engineering
publishDate 2021
dc.date.none.fl_str_mv 2021-07-19
2021-07-19T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10216/135459
TID:202824764
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
_version_ 1799135630057275392