Integrating Vision and Language for Automatic Face Descriptions

Rodrigues, Diogo Manuel de Castro

Bibliographic details
Main author:	Rodrigues, Diogo Manuel de Castro
Publication date:	2018
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10316/86752
Summary:	Integrated Master's dissertation in Electrical and Computer Engineering, presented to the Faculdade de Ciências e Tecnologia

Item metadata

id	RCAP_5b7fb8261c0092e8503de0b4927dc666
oai_identifier_str	oai:estudogeral.uc.pt:10316/86752
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
spelling	Integrating Vision and Language for Automatic Face Descriptions; Integrando Visão e Linguagem para Descrições Faciais Automáticas. Keywords: Artificial Intelligence; Deep Learning; Convolutional Neural Network; Generative Adversarial Network; Natural Language Processing. Integrated Master's dissertation in Electrical and Computer Engineering, presented to the Faculdade de Ciências e Tecnologia.
In this dissertation, computer vision and Natural Language Processing (NLP) are integrated to create a unique example of a face-to-text and text-to-face system. The intention is to provide a solution that helps people do their jobs better and faster. The system could be used, for example, to describe faces for visually impaired people or to generate faces from descriptions for criminal investigations. However, this is only a preliminary version, because the goal proved too ambitious for the time available.
To this end, a system was created that can describe facial images in text and automatically generate face images from text descriptions. The system is divided into two sub-systems. The first predicts attributes from face images with a Convolutional Neural Network (CNN); these attributes then serve as the basis for the Natural Language Generation (NLG) model, which generates descriptions with a rule-based methodology. The second uses a simple keyword extraction technique to analyze the text and identify the attributes in the description. It then uses a conditional Generative Adversarial Network (GAN) to generate a facial image with the desired set of attributes. Attributes are used as the basis of the method because they are a dominant identifier that efficiently conveys the characteristics of a face.
The results demonstrate, once again, that CNN and GAN methods are presently the best options for recognition and generation tasks, respectively, a conclusion supported by their convincing results. The NLP methods worked well for their purposes, but their results are less remarkable, especially those of the NLG model. This work proposes a reliable and functional solution for this complex system. Nevertheless, the area needs extensive further investigation and development.
2018-09-24 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis http://hdl.handle.net/10316/86752 TID:202219380 eng Rodrigues, Diogo Manuel de Castro info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2021-07-22T10:04:22Z oai:estudogeral.uc.pt:10316/86752 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T21:07:52.089183 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false
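The two language steps named in the abstract (rule-based description from predicted attributes, and keyword extraction from a description back to attributes) could be sketched roughly as follows. This is a minimal Python illustration, not the thesis's implementation: the attribute names, phrases, keywords and sentence template are all hypothetical, and the CNN and conditional GAN stages are left out.

```python
import re

# Hypothetical attribute vocabulary (not taken from the thesis):
# each attribute has a phrase for generation and a keyword for extraction.
ATTRIBUTES = {
    "black_hair": {"phrase": "black hair", "keyword": "black hair"},
    "eyeglasses": {"phrase": "eyeglasses", "keyword": "eyeglasses"},
    "beard": {"phrase": "a beard", "keyword": "beard"},
    "smiling": {"phrase": "a smile", "keyword": "smile"},
}


def describe(predicted):
    """Rule-based generation: turn attribute flags into one sentence."""
    phrases = [
        spec["phrase"]
        for name, spec in ATTRIBUTES.items()
        if predicted.get(name)
    ]
    if not phrases:
        return "This person has no notable attributes."
    if len(phrases) == 1:
        body = phrases[0]
    else:
        # Join all but the last phrase with commas, then add "and".
        body = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return f"This person has {body}."


def extract_attributes(text):
    """Keyword extraction: flag each attribute whose keyword occurs in the text."""
    lowered = text.lower()
    return {
        name: re.search(r"\b" + re.escape(spec["keyword"]) + r"\b", lowered)
        is not None
        for name, spec in ATTRIBUTES.items()
    }


if __name__ == "__main__":
    # Stand-in for the output of an attribute classifier.
    predicted = {"black_hair": True, "eyeglasses": False,
                 "beard": True, "smiling": True}
    sentence = describe(predicted)
    print(sentence)
    print(extract_attributes(sentence))
```

In a full pipeline of the kind the abstract describes, the flags passed to `describe` would come from the attribute classifier, and the flags returned by `extract_attributes` would condition the image generator. Plain keyword matching of this sort does not handle negation ("no beard"), which is one reason it is only a simple baseline.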
dc.title.none.fl_str_mv	Integrating Vision and Language for Automatic Face Descriptions; Integrando Visão e Linguagem para Descrições Faciais Automáticas
title	Integrating Vision and Language for Automatic Face Descriptions
author	Rodrigues, Diogo Manuel de Castro
author_facet	Rodrigues, Diogo Manuel de Castro
author_role	author
dc.contributor.author.fl_str_mv	Rodrigues, Diogo Manuel de Castro
dc.subject.por.fl_str_mv	Artificial Intelligence; Deep Learning; Convolutional Neural Network; Generative Adversarial Network; Natural Language Processing
topic	Artificial Intelligence; Deep Learning; Convolutional Neural Network; Generative Adversarial Network; Natural Language Processing
description	Integrated Master's dissertation in Electrical and Computer Engineering, presented to the Faculdade de Ciências e Tecnologia
publishDate	2018
dc.date.none.fl_str_mv	2018-09-24
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10316/86752 TID:202219380
url	http://hdl.handle.net/10316/86752
identifier_str_mv	TID:202219380
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799133969154834432
