Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation
Main author: | Vilaça, Luís Miguel Salgado Nunes |
---|---|
Publication date: | 2020 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10400.22/18170 |
Abstract: | Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variation in image-capture conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm trained on such a dataset should distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class. Most approaches to Facial Recognition use large amounts of data to develop models for extracting facial features, making them unfeasible for several application scenarios. For this reason, ensuring the variability of the representations of each person is an important requirement. Achieving this goal also helps to eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements. The work developed in this dissertation investigates the impact of selecting a reduced number of images in a Facial Recognition problem when using a Deep Learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature engineering pipeline to discriminate the diversity of faces and thereby increase the amount of information in the selected data. One of our contributions is the identification of a subgroup of metrics capable of representing diversity. As a further step, we propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach that tries to maximise the distance between selected items, thus maximising diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot products between their feature vectors. The experimental tests confirm the gain of the proposed methodology over a standard random selection approach, proving it effective at reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset. |
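The cluster-based selection described in the abstract above (picking images so that the distance between selected items is maximised) is only named in this record, not detailed. As an illustration only, the sketch below shows one common way such a rule can be realised, a greedy farthest-point selection over pre-computed face descriptors; the function name, the Euclidean metric, and the seeding rule are assumptions made here, not details taken from the dissertation.

```python
import numpy as np


def farthest_point_selection(features: np.ndarray, k: int) -> list[int]:
    """Greedily pick k row indices so that each new pick is as far as possible
    (Euclidean distance) from the items already selected."""
    n = features.shape[0]
    k = min(k, n)
    if k <= 0:
        return []
    # Seed with the descriptor farthest from the mean, so the start is not arbitrary.
    centroid = features.mean(axis=0)
    first = int(np.argmax(np.linalg.norm(features - centroid, axis=1)))
    selected = [first]
    # Distance from every item to its nearest already-selected item.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # the item farthest from the current selection
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[nxt], axis=1))
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    descriptors = rng.normal(size=(500, 128))  # stand-in for per-image face embeddings
    kept = farthest_point_selection(descriptors, k=25)
    print(f"kept {len(kept)} of {descriptors.shape[0]} images")
```

Spreading the picks across the feature space in this way is one simple proxy for the diversity objective the abstract describes; the dissertation's own clustering-based procedure may differ.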
id |
RCAP_a3b6eeba0001614b3458319661fc21df |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/18170 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation
Computer Vision; Machine Learning; Deep Learning; Facial Recognition; Instance Selection; Efficient Data Usage
Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variation in image-capture conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm trained on such a dataset should distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class. Most approaches to Facial Recognition use large amounts of data to develop models for extracting facial features, making them unfeasible for several application scenarios. For this reason, ensuring the variability of the representations of each person is an important requirement. Achieving this goal also helps to eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements. The work developed in this dissertation investigates the impact of selecting a reduced number of images in a Facial Recognition problem when using a Deep Learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature engineering pipeline to discriminate the diversity of faces and thereby increase the amount of information in the selected data. One of our contributions is the identification of a subgroup of metrics capable of representing diversity. As a further step, we propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach that tries to maximise the distance between selected items, thus maximising diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot products between their feature vectors. The experimental tests confirm the gain of the proposed methodology over a standard random selection approach, proving it effective at reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset.
Facial recognition is one of the most studied challenges in computer vision and has proven to be a complex problem, mainly because of the volatility of image-capture conditions (motion between objects and the camera, or poor lighting conditions) and the great diversity of faces in the world. In classification tasks using data-driven techniques, the training elements must reflect the diversity of characteristics of each class (person). In an ideal scenario, the classification algorithm should be able to distinguish those classes correctly, so as to maximise its performance. For that, the representations of each class observed during training must be representative of reality. Most Facial Recognition approaches use a large amount of data to develop models able to extract separable and generalisable facial features, which makes them unfeasible in several application scenarios. It is therefore important to guarantee the variability of the representations of each person. By guaranteeing this diversity, it is also possible to remove redundant and irrelevant information and to decrease the number of images used to train these models, which in turn helps reduce the computational requirements of the process. The work developed in this dissertation aims to analyse the impact that a reduced, diversity-focused selection of images has on a Facial Recognition problem using a Deep Learning approach. The main motivation is to enable the use of such approaches in scenarios where data is scarce or, alternatively, plentiful but of poor quality. The main questions to answer are: How many images do we need to train the model? How long does training take with this amount of data? How do we select the best samples to add to the training dataset so as to maximise its performance? The proposed solution uses a feature engineering process to select and extract the characteristics that best discriminate the diversity of faces, leading to an increase in the amount of information. One of the contributions of this work is the identification of a subgroup of metrics capable of representing diversity. As a next step, we also propose two methods that use these metrics to guarantee an increase in the amount of information in a dataset: an approach based on clustering algorithms, which tries to maximise the distance between each selected element and thus maximise diversity, and an approach using algorithms based on Determinantal Point Processes, a statistical modelling method that assigns high probabilities to diverse subsets using the inner product between their feature vectors. The experiments carried out demonstrate the advantage of the proposed heuristic over a random selection of samples, showing it to be effective at reducing the size of the training dataset while maintaining performance similar to that obtained with the full dataset.
Viana, Paula Maria Marques Moura Gomes; Repositório Científico do Instituto Politécnico do Porto; Vilaça, Luís Miguel Salgado Nunes; 2023-11-23T01:31:59Z; 2020; 2020-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10400.22/18170; TID:202936970; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-11-29T01:47:07Z; oai:recipp.ipp.pt:10400.22/18170; Portal AgregadorONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T17:37:46.743198; Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
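The Determinantal Point Process approach is described in this record only at the level of "higher probabilities for more diverse subsets, scored through dot products between feature vectors". As a hedged sketch of what a selection step built on that idea could look like, the following greedy maximisation of the subset log-determinant under a dot-product (L-ensemble) kernel is illustrative; the function name, the jitter term, and the greedy strategy are assumptions, not the dissertation's actual implementation.

```python
import numpy as np


def greedy_dpp_selection(features: np.ndarray, k: int) -> list[int]:
    """Greedily grow a subset S that (approximately) maximises det(L[S, S]),
    where L is a dot-product kernel over the feature vectors; determinants of
    such kernels are larger when the chosen items are mutually dissimilar."""
    L = features @ features.T  # L-ensemble kernel built from dot products
    n = L.shape[0]
    selected: list[int] = []
    for _ in range(min(k, n)):
        best_j, best_logdet = -1, -np.inf
        for j in range(n):
            if j in selected:
                continue
            idx = selected + [j]
            sub = L[np.ix_(idx, idx)] + 1e-8 * np.eye(len(idx))  # jitter for stability
            sign, logdet = np.linalg.slogdet(sub)
            if sign > 0 and logdet > best_logdet:
                best_j, best_logdet = j, logdet
        if best_j < 0:  # no remaining candidate adds volume to the selection
            break
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(200, 128))
    embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)  # L2-normalise
    print("selected indices:", greedy_dpp_selection(embeddings, k=10))
```

Exact DPP sampling and faster greedy MAP variants exist; this deliberately simple version is meant only to show the role the dot-product kernel plays in scoring diversity.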
dc.title.none.fl_str_mv |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
spellingShingle |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation Vilaça, Luís Miguel Salgado Nunes Computer Vision Machine Learning Deep Learning Facial Recognition Instance Selection Efficient Data Usage |
title_short |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_full |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_fullStr |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_full_unstemmed |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_sort |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
author |
Vilaça, Luís Miguel Salgado Nunes |
author_facet |
Vilaça, Luís Miguel Salgado Nunes |
author_role |
author |
dc.contributor.none.fl_str_mv |
Viana, Paula Maria Marques Moura Gomes Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Vilaça, Luís Miguel Salgado Nunes |
dc.subject.por.fl_str_mv |
Computer Vision Machine Learning Deep Learning Facial Recognition Instance Selection Efficient Data Usage |
topic |
Computer Vision Machine Learning Deep Learning Facial Recognition Instance Selection Efficient Data Usage |
description |
Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variation in image-capture conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm trained on such a dataset should distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class. Most approaches to Facial Recognition use large amounts of data to develop models for extracting facial features, making them unfeasible for several application scenarios. For this reason, ensuring the variability of the representations of each person is an important requirement. Achieving this goal also helps to eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements. The work developed in this dissertation investigates the impact of selecting a reduced number of images in a Facial Recognition problem when using a Deep Learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature engineering pipeline to discriminate the diversity of faces and thereby increase the amount of information in the selected data. One of our contributions is the identification of a subgroup of metrics capable of representing diversity. As a further step, we propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach that tries to maximise the distance between selected items, thus maximising diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot products between their feature vectors. The experimental tests confirm the gain of the proposed methodology over a standard random selection approach, proving it effective at reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset. |
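As background for the statement above that a Determinantal Point Process "assigns higher probabilities to more diverse subsets using the dot product between its feature vectors", the standard L-ensemble formulation (as in Kulesza and Taskar's treatment of DPPs) can be written as follows; taking the face descriptors as the vectors x_i is an assumption made here for illustration, since the record does not specify the kernel.

```latex
% L-ensemble DPP over items with feature vectors x_1, ..., x_N:
% kernel entries are dot products, and det(L_S) equals the squared volume
% spanned by the selected vectors, so near-duplicate items drive the
% probability of the subsets containing them towards zero.
P(S) = \frac{\det(L_S)}{\det(L + I)}, \qquad
L_{ij} = x_i^{\top} x_j, \qquad
L_S = \bigl[ L_{ij} \bigr]_{i,\, j \in S}.
```

Because det(L_S) is the squared volume spanned by the selected descriptors, subsets containing redundant, near-parallel vectors receive probabilities close to zero, which is the behaviour the proposed selection heuristic relies on.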
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020 2020-01-01T00:00:00Z 2023-11-23T01:31:59Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/18170 TID:202936970 |
url |
http://hdl.handle.net/10400.22/18170 |
identifier_str_mv |
TID:202936970 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799131467348967424 |