Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation
Main author: | Vilaça, Luís Miguel Salgado Nunes |
---|---|
Publication date: | 2020 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10400.22/18170 |
Abstract: | Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variation in image-capture conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm trained on such a dataset should distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class. Most approaches to Facial Recognition use large amounts of data to develop models for extracting facial features, making them unfeasible for several application scenarios. For this reason, ensuring the variability of the representations of each person is an important requirement. Achieving this goal also helps to eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements. The work developed in this dissertation investigates the impact of selecting a reduced number of images in a Facial Recognition problem when using a Deep Learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature engineering pipeline to discriminate the diversity of faces and thereby increase the amount of information in the selected data. One of our contributions is the identification of a subgroup of metrics capable of representing diversity. As a further step, we propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach that tries to maximise the distance between selected items, thus maximising diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot products between their feature vectors. The experimental tests confirm the gain of the proposed methodology over a standard random selection approach, proving it effective at reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset. |
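The cluster-based selection described in the abstract above (picking images so that the distance between selected items is maximised) is only named in this record, not detailed. As an illustration only, the sketch below shows one common way such a rule can be realised, a greedy farthest-point selection over pre-computed face descriptors; the function name, the Euclidean metric, and the seeding rule are assumptions made here, not details taken from the dissertation.

```python
import numpy as np


def farthest_point_selection(features: np.ndarray, k: int) -> list[int]:
    """Greedily pick k row indices so that each new pick is as far as possible
    (Euclidean distance) from the items already selected."""
    n = features.shape[0]
    k = min(k, n)
    if k <= 0:
        return []
    # Seed with the descriptor farthest from the mean, so the start is not arbitrary.
    centroid = features.mean(axis=0)
    first = int(np.argmax(np.linalg.norm(features - centroid, axis=1)))
    selected = [first]
    # Distance from every item to its nearest already-selected item.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # the item farthest from the current selection
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[nxt], axis=1))
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    descriptors = rng.normal(size=(500, 128))  # stand-in for per-image face embeddings
    kept = farthest_point_selection(descriptors, k=25)
    print(f"kept {len(kept)} of {descriptors.shape[0]} images")
```

Spreading the picks across the feature space in this way is one simple proxy for the diversity objective the abstract describes; the dissertation's own clustering-based procedure may differ.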
id |
RCAP_a3b6eeba0001614b3458319661fc21df |
---|---|
oai_identifier_str |
oai:recipp.ipp.pt:10400.22/18170 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
spelling |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation
Computer Vision; Machine Learning; Deep Learning; Facial Recognition; Instance Selection; Efficient Data Usage
Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variation in image-capture conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm trained on such a dataset should distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class. Most approaches to Facial Recognition use large amounts of data to develop models for extracting facial features, making them unfeasible for several application scenarios. For this reason, ensuring the variability of the representations of each person is an important requirement. Achieving this goal also helps to eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements. The work developed in this dissertation investigates the impact of selecting a reduced number of images in a Facial Recognition problem when using a Deep Learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature engineering pipeline to discriminate the diversity of faces and thereby increase the amount of information in the selected data. One of our contributions is the identification of a subgroup of metrics capable of representing diversity. As a further step, we propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach that tries to maximise the distance between selected items, thus maximising diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot products between their feature vectors. The experimental tests confirm the gain of the proposed methodology over a standard random selection approach, proving it effective at reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset.
Facial recognition is one of the most studied challenges in computer vision and has proven to be a complex problem, mainly because of the volatility of image-capture conditions (motion between objects and the camera, or poor lighting conditions) and the great diversity of faces in the world. In classification tasks using data-driven techniques, the training elements must reflect the diversity of characteristics of each class (person). In an ideal scenario, the classification algorithm should be able to distinguish those classes correctly, so as to maximise its performance. For that, the representations of each class observed during training must be representative of reality. Most Facial Recognition approaches use a large amount of data to develop models able to extract separable and generalisable facial features, which makes them unfeasible in several application scenarios. It is therefore important to guarantee the variability of the representations of each person. By guaranteeing this diversity, it is also possible to remove redundant and irrelevant information and to decrease the number of images used to train these models, which in turn helps reduce the computational requirements of the process. The work developed in this dissertation aims to analyse the impact that a reduced, diversity-focused selection of images has on a Facial Recognition problem using a Deep Learning approach. The main motivation is to enable the use of such approaches in scenarios where data is scarce or, alternatively, plentiful but of poor quality. The main questions to answer are: How many images do we need to train the model? How long does training take with this amount of data? How do we select the best samples to add to the training dataset so as to maximise its performance? The proposed solution uses a feature engineering process to select and extract the characteristics that best discriminate the diversity of faces, leading to an increase in the amount of information. One of the contributions of this work is the identification of a subgroup of metrics capable of representing diversity. As a next step, we also propose two methods that use these metrics to guarantee an increase in the amount of information in a dataset: an approach based on clustering algorithms, which tries to maximise the distance between each selected element and thus maximise diversity, and an approach using algorithms based on Determinantal Point Processes, a statistical modelling method that assigns high probabilities to diverse subsets using the inner product between their feature vectors. The experiments carried out demonstrate the advantage of the proposed heuristic over a random selection of samples, showing it to be effective at reducing the size of the training dataset while maintaining performance similar to that obtained with the full dataset.
Viana, Paula Maria Marques Moura Gomes; Repositório Científico do Instituto Politécnico do Porto; Vilaça, Luís Miguel Salgado Nunes; 2023-11-23T01:31:59Z; 2020; 2020-01-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10400.22/18170; TID:202936970; eng; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2023-11-29T01:47:07Z; oai:recipp.ipp.pt:10400.22/18170; Portal AgregadorONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-19T17:37:46.743198; Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
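The Determinantal Point Process approach is described in this record only at the level of "higher probabilities for more diverse subsets, scored through dot products between feature vectors". As a hedged sketch of what a selection step built on that idea could look like, the following greedy maximisation of the subset log-determinant under a dot-product (L-ensemble) kernel is illustrative; the function name, the jitter term, and the greedy strategy are assumptions, not the dissertation's actual implementation.

```python
import numpy as np


def greedy_dpp_selection(features: np.ndarray, k: int) -> list[int]:
    """Greedily grow a subset S that (approximately) maximises det(L[S, S]),
    where L is a dot-product kernel over the feature vectors; determinants of
    such kernels are larger when the chosen items are mutually dissimilar."""
    L = features @ features.T  # L-ensemble kernel built from dot products
    n = L.shape[0]
    selected: list[int] = []
    for _ in range(min(k, n)):
        best_j, best_logdet = -1, -np.inf
        for j in range(n):
            if j in selected:
                continue
            idx = selected + [j]
            sub = L[np.ix_(idx, idx)] + 1e-8 * np.eye(len(idx))  # jitter for stability
            sign, logdet = np.linalg.slogdet(sub)
            if sign > 0 and logdet > best_logdet:
                best_j, best_logdet = j, logdet
        if best_j < 0:  # no remaining candidate adds volume to the selection
            break
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(200, 128))
    embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)  # L2-normalise
    print("selected indices:", greedy_dpp_selection(embeddings, k=10))
```

Exact DPP sampling and faster greedy MAP variants exist; this deliberately simple version is meant only to show the role the dot-product kernel plays in scoring diversity.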
dc.title.none.fl_str_mv |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
spellingShingle |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation Vilaça, Luís Miguel Salgado Nunes Computer Vision Machine Learning Deep Learning Facial Recognition Instance Selection Efficient Data Usage |
title_short |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_full |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_fullStr |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_full_unstemmed |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
title_sort |
Towards a Scalable Dataset Construction for Facial Recognition: A guided data selection approach for diversity stimulation |
author |
Vilaça, Luís Miguel Salgado Nunes |
author_facet |
Vilaça, Luís Miguel Salgado Nunes |
author_role |
author |
dc.contributor.none.fl_str_mv |
Viana, Paula Maria Marques Moura Gomes Repositório Científico do Instituto Politécnico do Porto |
dc.contributor.author.fl_str_mv |
Vilaça, Luís Miguel Salgado Nunes |
dc.subject.por.fl_str_mv |
Computer Vision Machine Learning Deep Learning Facial Recognition Instance Selection Efficient Data Usage |
topic |
Computer Vision Machine Learning Deep Learning Facial Recognition Instance Selection Efficient Data Usage |
description |
Facial recognition is one of the most studied challenges in computer vision and remains a complex problem. This is mainly due to variation in image-capture conditions, such as object-camera relative motion or poor lighting, and to the great diversity of faces in the world. For classification with data-driven techniques, the training dataset should reflect the diversity of characteristics of every target class (person). In an ideal scenario, a classification algorithm trained on such a dataset should distinguish correctly between those classes, maximising its performance thanks to the fair representation of each class. Most approaches to Facial Recognition use large amounts of data to develop models for extracting facial features, making them unfeasible for several application scenarios. For this reason, ensuring the variability of the representations of each person is an important requirement. Achieving this goal also helps to eliminate redundant and irrelevant information, reducing the number of images used for training and, consequently, the computational requirements. The work developed in this dissertation investigates the impact of selecting a reduced number of images in a Facial Recognition problem when using a Deep Learning approach. The driving force behind this idea is to cope with scenarios where data is scarce or, although plentiful, of poor quality. The main questions to answer are: How many training samples do we need to select? How long will it take to train with those samples? How do we select the best samples for the training dataset? The proposed solution uses a feature engineering pipeline to discriminate the diversity of faces and thereby increase the amount of information in the selected data. One of our contributions is the identification of a subgroup of metrics capable of representing diversity. As a further step, we propose two methods that use these metrics to guarantee an increase in the amount of information: a cluster-based approach that tries to maximise the distance between selected items, thus maximising diversity, and an approach based on Determinantal Point Processes, a statistical modelling method that assigns higher probabilities to more diverse subsets using the dot products between their feature vectors. The experimental tests confirm the gain of the proposed methodology over a standard random selection approach, proving it effective at reducing the size of the dataset while maintaining performance similar to that obtained with the full dataset. |
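As background for the statement above that a Determinantal Point Process "assigns higher probabilities to more diverse subsets using the dot product between its feature vectors", the standard L-ensemble formulation (as in Kulesza and Taskar's treatment of DPPs) can be written as follows; taking the face descriptors as the vectors x_i is an assumption made here for illustration, since the record does not specify the kernel.

```latex
% L-ensemble DPP over items with feature vectors x_1, ..., x_N:
% kernel entries are dot products, and det(L_S) equals the squared volume
% spanned by the selected vectors, so near-duplicate items drive the
% probability of the subsets containing them towards zero.
P(S) = \frac{\det(L_S)}{\det(L + I)}, \qquad
L_{ij} = x_i^{\top} x_j, \qquad
L_S = \bigl[ L_{ij} \bigr]_{i,\, j \in S}.
```

Because det(L_S) is the squared volume spanned by the selected descriptors, subsets containing redundant, near-parallel vectors receive probabilities close to zero, which is the behaviour the proposed selection heuristic relies on.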
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020 2020-01-01T00:00:00Z 2023-11-23T01:31:59Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10400.22/18170 TID:202936970 |
url |
http://hdl.handle.net/10400.22/18170 |
identifier_str_mv |
TID:202936970 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799131467348967424 |