A study of generalization in regression: proposal of a new metric and loss function to better understand and improve generability

Alpalhão, Nuno Tiago Falcão

Bibliographic details
Main author:	Alpalhão, Nuno Tiago Falcão
Publication date:	2021
Document type:	Dissertation
Language:	eng
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text:	http://hdl.handle.net/10362/112034
Summary:	Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics

Item metadata

id	RCAP_1a401e2367990abeae8701081154c669
oai_identifier_str	oai:run.unl.pt:10362/112034
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str	7160
abstract	Intuitively, generalization in machine learning can be understood as a model's ability to apply its trained or acquired knowledge to a previously unseen scenario. In recent years there has been exponential growth in both the efficiency and the accuracy of machine learning models, yet current research is still trying to understand, and establish trust in, how well models perform on previously unseen data. In this thesis we propose a study of machine learning's theoretical background to further expand the notion of generalization and its limitations, enabling us to derive its commonly accepted approximation. We use these definitions to present a new generalization metric, or score, that is more consistent in detecting and explaining the occurrence of generalization. Additionally, a new loss function is presented to mitigate the generalization error inherent to a noisy sample; extensive tests suggest that our loss function has a higher rate of convergence while producing statistically similar or even better results when compared with classical loss functions.
dc.title.none.fl_str_mv	A study of generalization in regression: proposal of a new metric and loss function to better understand and improve generability
author	Alpalhão, Nuno Tiago Falcão
author_facet	Alpalhão, Nuno Tiago Falcão
author_role	author
dc.contributor.none.fl_str_mv	Vanneschi, Leonardo RUN
dc.contributor.author.fl_str_mv	Alpalhão, Nuno Tiago Falcão
dc.subject.por.fl_str_mv	Generalization; Machine Learning; Loss Function; Metric; Noise
topic	Generalization; Machine Learning; Loss Function; Metric; Noise
description	Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
publishDate	2021
dc.date.none.fl_str_mv	2021-02-18T15:27:40Z 2021-01-13 2021-01-13T00:00:00Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10362/112034 TID:202642135
url	http://hdl.handle.net/10362/112034
identifier_str_mv	TID:202642135
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv	mluisa.alvim@gmail.com
_version_	1817545781067907072
