Influential diagnostics for location parameter within GAMLSS

Bibliographic details
Main author: SILVA, Lucas Araújo da
Publication date: 2021
Document type: Master's dissertation
Language: eng
Source title: Repositório Institucional da UFPE
Full text: https://repositorio.ufpe.br/handle/123456789/40169
Abstract: Modelling the functional relationship between a response variable and a set of explanatory variables is at the core of regression problems in statistics, and many different models have been proposed for it. More recently, generalized additive models for location, scale and shape (GAMLSS) have gained attention because they generalize already popular models such as the linear model, generalized linear models, semiparametric models and generalized additive models, and because they allow any parametric distribution for the response variable. In addition, all distribution parameters can be modelled with linear, non-linear or smoothing functions of the explanatory variables. Various influence diagnostic tools have been proposed in the literature; this work presents some of them and proposes techniques to detect possibly influential observations in the GAMLSS model class. Several influence measures are considered for simulated data and applications: the generalized Cook distance, the likelihood distance, the adjusted Peña measure, differences in the generalized Akaike information criterion, and the Kim measure. Algorithms are also proposed to obtain reference values for these measures using the bootstrap, adapting to the other measures the procedure suggested by Kim, Park and Kim (2002). The study is limited to situations in which the location parameter (in general the mean) of the response variable is modelled, with or without additive smoothing terms. When smoothing terms are present, univariate penalized splines are used as the smoother, since the Peña and Kim measures require the smoothing matrix, which varies with the smoothed covariate and the smoother in question. The simulation studies consider several scenarios with some relevant distributions, both continuous and discrete, and several sample sizes. Analyses of real data illustrate the methodology.
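The idea of pairing a case-deletion influence measure with a bootstrap reference value can be illustrated with a minimal sketch. This is not the dissertation's GAMLSS implementation, and not necessarily the procedure of Kim, Park and Kim (2002). It assumes an ordinary Gaussian linear model, the classical Cook distance, and a parametric bootstrap of the largest distance. The function names `cook_distances` and `bootstrap_reference` and the simulated data are hypothetical, introduced only for illustration.

```python
import numpy as np

def cook_distances(X, y):
    """Case-deletion Cook distances for an ordinary least-squares fit."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)          # residual variance estimate
    XtX = X.T @ X
    d = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop observation i and refit
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        diff = beta_i - beta
        d[i] = diff @ XtX @ diff / (p * s2)
    return d

def bootstrap_reference(X, y, n_boot=200, level=0.95, seed=0):
    """Quantile of the largest Cook distance in data simulated from the fit.

    Assumption: Gaussian errors with the fitted variance (parametric bootstrap).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma = np.sqrt(resid @ resid / (n - p))
    maxima = np.empty(n_boot)
    for b in range(n_boot):
        y_star = X @ beta + rng.normal(0.0, sigma, size=n)
        maxima[b] = cook_distances(X, y_star).max()
    return np.quantile(maxima, level)

# Hypothetical simulated data with one planted influential observation.
rng = np.random.default_rng(1)
n = 40
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, n)
x[0], y[0] = 2.0, -3.0                    # high leverage and far from the line
X = np.column_stack([np.ones(n), x])

d = cook_distances(X, y)
ref = bootstrap_reference(X, y)
flagged = np.flatnonzero(d > ref)         # observations above the reference value
print("reference value:", ref)
print("flagged observations:", flagged)
```

Observations whose distance exceeds the bootstrap quantile are flagged as possibly influential. Using the maximum distance per bootstrap sample is one design choice among several; a per-observation quantile would be a reasonable alternative.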
id UFPE_acad6d3675b5d86a9c91e9c390ae8624
oai_identifier_str oai:repositorio.ufpe.br:123456789/40169
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str 2221
dc.title.pt_BR.fl_str_mv Influential diagnostics for location parameter within GAMLSS
title Influential diagnostics for location parameter within GAMLSS
spellingShingle Influential diagnostics for location parameter within GAMLSS
SILVA, Lucas Araújo da
Estatística Aplicada
Bootstrap
Distância de Cook
title_short Influential diagnostics for location parameter within GAMLSS
title_full Influential diagnostics for location parameter within GAMLSS
title_fullStr Influential diagnostics for location parameter within GAMLSS
title_full_unstemmed Influential diagnostics for location parameter within GAMLSS
title_sort Influential diagnostics for location parameter within GAMLSS
author SILVA, Lucas Araújo da
author_facet SILVA, Lucas Araújo da
author_role author
dc.contributor.authorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/7987821215029063
dc.contributor.advisorLattes.pt_BR.fl_str_mv http://lattes.cnpq.br/5519064508209103
dc.contributor.author.fl_str_mv SILVA, Lucas Araújo da
dc.contributor.advisor1.fl_str_mv DE BASTIANI, Fernanda
contributor_str_mv DE BASTIANI, Fernanda
dc.subject.por.fl_str_mv Estatística Aplicada
Bootstrap
Distância de Cook
topic Estatística Aplicada
Bootstrap
Distância de Cook
publishDate 2021
dc.date.accessioned.fl_str_mv 2021-05-25T13:11:48Z
dc.date.available.fl_str_mv 2021-05-25T13:11:48Z
dc.date.issued.fl_str_mv 2021-02-18
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv SILVA, Lucas Araújo da. Influential diagnostics for location parameter within GAMLSS. 2021. Dissertação (Mestrado em Estatística) - Universidade Federal de Pernambuco, Recife, 2021.
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/40169
identifier_str_mv SILVA, Lucas Araújo da. Influential diagnostics for location parameter within GAMLSS. 2021. Dissertação (Mestrado em Estatística) - Universidade Federal de Pernambuco, Recife, 2021.
url https://repositorio.ufpe.br/handle/123456789/40169
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/embargoedAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv embargoedAccess
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.publisher.program.fl_str_mv Programa de Pos Graduacao em Estatistica
dc.publisher.initials.fl_str_mv UFPE
dc.publisher.country.fl_str_mv Brasil
publisher.none.fl_str_mv Universidade Federal de Pernambuco
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
bitstream.url.fl_str_mv https://repositorio.ufpe.br/bitstream/123456789/40169/1/DISSERTA%c3%87%c3%83O%20Lucas%20Ara%c3%bajo%20da%20Silva.pdf
https://repositorio.ufpe.br/bitstream/123456789/40169/3/license.txt
https://repositorio.ufpe.br/bitstream/123456789/40169/2/license_rdf
https://repositorio.ufpe.br/bitstream/123456789/40169/4/DISSERTA%c3%87%c3%83O%20Lucas%20Ara%c3%bajo%20da%20Silva.pdf.txt
https://repositorio.ufpe.br/bitstream/123456789/40169/5/DISSERTA%c3%87%c3%83O%20Lucas%20Ara%c3%bajo%20da%20Silva.pdf.jpg
bitstream.checksum.fl_str_mv 9edbfc527815bb66f86cc54e5d14118e
bd573a5ca8288eb7272482765f819534
e39d27027a6cc9cb039ad269a5db8e34
1879dc0fb9f8a50ddae221d09028d452
1072daf3d37ec038cf50965514f1322d
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1802310674926272512