Influential diagnostics for location parameter within GAMLSS
Main author: | SILVA, Lucas Araújo da |
---|---|
Publication date: | 2021 |
Document type: | Master's dissertation |
Language: | eng |
Source title: | Repositório Institucional da UFPE |
Full text: | https://repositorio.ufpe.br/handle/123456789/40169 |
Abstract: | Modelling the functional relationship between a response variable and a set of explanatory variables is at the core of regression problems in statistics. Several studies have proposed different models. More recently, generalized additive models for location, scale and shape (GAMLSS) have gained attention because they generalize other popular models, such as the linear model, generalized linear models, semiparametric models and generalized additive models, and allow any parametric distribution for the response variable. In addition, all distribution parameters can be modelled with linear, non-linear or smoothing functions of the explanatory variables. Various influence diagnostic tools have been proposed in the literature; this work presents some of them and proposes techniques to detect possible influential observations in the GAMLSS model class. It considers several influence measures: the generalized Cook distance, the likelihood distance, the adjusted Peña measure, differences in the generalized Akaike information criterion and the Kim measure, for simulated data and applications. Algorithms are also proposed to obtain reference values for these measures using the bootstrap, adapting to the other measures the procedure suggested by Kim, Park and Kim (2002). The study is limited to situations in which the location parameter (in general the mean) of the response variable is modelled, with or without additive smoothing terms. When smoothing terms are present, univariate penalized splines are used as the smoother, since the Peña and Kim measures require the smoothing matrix, which varies with the smoothed covariate and the smoother in question. The simulation studies consider several scenarios with some relevant distributions, both continuous and discrete, and several sample sizes. Analyses of real data illustrate the methodology. |
id |
UFPE_acad6d3675b5d86a9c91e9c390ae8624 |
---|---|
oai_identifier_str |
oai:repositorio.ufpe.br:123456789/40169 |
network_acronym_str |
UFPE |
network_name_str |
Repositório Institucional da UFPE |
repository_id_str |
2221 |
spelling |
SILVA, Lucas Araújo da; http://lattes.cnpq.br/7987821215029063; http://lattes.cnpq.br/5519064508209103; DE BASTIANI, Fernanda; 2021-05-25T13:11:48Z; 2021-05-25T13:11:48Z; 2021-02-18; SILVA, Lucas Araújo da. Influential diagnostics for location parameter within GAMLSS. 2021. Dissertação (Mestrado em Estatística) - Universidade Federal de Pernambuco, Recife, 2021.; https://repositorio.ufpe.br/handle/123456789/40169; CAPES; eng; Universidade Federal de Pernambuco; Programa de Pos Graduacao em Estatistica; UFPE; Brasil; http://creativecommons.org/licenses/by-nc-nd/3.0/br/; info:eu-repo/semantics/embargoedAccess; Applied Statistics; Bootstrap; Cook's distance; Influential diagnostics for location parameter within GAMLSS; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; mestrado; reponame:Repositório Institucional da UFPE; instname:Universidade Federal de Pernambuco (UFPE); instacron:UFPE; Repositório Institucional; PUB; https://repositorio.ufpe.br/oai/request; attena@ufpe.br; opendoar:2221; 2021-05-26T05:16:16; Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE); false |
dc.title.pt_BR.fl_str_mv |
Influential diagnostics for location parameter within GAMLSS |
title |
Influential diagnostics for location parameter within GAMLSS |
spellingShingle |
Influential diagnostics for location parameter within GAMLSS; SILVA, Lucas Araújo da; Applied Statistics; Bootstrap; Cook's distance |
title_short |
Influential diagnostics for location parameter within GAMLSS |
title_full |
Influential diagnostics for location parameter within GAMLSS |
title_fullStr |
Influential diagnostics for location parameter within GAMLSS |
title_full_unstemmed |
Influential diagnostics for location parameter within GAMLSS |
title_sort |
Influential diagnostics for location parameter within GAMLSS |
author |
SILVA, Lucas Araújo da |
author_facet |
SILVA, Lucas Araújo da |
author_role |
author |
dc.contributor.authorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/7987821215029063 |
dc.contributor.advisorLattes.pt_BR.fl_str_mv |
http://lattes.cnpq.br/5519064508209103 |
dc.contributor.author.fl_str_mv |
SILVA, Lucas Araújo da |
dc.contributor.advisor1.fl_str_mv |
DE BASTIANI, Fernanda |
contributor_str_mv |
DE BASTIANI, Fernanda |
dc.subject.por.fl_str_mv |
Applied Statistics; Bootstrap; Cook's distance |
topic |
Applied Statistics; Bootstrap; Cook's distance |
description |
Modelling the functional relationship between a response variable and a set of explanatory variables is at the core of regression problems in statistics. Several studies have proposed different models. More recently, generalized additive models for location, scale and shape (GAMLSS) have gained attention because they generalize other popular models, such as the linear model, generalized linear models, semiparametric models and generalized additive models, and allow any parametric distribution for the response variable. In addition, all distribution parameters can be modelled with linear, non-linear or smoothing functions of the explanatory variables. Various influence diagnostic tools have been proposed in the literature; this work presents some of them and proposes techniques to detect possible influential observations in the GAMLSS model class. It considers several influence measures: the generalized Cook distance, the likelihood distance, the adjusted Peña measure, differences in the generalized Akaike information criterion and the Kim measure, for simulated data and applications. Algorithms are also proposed to obtain reference values for these measures using the bootstrap, adapting to the other measures the procedure suggested by Kim, Park and Kim (2002). The study is limited to situations in which the location parameter (in general the mean) of the response variable is modelled, with or without additive smoothing terms. When smoothing terms are present, univariate penalized splines are used as the smoother, since the Peña and Kim measures require the smoothing matrix, which varies with the smoothed covariate and the smoother in question. The simulation studies consider several scenarios with some relevant distributions, both continuous and discrete, and several sample sizes. Analyses of real data illustrate the methodology. |
publishDate |
2021 |
dc.date.accessioned.fl_str_mv |
2021-05-25T13:11:48Z |
dc.date.available.fl_str_mv |
2021-05-25T13:11:48Z |
dc.date.issued.fl_str_mv |
2021-02-18 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
SILVA, Lucas Araújo da. Influential diagnostics for location parameter within GAMLSS. 2021. Dissertação (Mestrado em Estatística) - Universidade Federal de Pernambuco, Recife, 2021. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufpe.br/handle/123456789/40169 |
identifier_str_mv |
SILVA, Lucas Araújo da. Influential diagnostics for location parameter within GAMLSS. 2021. Dissertação (Mestrado em Estatística) - Universidade Federal de Pernambuco, Recife, 2021. |
url |
https://repositorio.ufpe.br/handle/123456789/40169 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/br/ info:eu-repo/semantics/embargoedAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/br/ |
eu_rights_str_mv |
embargoedAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.publisher.program.fl_str_mv |
Programa de Pos Graduacao em Estatistica |
dc.publisher.initials.fl_str_mv |
UFPE |
dc.publisher.country.fl_str_mv |
Brazil |
publisher.none.fl_str_mv |
Universidade Federal de Pernambuco |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFPE instname:Universidade Federal de Pernambuco (UFPE) instacron:UFPE |
instname_str |
Universidade Federal de Pernambuco (UFPE) |
instacron_str |
UFPE |
institution |
UFPE |
reponame_str |
Repositório Institucional da UFPE |
collection |
Repositório Institucional da UFPE |
bitstream.url.fl_str_mv |
https://repositorio.ufpe.br/bitstream/123456789/40169/1/DISSERTA%c3%87%c3%83O%20Lucas%20Ara%c3%bajo%20da%20Silva.pdf https://repositorio.ufpe.br/bitstream/123456789/40169/3/license.txt https://repositorio.ufpe.br/bitstream/123456789/40169/2/license_rdf https://repositorio.ufpe.br/bitstream/123456789/40169/4/DISSERTA%c3%87%c3%83O%20Lucas%20Ara%c3%bajo%20da%20Silva.pdf.txt https://repositorio.ufpe.br/bitstream/123456789/40169/5/DISSERTA%c3%87%c3%83O%20Lucas%20Ara%c3%bajo%20da%20Silva.pdf.jpg |
bitstream.checksum.fl_str_mv |
9edbfc527815bb66f86cc54e5d14118e bd573a5ca8288eb7272482765f819534 e39d27027a6cc9cb039ad269a5db8e34 1879dc0fb9f8a50ddae221d09028d452 1072daf3d37ec038cf50965514f1322d |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE) |
repository.mail.fl_str_mv |
attena@ufpe.br |
_version_ |
1802310674926272512 |