A new class of discrete models for the analysis of zero-modified count data
Main author: | Silva, Wesley Bertoli da |
---|---|
Publication date: | 2020 |
Document type: | Thesis |
Language: | eng |
Source title: | Biblioteca Digital de Teses e Dissertações da USP |
Full text: | https://www.teses.usp.br/teses/disponiveis/104/104131/tde-29072020-093809/ |
Abstract: | In this work, a new class of discrete models for the analysis of zero-modified count data is introduced. The proposed class is composed of hurdle versions of the Poisson-Lindley, Poisson-Shanker, and Poisson-Sujatha baseline distributions, which are uniparametric Poisson mixtures that can accommodate different levels of overdispersion. Unlike the traditional formulation of zero-modified distributions, the primary assumption under hurdle models is that the positive observations are entirely represented by zero-truncated distributions. To extend the applicability of the theoretical models, a fixed-effects regression framework was also developed, in which the probability of a zero-valued observation and the average number of positive observations per individual can be modeled in the presence of covariates. In addition, an even more flexible structure was developed that allows both fixed and random effects in the linear predictors of the hurdle models. In the derived mixed-effects structure, scalar random effects are used to quantify the within-subject heterogeneity arising from clustering or repeated measurements. All inferential procedures were conducted under a fully Bayesian perspective. Different prior distributions were considered (e.g., Jeffreys and g-prior), and pseudo-random values from posterior distributions without closed form were generated by one of three algorithms, depending on the structure of each model: Rejection Sampling, Random-walk Metropolis, and Adaptive Metropolis. Intensive Monte Carlo simulation studies were performed to evaluate the performance of the adopted Bayesian methodologies. The usefulness of the proposed zero-modified models was illustrated with several real datasets presenting different structures and sources of variation. Beyond parameter estimation, sensitivity analyses were performed to identify influential points, and the fitted models were evaluated using Bayesian p-values and randomized quantile residuals, among other measures. Finally, when compared with well-established distributions for the analysis of count data, the proposed models proved competitive in all provided examples. |
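The hurdle construction described in the abstract can be illustrated with a minimal sketch. It assumes the standard Poisson-Lindley probability mass function, P(X = x) = θ²(x + θ + 2) / (θ + 1)^(x + 3), and a hurdle parametrization in which zeros occur with probability `p_zero` and positive counts follow the zero-truncated baseline. The function names, the `p_zero` parametrization, and the numeric values are illustrative assumptions and are not taken from the thesis.

```python
def poisson_lindley_pmf(x: int, theta: float) -> float:
    """Standard Poisson-Lindley pmf for x = 0, 1, 2, ... and theta > 0."""
    if x < 0:
        return 0.0
    return theta**2 * (x + theta + 2) / (theta + 1) ** (x + 3)


def hurdle_pmf(y: int, p_zero: float, theta: float) -> float:
    """Hurdle pmf (assumed parametrization): P(Y = 0) = p_zero, and
    positive counts follow the zero-truncated Poisson-Lindley law."""
    if y < 0:
        return 0.0
    if y == 0:
        return p_zero
    baseline_zero = poisson_lindley_pmf(0, theta)
    # Renormalize the baseline over y >= 1 (zero truncation).
    truncated = poisson_lindley_pmf(y, theta) / (1.0 - baseline_zero)
    return (1.0 - p_zero) * truncated


# Illustrative values only: the probabilities should sum to about 1.
total = sum(hurdle_pmf(y, p_zero=0.6, theta=1.5) for y in range(200))
print(round(total, 10))
```

Because the zero probability is set separately from the baseline, the same form covers both zero inflation (`p_zero` above the baseline's zero probability) and zero deflation (`p_zero` below it).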
id |
USP_551a554fc5988cb0dcd0754ae50fad32 |
---|---|
oai_identifier_str |
oai:teses.usp.br:tde-29072020-093809 |
network_acronym_str |
USP |
network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
repository_id_str |
2721 |
dc.title.none.fl_str_mv |
A new class of discrete models for the analysis of zero-modified count data / Uma nova classe de modelos discretos para a análise de dados de contagem zero-modificados |
title |
A new class of discrete models for the analysis of zero-modified count data |
author |
Silva, Wesley Bertoli da |
author_role |
author |
dc.contributor.none.fl_str_mv |
Louzada Neto, Francisco |
dc.contributor.author.fl_str_mv |
Silva, Wesley Bertoli da |
dc.subject.por.fl_str_mv |
Bayesian methods; Mixed-effects hurdle models; Overdispersion; Poisson mixture distributions; Zero-modified data |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-04-03 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/104/104131/tde-29072020-093809/ |
url |
https://www.teses.usp.br/teses/disponiveis/104/104131/tde-29072020-093809/ |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Release the content for public access. |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
instname_str |
Universidade de São Paulo (USP) |
instacron_str |
USP |
institution |
USP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
collection |
Biblioteca Digital de Teses e Dissertações da USP |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
repository.mail.fl_str_mv |
virginia@if.usp.br; atendimento@aguia.usp.br |
_version_ |
1809091179297374208 |