Penalized regression methods for compositional data

Detalhes bibliográficos
Autor(a) principal: Shimizu, Taciana Kisaki Oliveira
Data de Publicação: 2018
Tipo de documento: Tese
Idioma: eng
Título da fonte: Repositório Institucional da UFSCAR
Texto Completo: https://repositorio.ufscar.br/handle/ufscar/11034
Resumo: Compositional data consist of known vectors such as compositions whose components are positive and defined in the interval (0,1) representing proportions or fractions of a "whole", where the sum of these components must be equal to one. Compositional data is present in different areas, such as in geology, ecology, economy, medicine, among many others. Thus, there is great interest in new modeling approaches for compositional data, mainly when there is an influence of covariates in this type of data. In this context, the main objective of this thesis is to address the new approach of regression models applied in compositional data. The main idea consists of developing a marked method by penalized regression, in particular the Lasso (least absolute shrinkage and selection operator), elastic net and Spike-and-Slab Lasso (SSL) for the estimation of parameters of the models. In particular, we envision developing this modeling for compositional data, when the number of explanatory variables exceeds the number of observations in the presence of large databases, and when there are constraints on the dependent variables and covariates.
id SCAR_dec97a246905ed0de7fd815e363602c6
oai_identifier_str oai:repositorio.ufscar.br:ufscar/11034
network_acronym_str SCAR
network_name_str Repositório Institucional da UFSCAR
repository_id_str 4322
spelling Shimizu, Taciana Kisaki OliveiraLouzada Neto, Franciscohttp://lattes.cnpq.br/0994050156415890http://lattes.cnpq.br/4655747321002185ecdb2f22-21ba-4d41-9eb6-03c0b0a598262019-02-27T17:19:46Z2019-02-27T17:19:46Z2018-12-10SHIMIZU, Taciana Kisaki Oliveira. Penalized regression methods for compositional data. 2018. Tese (Doutorado em Estatística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/11034.https://repositorio.ufscar.br/handle/ufscar/11034Compositional data consist of known vectors such as compositions whose components are positive and defined in the interval (0,1) representing proportions or fractions of a "whole", where the sum of these components must be equal to one. Compositional data is present in different areas, such as in geology, ecology, economy, medicine, among many others. Thus, there is great interest in new modeling approaches for compositional data, mainly when there is an influence of covariates in this type of data. In this context, the main objective of this thesis is to address the new approach of regression models applied in compositional data. The main idea consists of developing a marked method by penalized regression, in particular the Lasso (least absolute shrinkage and selection operator), elastic net and Spike-and-Slab Lasso (SSL) for the estimation of parameters of the models. In particular, we envision developing this modeling for compositional data, when the number of explanatory variables exceeds the number of observations in the presence of large databases, and when there are constraints on the dependent variables and covariates.Dados composicionais consistem em vetores conhecidos como composições cujos componentes são positivos e definidos no intervalo (0,1) representando proporções ou frações de um "todo'", sendo que a soma desses componentes totalizam um. Tais dados estão presentes em diferentes áreas, como na geologia, ecologia, economia, medicina entre outras. Desta forma, há um grande interesse em ampliar os conhecimentos acerca da modelagem de dados composicionais, principalmente quando há a influência de covariáveis nesse tipo de dado. Nesse contexto, a presente tese tem por objetivo propor uma nova abordagem de modelos de regressão aplicada em dados composicionais. A ideia central consiste no desenvolvimento de um método balizado por regressão penalizada, em particular Lasso, do inglês least absolute shrinkage and selection operator, elastic net e Spike-e-Slab Lasso (SSL) para a estimação dos parâmetros do modelo. Em particular, visionamos o desenvolvimento dessa modelagem para dados composicionais, com o número de variáveis explicativas excedendo o número de observações e na presença de grandes bases de dados, e além disso, quando há restrição na variável resposta e nas covariáveis.Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)FAPESP: 14/16147-3engUniversidade Federal de São CarlosCâmpus São CarlosPrograma Interinstitucional de Pós-Graduação em Estatística - PIPGEsUFSCarDados composicionaisModelo de regressãoCoordenadas log-razão isométricasSeleção de variáveisCompositional dataRegression modelIsometric logratio coordinatesVariable selectionCIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAOPenalized regression methods for compositional dataMétodos de regressão penalizados para dados composicionaisinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisOnline600600d0f3b31a-38c4-4c28-aa5b-837ad377108einfo:eu-repo/semantics/openAccessreponame:Repositório Institucional da UFSCARinstname:Universidade Federal de São Carlos (UFSCAR)instacron:UFSCARORIGINALv__final_ufscar.pdfv__final_ufscar.pdfapplication/pdf1996002https://repositorio.ufscar.br/bitstream/ufscar/11034/1/v__final_ufscar.pdfa8008659b8efd772f5c8a4d30cbf1ea7MD51LICENSElicense.txtlicense.txttext/plain; charset=utf-81957https://repositorio.ufscar.br/bitstream/ufscar/11034/4/license.txtae0398b6f8b235e40ad82cba6c50031dMD54TEXTv__final_ufscar.pdf.txtv__final_ufscar.pdf.txtExtracted texttext/plain128879https://repositorio.ufscar.br/bitstream/ufscar/11034/5/v__final_ufscar.pdf.txte868ce2257bdb01c8758f43e09d48017MD55THUMBNAILv__final_ufscar.pdf.jpgv__final_ufscar.pdf.jpgIM Thumbnailimage/jpeg5132https://repositorio.ufscar.br/bitstream/ufscar/11034/6/v__final_ufscar.pdf.jpgd0e3c70fdf707f831e80138af85f1c6fMD56ufscar/110342023-09-18 18:31:20.647oai:repositorio.ufscar.br:ufscar/11034TElDRU7Dh0EgREUgRElTVFJJQlVJw4fDg08gTsODTy1FWENMVVNJVkEKCkNvbSBhIGFwcmVzZW50YcOnw6NvIGRlc3RhIGxpY2Vuw6dhLCB2b2PDqiAobyBhdXRvciAoZXMpIG91IG8gdGl0dWxhciBkb3MgZGlyZWl0b3MgZGUgYXV0b3IpIGNvbmNlZGUgw6AgVW5pdmVyc2lkYWRlCkZlZGVyYWwgZGUgU8OjbyBDYXJsb3MgbyBkaXJlaXRvIG7Do28tZXhjbHVzaXZvIGRlIHJlcHJvZHV6aXIsICB0cmFkdXppciAoY29uZm9ybWUgZGVmaW5pZG8gYWJhaXhvKSwgZS9vdQpkaXN0cmlidWlyIGEgc3VhIHRlc2Ugb3UgZGlzc2VydGHDp8OjbyAoaW5jbHVpbmRvIG8gcmVzdW1vKSBwb3IgdG9kbyBvIG11bmRvIG5vIGZvcm1hdG8gaW1wcmVzc28gZSBlbGV0csO0bmljbyBlCmVtIHF1YWxxdWVyIG1laW8sIGluY2x1aW5kbyBvcyBmb3JtYXRvcyDDoXVkaW8gb3UgdsOtZGVvLgoKVm9jw6ogY29uY29yZGEgcXVlIGEgVUZTQ2FyIHBvZGUsIHNlbSBhbHRlcmFyIG8gY29udGXDumRvLCB0cmFuc3BvciBhIHN1YSB0ZXNlIG91IGRpc3NlcnRhw6fDo28KcGFyYSBxdWFscXVlciBtZWlvIG91IGZvcm1hdG8gcGFyYSBmaW5zIGRlIHByZXNlcnZhw6fDo28uCgpWb2PDqiB0YW1iw6ltIGNvbmNvcmRhIHF1ZSBhIFVGU0NhciBwb2RlIG1hbnRlciBtYWlzIGRlIHVtYSBjw7NwaWEgYSBzdWEgdGVzZSBvdQpkaXNzZXJ0YcOnw6NvIHBhcmEgZmlucyBkZSBzZWd1cmFuw6dhLCBiYWNrLXVwIGUgcHJlc2VydmHDp8Ojby4KClZvY8OqIGRlY2xhcmEgcXVlIGEgc3VhIHRlc2Ugb3UgZGlzc2VydGHDp8OjbyDDqSBvcmlnaW5hbCBlIHF1ZSB2b2PDqiB0ZW0gbyBwb2RlciBkZSBjb25jZWRlciBvcyBkaXJlaXRvcyBjb250aWRvcwpuZXN0YSBsaWNlbsOnYS4gVm9jw6ogdGFtYsOpbSBkZWNsYXJhIHF1ZSBvIGRlcMOzc2l0byBkYSBzdWEgdGVzZSBvdSBkaXNzZXJ0YcOnw6NvIG7Do28sIHF1ZSBzZWphIGRlIHNldQpjb25oZWNpbWVudG8sIGluZnJpbmdlIGRpcmVpdG9zIGF1dG9yYWlzIGRlIG5pbmd1w6ltLgoKQ2FzbyBhIHN1YSB0ZXNlIG91IGRpc3NlcnRhw6fDo28gY29udGVuaGEgbWF0ZXJpYWwgcXVlIHZvY8OqIG7Do28gcG9zc3VpIGEgdGl0dWxhcmlkYWRlIGRvcyBkaXJlaXRvcyBhdXRvcmFpcywgdm9jw6oKZGVjbGFyYSBxdWUgb2J0ZXZlIGEgcGVybWlzc8OjbyBpcnJlc3RyaXRhIGRvIGRldGVudG9yIGRvcyBkaXJlaXRvcyBhdXRvcmFpcyBwYXJhIGNvbmNlZGVyIMOgIFVGU0NhcgpvcyBkaXJlaXRvcyBhcHJlc2VudGFkb3MgbmVzdGEgbGljZW7Dp2EsIGUgcXVlIGVzc2UgbWF0ZXJpYWwgZGUgcHJvcHJpZWRhZGUgZGUgdGVyY2Vpcm9zIGVzdMOhIGNsYXJhbWVudGUKaWRlbnRpZmljYWRvIGUgcmVjb25oZWNpZG8gbm8gdGV4dG8gb3Ugbm8gY29udGXDumRvIGRhIHRlc2Ugb3UgZGlzc2VydGHDp8OjbyBvcmEgZGVwb3NpdGFkYS4KCkNBU08gQSBURVNFIE9VIERJU1NFUlRBw4fDg08gT1JBIERFUE9TSVRBREEgVEVOSEEgU0lETyBSRVNVTFRBRE8gREUgVU0gUEFUUk9Dw41OSU8gT1UKQVBPSU8gREUgVU1BIEFHw4pOQ0lBIERFIEZPTUVOVE8gT1UgT1VUUk8gT1JHQU5JU01PIFFVRSBOw4NPIFNFSkEgQSBVRlNDYXIsClZPQ8OKIERFQ0xBUkEgUVVFIFJFU1BFSVRPVSBUT0RPUyBFIFFVQUlTUVVFUiBESVJFSVRPUyBERSBSRVZJU8ODTyBDT01PClRBTULDiU0gQVMgREVNQUlTIE9CUklHQcOHw5VFUyBFWElHSURBUyBQT1IgQ09OVFJBVE8gT1UgQUNPUkRPLgoKQSBVRlNDYXIgc2UgY29tcHJvbWV0ZSBhIGlkZW50aWZpY2FyIGNsYXJhbWVudGUgbyBzZXUgbm9tZSAocykgb3UgbyhzKSBub21lKHMpIGRvKHMpCmRldGVudG9yKGVzKSBkb3MgZGlyZWl0b3MgYXV0b3JhaXMgZGEgdGVzZSBvdSBkaXNzZXJ0YcOnw6NvLCBlIG7Do28gZmFyw6EgcXVhbHF1ZXIgYWx0ZXJhw6fDo28sIGFsw6ltIGRhcXVlbGFzCmNvbmNlZGlkYXMgcG9yIGVzdGEgbGljZW7Dp2EuCg==Repositório InstitucionalPUBhttps://repositorio.ufscar.br/oai/requestopendoar:43222023-09-18T18:31:20Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)false
dc.title.eng.fl_str_mv Penalized regression methods for compositional data
dc.title.alternative.por.fl_str_mv Métodos de regressão penalizados para dados composicionais
title Penalized regression methods for compositional data
spellingShingle Penalized regression methods for compositional data
Shimizu, Taciana Kisaki Oliveira
Dados composicionais
Modelo de regressão
Coordenadas log-razão isométricas
Seleção de variáveis
Compositional data
Regression model
Isometric logratio coordinates
Variable selection
CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAO
title_short Penalized regression methods for compositional data
title_full Penalized regression methods for compositional data
title_fullStr Penalized regression methods for compositional data
title_full_unstemmed Penalized regression methods for compositional data
title_sort Penalized regression methods for compositional data
author Shimizu, Taciana Kisaki Oliveira
author_facet Shimizu, Taciana Kisaki Oliveira
author_role author
dc.contributor.authorlattes.por.fl_str_mv http://lattes.cnpq.br/4655747321002185
dc.contributor.author.fl_str_mv Shimizu, Taciana Kisaki Oliveira
dc.contributor.advisor1.fl_str_mv Louzada Neto, Francisco
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/0994050156415890
dc.contributor.authorID.fl_str_mv ecdb2f22-21ba-4d41-9eb6-03c0b0a59826
contributor_str_mv Louzada Neto, Francisco
dc.subject.por.fl_str_mv Dados composicionais
Modelo de regressão
Coordenadas log-razão isométricas
Seleção de variáveis
topic Dados composicionais
Modelo de regressão
Coordenadas log-razão isométricas
Seleção de variáveis
Compositional data
Regression model
Isometric logratio coordinates
Variable selection
CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAO
dc.subject.eng.fl_str_mv Compositional data
Regression model
Isometric logratio coordinates
Variable selection
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAO
description Compositional data consist of known vectors such as compositions whose components are positive and defined in the interval (0,1) representing proportions or fractions of a "whole", where the sum of these components must be equal to one. Compositional data is present in different areas, such as in geology, ecology, economy, medicine, among many others. Thus, there is great interest in new modeling approaches for compositional data, mainly when there is an influence of covariates in this type of data. In this context, the main objective of this thesis is to address the new approach of regression models applied in compositional data. The main idea consists of developing a marked method by penalized regression, in particular the Lasso (least absolute shrinkage and selection operator), elastic net and Spike-and-Slab Lasso (SSL) for the estimation of parameters of the models. In particular, we envision developing this modeling for compositional data, when the number of explanatory variables exceeds the number of observations in the presence of large databases, and when there are constraints on the dependent variables and covariates.
publishDate 2018
dc.date.issued.fl_str_mv 2018-12-10
dc.date.accessioned.fl_str_mv 2019-02-27T17:19:46Z
dc.date.available.fl_str_mv 2019-02-27T17:19:46Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv SHIMIZU, Taciana Kisaki Oliveira. Penalized regression methods for compositional data. 2018. Tese (Doutorado em Estatística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/11034.
dc.identifier.uri.fl_str_mv https://repositorio.ufscar.br/handle/ufscar/11034
identifier_str_mv SHIMIZU, Taciana Kisaki Oliveira. Penalized regression methods for compositional data. 2018. Tese (Doutorado em Estatística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/11034.
url https://repositorio.ufscar.br/handle/ufscar/11034
dc.language.iso.fl_str_mv eng
language eng
dc.relation.confidence.fl_str_mv 600
600
dc.relation.authority.fl_str_mv d0f3b31a-38c4-4c28-aa5b-837ad377108e
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.publisher.program.fl_str_mv Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs
dc.publisher.initials.fl_str_mv UFSCar
publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFSCAR
instname:Universidade Federal de São Carlos (UFSCAR)
instacron:UFSCAR
instname_str Universidade Federal de São Carlos (UFSCAR)
instacron_str UFSCAR
institution UFSCAR
reponame_str Repositório Institucional da UFSCAR
collection Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv https://repositorio.ufscar.br/bitstream/ufscar/11034/1/v__final_ufscar.pdf
https://repositorio.ufscar.br/bitstream/ufscar/11034/4/license.txt
https://repositorio.ufscar.br/bitstream/ufscar/11034/5/v__final_ufscar.pdf.txt
https://repositorio.ufscar.br/bitstream/ufscar/11034/6/v__final_ufscar.pdf.jpg
bitstream.checksum.fl_str_mv a8008659b8efd772f5c8a4d30cbf1ea7
ae0398b6f8b235e40ad82cba6c50031d
e868ce2257bdb01c8758f43e09d48017
d0e3c70fdf707f831e80138af85f1c6f
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv
_version_ 1813715599976562688