Penalized regression methods for compositional data
Main author: | Shimizu, Taciana Kisaki Oliveira |
---|---|
Publication date: | 2018 |
Document type: | Thesis |
Language: | eng |
Source title: | Repositório Institucional da UFSCAR |
Full text: | https://repositorio.ufscar.br/handle/ufscar/11034 |
Abstract: | Compositional data consist of vectors, known as compositions, whose components are positive and lie in the interval (0,1); they represent proportions or fractions of a "whole", so the components must sum to one. Compositional data arise in many areas, such as geology, ecology, economics, and medicine. There is therefore great interest in new modeling approaches for compositional data, especially when covariates influence this type of data. In this context, the main objective of this thesis is to propose a new approach to regression models for compositional data. The central idea is to develop a method based on penalized regression, in particular the Lasso (least absolute shrinkage and selection operator), the elastic net, and the Spike-and-Slab Lasso (SSL), for estimating the model parameters. In particular, we aim to develop this modeling for compositional data when the number of explanatory variables exceeds the number of observations, in the presence of large databases, and when there are constraints on the response variable and the covariates. |
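The abstract names two ingredients that a short sketch can make concrete: the unit-sum constraint on compositions, which the record's subject keywords associate with isometric logratio (ilr) coordinates, and the Lasso penalty. The Python below is a minimal illustration under stated assumptions, not the thesis's implementation: the function names `closure`, `ilr`, and `soft_threshold` are hypothetical, and the ilr basis used is one common choice (sequential "pivot" coordinates) that may differ from the basis used in the thesis.

```python
import math


def closure(x):
    """Rescale positive parts so that they sum to one (a composition)."""
    total = sum(x)
    return [v / total for v in x]


def ilr(x):
    """Isometric logratio coordinates of a D-part composition.

    Assumed basis (one common choice, not necessarily the thesis's):
        z_i = sqrt(i / (i + 1)) * ln(gm(x_1..x_i) / x_{i+1}),  i = 1..D-1,
    where gm is the geometric mean. Returns D - 1 unconstrained reals.
    """
    logs = [math.log(v) for v in closure(x)]
    coords = []
    for i in range(1, len(logs)):
        mean_log = sum(logs[:i]) / i  # log of the geometric mean of the first i parts
        coords.append(math.sqrt(i / (i + 1)) * (mean_log - logs[i]))
    return coords


def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam.

    This is the closed-form solution of the one-dimensional Lasso problem
    and the basic update inside coordinate-descent Lasso solvers.
    """
    return math.copysign(max(abs(z) - lam, 0.0), z)


if __name__ == "__main__":
    composition = closure([10.0, 30.0, 60.0])
    print(composition)             # three parts summing to one
    print(ilr(composition))        # two unconstrained coordinates
    print(soft_threshold(0.3, 0.5))  # a small coefficient is set to zero
```

After the transform, the D - 1 coordinates are free of the sum constraint, so a standard penalized regression routine can be applied to them; the soft-thresholding step is what lets the Lasso set some coefficients exactly to zero and thereby select variables.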
id |
SCAR_dec97a246905ed0de7fd815e363602c6 |
---|---|
oai_identifier_str |
oai:repositorio.ufscar.br:ufscar/11034 |
network_acronym_str |
SCAR |
network_name_str |
Repositório Institucional da UFSCAR |
repository_id_str |
4322 |
spelling |
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); FAPESP: 14/16147-3 |
dc.title.eng.fl_str_mv |
Penalized regression methods for compositional data |
dc.title.alternative.por.fl_str_mv |
Métodos de regressão penalizados para dados composicionais |
author |
Shimizu, Taciana Kisaki Oliveira |
author_facet |
Shimizu, Taciana Kisaki Oliveira |
author_role |
author |
dc.contributor.authorlattes.por.fl_str_mv |
http://lattes.cnpq.br/4655747321002185 |
dc.contributor.author.fl_str_mv |
Shimizu, Taciana Kisaki Oliveira |
dc.contributor.advisor1.fl_str_mv |
Louzada Neto, Francisco |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/0994050156415890 |
dc.contributor.authorID.fl_str_mv |
ecdb2f22-21ba-4d41-9eb6-03c0b0a59826 |
contributor_str_mv |
Louzada Neto, Francisco |
dc.subject.por.fl_str_mv |
Dados composicionais Modelo de regressão Coordenadas log-razão isométricas Seleção de variáveis |
topic |
Dados composicionais Modelo de regressão Coordenadas log-razão isométricas Seleção de variáveis Compositional data Regression model Isometric logratio coordinates Variable selection CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAO |
dc.subject.eng.fl_str_mv |
Compositional data Regression model Isometric logratio coordinates Variable selection |
dc.subject.cnpq.fl_str_mv |
CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA::REGRESSAO E CORRELACAO |
publishDate |
2018 |
dc.date.issued.fl_str_mv |
2018-12-10 |
dc.date.accessioned.fl_str_mv |
2019-02-27T17:19:46Z |
dc.date.available.fl_str_mv |
2019-02-27T17:19:46Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
SHIMIZU, Taciana Kisaki Oliveira. Penalized regression methods for compositional data. 2018. Tese (Doutorado em Estatística) – Universidade Federal de São Carlos, São Carlos, 2018. Disponível em: https://repositorio.ufscar.br/handle/ufscar/11034. |
dc.identifier.uri.fl_str_mv |
https://repositorio.ufscar.br/handle/ufscar/11034 |
url |
https://repositorio.ufscar.br/handle/ufscar/11034 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.confidence.fl_str_mv |
600 600 |
dc.relation.authority.fl_str_mv |
d0f3b31a-38c4-4c28-aa5b-837ad377108e |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus São Carlos |
dc.publisher.program.fl_str_mv |
Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs |
dc.publisher.initials.fl_str_mv |
UFSCar |
publisher.none.fl_str_mv |
Universidade Federal de São Carlos Câmpus São Carlos |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR |
instname_str |
Universidade Federal de São Carlos (UFSCAR) |
instacron_str |
UFSCAR |
institution |
UFSCAR |
reponame_str |
Repositório Institucional da UFSCAR |
collection |
Repositório Institucional da UFSCAR |
bitstream.url.fl_str_mv |
https://repositorio.ufscar.br/bitstream/ufscar/11034/1/v__final_ufscar.pdf https://repositorio.ufscar.br/bitstream/ufscar/11034/4/license.txt https://repositorio.ufscar.br/bitstream/ufscar/11034/5/v__final_ufscar.pdf.txt https://repositorio.ufscar.br/bitstream/ufscar/11034/6/v__final_ufscar.pdf.jpg |
bitstream.checksum.fl_str_mv |
a8008659b8efd772f5c8a4d30cbf1ea7 ae0398b6f8b235e40ad82cba6c50031d e868ce2257bdb01c8758f43e09d48017 d0e3c70fdf707f831e80138af85f1c6f |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR) |
_version_ |
1813715599976562688 |