Strategic procedure in three stages for the selection of variables to obtain balanced results in public health research

Bibliographic details
Main author: Lozano, Manuel
Publication date: 2018
Other authors: Manyes, Lara; Peiró, Juanjo; Iftimi, Adina; Ramada, José María
Document type: Article
Language: eng
Source title: Cadernos de Saúde Pública
Full text: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0102-311X2018000704001
Abstract: Multidisciplinary research in public health draws on methods from many scientific disciplines, and one of its main characteristics is the handling of large data sets. Classic statistical variable selection methods, known as “screen and clean” and applied in a single step, select the variables with the greatest explanatory weight in the model. These methods, commonly used in public health research, may induce masking and multicollinearity, excluding variables that experts in each discipline consider relevant and skewing the result. Specific techniques such as penalized regression and Bayesian statistics are used to address this problem; they offer more balanced results among subsets of variables, but with less restrictive selection thresholds. This manuscript proposes a three-step procedure based on a combination of classical methods. It captures the relevant variables of each scientific discipline, minimizes the number of variables selected in each, and obtains a balanced distribution that explains most of the variability. The procedure was applied to a dataset from a public health study. Compared with the single-step methods, the proposed method shows a greater reduction in the number of variables, as well as a balanced distribution among the scientific disciplines associated with the response variable. We propose an innovative procedure for variable selection, apply it to our dataset, and compare it with the classic single-step procedures.
id FIOCRUZ-5_91121cc8365887c979281b170fa46de2
oai_identifier_str oai:scielo:S0102-311X2018000704001
network_acronym_str FIOCRUZ-5
network_name_str Cadernos de Saúde Pública
repository_id_str
spelling Statistics as Topic | Methods | Interdisciplinary Research | Escola Nacional de Saúde Pública Sergio Arouca, Fundação Oswaldo Cruz | 2018-01-01 | info:eu-repo/semantics/article | info:eu-repo/semantics/publishedVersion | text/html | http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0102-311X2018000704001 | Cadernos de Saúde Pública v.34 n.7 2018 | reponame:Cadernos de Saúde Pública | instname:Fundação Oswaldo Cruz (FIOCRUZ) | instacron:FIOCRUZ | 10.1590/0102-311x00174017 | info:eu-repo/semantics/openAccess | Lozano,Manuel | Manyes,Lara | Peiró,Juanjo | Iftimi,Adina | Ramada,José María | eng | 2018-08-29T00:00:00Z | oai:scielo:S0102-311X2018000704001 | Revista | http://cadernos.ensp.fiocruz.br/csp/ | https://old.scielo.br/oai/scielo-oai.php | cadernos@ensp.fiocruz.br||cadernos@ensp.fiocruz.br | 1678-4464 | 0102-311X | opendoar:2018-08-29T00:00 | Cadernos de Saúde Pública - Fundação Oswaldo Cruz (FIOCRUZ) | false
dc.title.none.fl_str_mv Strategic procedure in three stages for the selection of variables to obtain balanced results in public health research
author Lozano,Manuel
author_facet Lozano,Manuel
Manyes,Lara
Peiró,Juanjo
Iftimi,Adina
Ramada,José María
author_role author
author2 Manyes,Lara
Peiró,Juanjo
Iftimi,Adina
Ramada,José María
author2_role author
author
author
author
dc.contributor.author.fl_str_mv Lozano,Manuel
Manyes,Lara
Peiró,Juanjo
Iftimi,Adina
Ramada,José María
dc.subject.por.fl_str_mv Statistics as Topic
Methods
Interdisciplinary Research
publishDate 2018
dc.date.none.fl_str_mv 2018-01-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0102-311X2018000704001
url http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0102-311X2018000704001
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1590/0102-311x00174017
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv text/html
dc.publisher.none.fl_str_mv Escola Nacional de Saúde Pública Sergio Arouca, Fundação Oswaldo Cruz
publisher.none.fl_str_mv Escola Nacional de Saúde Pública Sergio Arouca, Fundação Oswaldo Cruz
dc.source.none.fl_str_mv Cadernos de Saúde Pública v.34 n.7 2018
reponame:Cadernos de Saúde Pública
instname:Fundação Oswaldo Cruz (FIOCRUZ)
instacron:FIOCRUZ
instname_str Fundação Oswaldo Cruz (FIOCRUZ)
instacron_str FIOCRUZ
institution FIOCRUZ
reponame_str Cadernos de Saúde Pública
collection Cadernos de Saúde Pública
repository.name.fl_str_mv Cadernos de Saúde Pública - Fundação Oswaldo Cruz (FIOCRUZ)
repository.mail.fl_str_mv cadernos@ensp.fiocruz.br||cadernos@ensp.fiocruz.br
_version_ 1754115738906394624