Recycling of predictors used to estimate glomerular filtration rate: Insight into lateral collinearity

Bibliographic details
Main author: de Andrade, Luis Gustavo Modelli [UNESP]
Publication date: 2020
Other authors: Tedesco-Silva, Helio
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1371/journal.pone.0228842
http://hdl.handle.net/11449/198507
Abstract:
Background: One overlooked problem in statistical analysis is lateral collinearity, a phenomenon that may occur when the outcome variable is derived from the predictors. In nephrology, this issue arises when estimated glomerular filtration rate (eGFR) is used as an outcome and age, sex, and ethnicity are used as predictors. In this study with simulated data, we aim to illustrate this problem.
Methods: We randomly generated unrelated data to estimate eGFR by common equations.
Results: Using simulated data, we show that age, gender, and ethnicity (recycled predictor variables) are statistically significantly correlated with eGFR in linear regression analysis. Whereas the initial obvious conclusion is that age, sex, and ethnicity are strong predictors of eGFR, a more rigorous interpretation suggests that this is a byproduct of the mathematical model produced when deriving new predictors from another.
Conclusion: While statistical models have the ability to identify vertical collinearity (predictor-predictor), lateral collinearity (predictor-outcome) is seldom identified and discussed in statistical analysis. Therefore, caution is needed when interpreting the correlation of age, gender, and ethnicity with eGFR derived from regression analyses.
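The simulation idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record does not say which equations were used, so the 4-variable MDRD study equation is assumed here, and the function names (`mdrd_egfr`, `simulate`), sample size, value ranges, and group proportions are all hypothetical choices. Creatinine, age, sex, and ethnicity are drawn independently, so any association between the demographic variables and eGFR comes only from their reuse inside the equation.

```python
import numpy as np


def mdrd_egfr(scr_mg_dl, age_years, female, black):
    """4-variable MDRD study equation (assumed here), in mL/min/1.73 m^2."""
    return (
        175.0
        * scr_mg_dl ** -1.154
        * age_years ** -0.203
        * np.where(female, 0.742, 1.0)
        * np.where(black, 1.212, 1.0)
    )


def simulate(n=5000, seed=0):
    """Draw mutually independent inputs, derive eGFR, then regress eGFR on the recycled predictors."""
    rng = np.random.default_rng(seed)
    scr = rng.uniform(0.6, 1.5, n)   # serum creatinine, mg/dL (arbitrary range)
    age = rng.uniform(18, 90, n)     # years (arbitrary range)
    female = rng.random(n) < 0.5     # independent of creatinine and age
    black = rng.random(n) < 0.3      # independent of everything else

    egfr = mdrd_egfr(scr, age, female, black)

    # Ordinary least squares: eGFR ~ intercept + age + female + black.
    # Creatinine is deliberately left out, as when eGFR is treated as an outcome.
    X = np.column_stack([np.ones(n), age, female, black]).astype(float)
    coef, *_ = np.linalg.lstsq(X, egfr, rcond=None)
    return dict(zip(["intercept", "age", "female", "black"], coef))


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>9}: {value: .3f}")
```

Under these assumptions the fitted coefficients for age and female sex come out negative and the one for Black ethnicity positive, mirroring the exponents and multipliers built into the equation rather than any relationship present in the randomly generated inputs.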
id UNSP_087438b7710fe347b1bf1b99831fb793
oai_identifier_str oai:repositorio.unesp.br:11449/198507
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
Affiliations: Department of Internal Medicine, UNESP Univ Estadual Paulista; Hospital do Rim, Universidade Federal de São Paulo
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (Unesp)
Universidade Federal de São Paulo (UNIFESP)
publishDate 2020
dc.date.none.fl_str_mv 2020-12-12T01:14:44Z
2020-02-01
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1371/journal.pone.0228842
PLoS ONE, v. 15, n. 2, 2020.
1932-6203
http://hdl.handle.net/11449/198507
10.1371/journal.pone.0228842
2-s2.0-85079302639
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv PLoS ONE
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
_version_ 1797790357559181312