IMPACT OF THE COVID-19 PANDEMIC AND MACHINE LEARNING MODELS FOR PREDICTING PREMATURE BIRTHS IN THE CAPITALS OF THE NORTHEAST REGION OF BRAZIL, 2018-2021
Main author: | Matapi, José Maurício Matapi da Silva |
---|---|
Publication date: | 2023 |
Other authors: | Veiga da Costa, Heitor Victor; de Paula Neto, Fernando Maciano |
Document type: | Article |
Language: | por |
Source title: | Cadernos de Estudos Sociais (Online) |
Full text: | https://periodicos.fundaj.gov.br/CAD/article/view/2122 |
Abstract: | Premature birth is a global problem because of its implications for morbidity and mortality, and it is one of the main risk factors for neonatal and infant mortality. Preterm delivery is defined as one in which the pregnancy ends between the 20th and 37th weeks, or between 140 and 257 days after the first day of the last menstrual period. This study used data from the Information System on Live Births (SINASC) for the capitals of the Northeast region of Brazil between 2018 and 2021. It assessed whether the first two years of the COVID-19 pandemic significantly affected the distributions of the performance metrics, compared with the data used for training and validation of the models. Six machine learning algorithms (Logistic Regression, Linear Discriminant Analysis, Multilayer Perceptron, AdaBoost, Decision Tree and Random Forest) were applied to predict prematurity. The models showed a drop in the Area Under the ROC Curve (AUC) in 2020 and 2021 compared with 2018 and 2019, most notably the AdaBoost, Random Forest and Decision Tree models, with drops greater than 10% confirmed by the Kruskal-Wallis and Nemenyi statistical tests. The drop in performance was traced to the variables month of beginning of prenatal care and age, whose distributions no longer matched the training set. The models showed good predictive performance; however, tree-based models should be used with caution, since they are more unstable and COVID-19 affected the distribution of the variables age and month of beginning of prenatal care. When training new models, attention should be paid to the input variables and the period used for training. For solutions already in place, retraining should be considered. KEYWORDS: Prematurity. Health. Artificial Intelligence. Machine Learning. COVID-19. |
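The abstract compares AUC across years and checks the difference with the Kruskal-Wallis test. The following is a minimal sketch of those two calculations in plain Python, not the authors' implementation: the function names are hypothetical, the AUC values in the usage example are made-up toy numbers rather than results from the study, and the Nemenyi post-hoc test is not covered.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def roc_auc(y_true, scores):
    """AUC computed through the rank-sum (Mann-Whitney U) identity."""
    ranks = average_ranks(scores)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of equal values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


if __name__ == "__main__":
    # Toy labels (1 = preterm) and model scores, for illustration only.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Hypothetical per-fold AUC values for two evaluation years.
    toy_auc_by_year = {2018: [0.80, 0.81, 0.79], 2020: [0.70, 0.69, 0.71]}
    print(kruskal_wallis_h(list(toy_auc_by_year.values())))
```

With k groups, H is compared against a chi-square distribution with k - 1 degrees of freedom; a library routine such as `scipy.stats.kruskal` returns the statistic together with the p-value.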
id |
FUNDAJ-0_22876a3115e9d67ceb517b4e5f5cd6a7 |
---|---|
oai_identifier_str |
oai:ojs.emnuvens.com.br:article/2122 |
network_acronym_str |
FUNDAJ-0 |
network_name_str |
Cadernos de Estudos Sociais (Online) |
repository_id_str |
|
dc.title.none.fl_str_mv |
IMPACT OF THE COVID-19 PANDEMIC AND MACHINE LEARNING MODELS FOR PREDICTING PREMATURE BIRTHS IN THE CAPITALS OF THE NORTHEAST REGION OF BRAZIL, 2018-2021 IMPACTO DE LA PANDEMIA POR COVID-19 Y MODELOS DE APRENDIZAJE DE MÁQUINA PARA PREDICCIÓN DE NACIMIENTOS PREMATUROS EN LAS CAPITALES DE LA REGIÓN NORDESTE DE BRASIL, 2018-2021 IMPACTO DA PANDEMIA PELA COVID-19 E MODELOS DE APRENDIZAGEM DE MÁQUINA PARA PREDIÇÃO DE NASCIMENTOS PREMATUROS NAS CAPITAIS DA REGIÃO NORDESTE DO BRASIL, 2018-2021 |
title |
IMPACT OF THE COVID-19 PANDEMIC AND MACHINE LEARNING MODELS FOR PREDICTING PREMATURE BIRTHS IN THE CAPITALS OF THE NORTHEAST REGION OF BRAZIL, 2018-2021 |
author |
Matapi, José Maurício Matapi da Silva |
author_facet |
Matapi, José Maurício Matapi da Silva; Veiga da Costa, Heitor Victor; de Paula Neto, Fernando Maciano |
author_role |
author |
author2 |
Veiga da Costa, Heitor Victor; de Paula Neto, Fernando Maciano |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Matapi, José Maurício Matapi da Silva; Veiga da Costa, Heitor Victor; de Paula Neto, Fernando Maciano |
publishDate |
2023 |
dc.date.none.fl_str_mv |
2023-04-11 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://periodicos.fundaj.gov.br/CAD/article/view/2122 10.33148/CESv37n1(2022)2122 |
url |
https://periodicos.fundaj.gov.br/CAD/article/view/2122 |
identifier_str_mv |
10.33148/CESv37n1(2022)2122 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
https://periodicos.fundaj.gov.br/CAD/article/view/2122/1695 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2023 Author, granting the journal the right of first publication https://creativecommons.org/licenses/by/4.0 info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Copyright (c) 2023 Author, granting the journal the right of first publication https://creativecommons.org/licenses/by/4.0 |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Fundação Joaquim Nabuco | Diretoria de Pesquisas Sociais |
publisher.none.fl_str_mv |
Fundação Joaquim Nabuco | Diretoria de Pesquisas Sociais |
dc.source.none.fl_str_mv |
Cadernos de Estudos Sociais; v. 37 n. 1 (2022): Maternal, fetal and infant mortality and the work of death surveillance in the context of the COVID-19 pandemic 2595-4091 0102-4248 10.33148/CESv37n1(2022)full reponame:Cadernos de Estudos Sociais (Online) instname:Fundação Joaquim Nabuco (FUNDAJ) instacron:FUNDAJ |
instname_str |
Fundação Joaquim Nabuco (FUNDAJ) |
instacron_str |
FUNDAJ |
institution |
FUNDAJ |
reponame_str |
Cadernos de Estudos Sociais (Online) |
collection |
Cadernos de Estudos Sociais (Online) |
repository.name.fl_str_mv |
Cadernos de Estudos Sociais (Online) - Fundação Joaquim Nabuco (FUNDAJ) |
repository.mail.fl_str_mv |
beatriz.mesquita@fundaj.gov.br||beatriz.mesquita@fundaj.gov.br |
_version_ |
1798042197081194496 |