IMPACT OF THE COVID-19 PANDEMIC AND MACHINE LEARNING MODELS FOR PREDICTING PREMATURE BIRTHS IN THE CAPITALS OF THE NORTHEAST REGION OF BRAZIL, 2018-2021

Bibliographic details
Main author: Matapi, José Maurício Matapi da Silva
Publication date: 2023
Other authors: Veiga da Costa, Heitor Victor; de Paula Neto, Fernando Maciano
Document type: Article
Language: Portuguese
Source title: Cadernos de Estudos Sociais (Online)
Full text: https://periodicos.fundaj.gov.br/CAD/article/view/2122
Abstract: Premature birth is a global problem because of its implications for morbidity and mortality, and it is one of the main risk factors for neonatal and infant mortality. Preterm delivery is defined as one in which the pregnancy ends between the 20th and 37th weeks, or between 140 and 257 days after the first day of the last menstrual period. This study used data from the Information System on Live Births (SINASC) for the capitals of the Northeast region of Brazil between 2018 and 2021. It examined whether the first two years of the COVID-19 pandemic significantly affected the distributions of the performance metrics, compared with the data used for training and validating the models. Six machine learning algorithms (Logistic Regression, Linear Discriminant Analysis, Multilayer Perceptron, AdaBoost, Decision Tree and Random Forest) were applied to predict prematurity. The models showed a drop in the Area Under the ROC Curve (AUC) in 2020 and 2021 relative to 2018 and 2019, most notably AdaBoost, Random Forest and Decision Tree, whose drops exceeded 10% and were confirmed by the Kruskal-Wallis and Nemenyi statistical tests. The drop in performance was attributed to two variables, month of the start of prenatal care and age, which lost adherence to the training set. The models showed good predictive performance; however, tree-based models should be used with caution, since they are more unstable and COVID-19 affected the distribution of the variables age and month of the start of prenatal care. When training new models, attention should be paid to the input variables and to the period used for training. For solutions already in place, retraining should be considered. KEYWORDS: Prematurity. Health. Artificial Intelligence. Machine Learning. COVID-19.
id FUNDAJ-0_22876a3115e9d67ceb517b4e5f5cd6a7
oai_identifier_str oai:ojs.emnuvens.com.br:article/2122
network_acronym_str FUNDAJ-0
network_name_str Cadernos de Estudos Sociais (Online)
repository_id_str
dc.title.none.fl_str_mv IMPACT OF THE COVID-19 PANDEMIC AND MACHINE LEARNING MODELS FOR PREDICTING PREMATURE BIRTHS IN THE CAPITALS OF THE NORTHEAST REGION OF BRAZIL, 2018-2021
IMPACTO DE LA PANDEMIA POR COVID-19 Y MODELOS DE APRENDIZAJE DE MÁQUINA PARA PREDICCIÓN DE NACIMIENTOS PREMATUROS EN LAS CAPITALES DE LA REGIÓN NORDESTE DE BRASIL, 2018-2021
IMPACTO DA PANDEMIA PELA COVID-19 E MODELOS DE APRENDIZAGEM DE MÁQUINA PARA PREDIÇÃO DE NASCIMENTOS PREMATUROS NAS CAPITAIS DA REGIÃO NORDESTE DO BRASIL, 2018-2021
dc.contributor.author.fl_str_mv Matapi, José Maurício Matapi da Silva
Veiga da Costa, Heitor Victor
de Paula Neto, Fernando Maciano
publishDate 2023
dc.date.none.fl_str_mv 2023-04-11
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://periodicos.fundaj.gov.br/CAD/article/view/2122
10.33148/CESv37n1(2022)2122
url https://periodicos.fundaj.gov.br/CAD/article/view/2122
identifier_str_mv 10.33148/CESv37n1(2022)2122
dc.language.iso.fl_str_mv por
language por
dc.relation.none.fl_str_mv https://periodicos.fundaj.gov.br/CAD/article/view/2122/1695
dc.rights.driver.fl_str_mv Copyright (c) 2023 Author, granting the journal the right of first publication
https://creativecommons.org/licenses/by/4.0
info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Fundação Joaquim Nabuco | Diretoria de Pesquisas Sociais
dc.source.none.fl_str_mv Cadernos de Estudos Sociais; v. 37 n. 1 (2022): A mortalidade materna, fetal e infantil e atuação da vigilância do óbito no contexto da pandemia de COVID-19
2595-4091
0102-4248
10.33148/CESv37n1(2022)full
reponame:Cadernos de Estudos Sociais (Online)
instname:Fundação Joaquim Nabuco (FUNDAJ)
instacron:FUNDAJ
instname_str Fundação Joaquim Nabuco (FUNDAJ)
instacron_str FUNDAJ
institution FUNDAJ
reponame_str Cadernos de Estudos Sociais (Online)
collection Cadernos de Estudos Sociais (Online)
repository.name.fl_str_mv Cadernos de Estudos Sociais (Online) - Fundação Joaquim Nabuco (FUNDAJ)
repository.mail.fl_str_mv beatriz.mesquita@fundaj.gov.br
_version_ 1798042197081194496