Soybean yield prediction by machine learning and climate

Bibliographic details
Main author: Torsoni, Guilherme Botega
Publication date: 2023
Other authors: de Oliveira Aparecido, Lucas Eduardo; dos Santos, Gabriela Marins; Chiquitto, Alisson Gaspar; da Silva Cabral Moraes, José Reinaldo; de Souza Rolim, Glauco [UNESP]
Document type: Article
Language: eng
Source title: Repositório Institucional da UNESP
Full text: http://dx.doi.org/10.1007/s00704-022-04341-9
http://hdl.handle.net/11449/246606
Abstract: Soybean cultivation plays an important role in Mato Grosso do Sul (MS) and around the world. Given the inherent complexity of the agricultural system, this study aimed to develop climate-based yield prediction models using machine learning (ML), considering the meteorological variables most correlated with yield for each condition; to test the best model with independent data; and to define zones of higher soybean yield in MS in order to recommend better planting sites. The study was carried out in two stages. First, meteorological and soybean yield data from 47 locations in the state were used to calibrate the ML algorithms. Second, the best algorithm was used to predict soybean yields throughout the state. Daily meteorological data from the NASA-POWER system for 2002 to 2021 were used: air temperature (T, °C), precipitation (P, mm), global solar irradiance (Qg, MJ m−2 day−1), wind speed (u2, m s−1), net radiation (Rn, MJ m−2 day−1), and relative humidity (RH, %). Reference evapotranspiration (ETo) by the standard FAO method and the water balance (WB) of Thornthwaite and Mather (1955) were calculated for each collection point. The ML algorithms used in this stage were multiple linear regression (MLR), multilayer perceptron (MLP), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBOOSTING), and gradient boosting (GradBOOSTING). The models were calibrated using 70% of the data for training and 30% for validation. Algorithms were evaluated by accuracy, precision, and tendency. All analyses were performed in Python 3.8. Climate variables showed high spatial and seasonal variability throughout MS. Pearson's univariate correlations between soybean yield and climate variables during the phenological period differed in both sign and strength. For instance, soil water storage (ARM) showed negative, neutral, and positive correlations in October, November, and December, respectively. The calibrated ML algorithms showed high precision and accuracy in both calibration and testing. The best model in calibration was XGBOOSTING, with MAPE, R2, RMSE, MSE, and MAE values of 1.84%, 0.95, 2.06%, 4.24%, and 0.921%, respectively. RF, XGBOOSTING, and GradBOOSTING were the most precise algorithms, with R2 values of 0.71, 0.62, and 0.62 in the test, respectively.
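The evaluation steps named in the abstract (a 70/30 split, Pearson correlation, and the MAPE, R2, RMSE, MSE, and MAE metrics) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the shuffling seed, and the choice of R2 as the coefficient of determination (rather than a squared Pearson correlation) are assumptions made here for illustration only.

```python
import math
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and return (train, test) holding 70% and 30% of them.

    The seed is a hypothetical choice; the record does not state one.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def regression_metrics(observed, predicted):
    """Return MAPE (%), R2, RMSE, MSE, and MAE for paired values.

    R2 is computed as 1 - SS_res / SS_tot (an assumption, see lead-in).
    Observed values must be non-zero for MAPE to be defined.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAPE": mape, "R2": r2, "RMSE": rmse, "MSE": mse, "MAE": mae}
```

In practice the same quantities are available from libraries such as scikit-learn; the hand-written versions are shown only to make the definitions explicit.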
id UNSP_5bdfef65ae7369188e9efaa5f605deb1
oai_identifier_str oai:repositorio.unesp.br:11449/246606
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling Soybean yield prediction by machine learning and climate
Instituto Federal de Educação Ciência E Tecnologia de Mato Grosso Do Sul IFMS, Campus de Naviraí
Instituto Federal do Sul de Minas Gerais (IFSULDEMINAS), Campus Muzambinho
Universidade Estadual de São Paulo (FCAV/UNESP), Jaboticabal
dc.title.none.fl_str_mv Soybean yield prediction by machine learning and climate
dc.contributor.none.fl_str_mv IFMS
Instituto Federal do Sul de Minas Gerais (IFSULDEMINAS)
Universidade Estadual Paulista (UNESP)
dc.contributor.author.fl_str_mv Torsoni, Guilherme Botega
de Oliveira Aparecido, Lucas Eduardo
dos Santos, Gabriela Marins
Chiquitto, Alisson Gaspar
da Silva Cabral Moraes, José Reinaldo
de Souza Rolim, Glauco [UNESP]
publishDate 2023
dc.date.none.fl_str_mv 2023-07-29T12:45:33Z
2023-07-29T12:45:33Z
2023-02-01
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1007/s00704-022-04341-9
Theoretical and Applied Climatology, v. 151, n. 3-4, p. 1709-1725, 2023.
1434-4483
0177-798X
http://hdl.handle.net/11449/246606
10.1007/s00704-022-04341-9
2-s2.0-85145751001
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv Theoretical and Applied Climatology
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 1709-1725
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)