Preprocessing procedures and supervised classification applied to a database of systematic soil survey

Bibliographic details
Main author: Valadares, Alan Pessoa
Publication date: 2019
Other authors: Coelho, Ricardo Marques; Oliveira, Stanley Robson de Medeiros
Document type: Article
Language: eng
Source title: Scientia Agrícola (Online)
Full text: http://old.scielo.br/scielo.php?script=sci_arttext&pid=S0103-90162019001500439
Abstract: Data mining techniques play an important role in predicting the spatial distribution of soils in systematic soil surveying, though existing methodologies still lack standardization and their capabilities are not yet fully understood. The aim of this work was to evaluate the performance of preprocessing procedures and supervised classification approaches for predicting map units from 1:100,000-scale conventional semi-detailed soil surveys. Three 1:50,000-scale sheets of the Brazilian National Cartographic System were used to develop the models: “Dois Córregos” (“Brotas” 1:100,000-scale sheet), and “São Pedro” and “Laras” (“Piracicaba” 1:100,000-scale sheet). Soil map information and predictive environmental covariates for the dataset were obtained from the semi-detailed soil survey of the state of São Paulo, from the Brazilian Institute of Geography and Statistics (IBGE) 1:50,000-scale topographic sheets and from the 1:750,000-scale geological map of the state of São Paulo. The target variable was a soil map unit of four types: the local “soil unit” name and the soil class at three hierarchical levels of the Brazilian System of Soil Classification (SiBCS). Different data preprocessing treatments and four algorithms based on different approaches were also tested. Results showed that composite soil map units were not adequate for the machine learning process. Class balancing did not improve the performance of the classifiers. An accuracy of 78 % and a Kappa index of 0.67 were obtained after preprocessing with Random Forest, the best-performing algorithm. Information from conventional map units of the semi-detailed (4th order) 1:100,000 soil survey generated models whose accuracy, precision, sensitivity, specificity and Kappa index values support their use in programs for systematic soil surveying.
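The abstract reports accuracy, precision, sensitivity, specificity and the Kappa index without showing how they relate. The sketch below is an illustration only, not the authors' code: it computes these measures from a multiclass confusion matrix using the standard definitions (Kappa here is Cohen's kappa). The function name `classification_metrics` and the small example matrix are hypothetical and are not taken from the study.

```python
def classification_metrics(cm):
    """Compute overall accuracy, Cohen's kappa and per-class measures.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (a square confusion matrix as nested lists).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                      # true counts per class
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    # Observed agreement: fraction of samples on the diagonal.
    accuracy = sum(cm[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / total ** 2

    # Kappa rescales accuracy so that 0 means chance-level agreement.
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    per_class = []
    for i in range(k):
        tp = cm[i][i]
        fn = row_sums[i] - tp
        fp = col_sums[i] - tp
        tn = total - tp - fn - fp
        per_class.append({
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        })
    return accuracy, kappa, per_class


if __name__ == "__main__":
    # Hypothetical two-class matrix, for illustration only.
    example_cm = [[50, 10], [5, 35]]
    acc, kappa, per_class = classification_metrics(example_cm)
    print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
    for idx, scores in enumerate(per_class):
        print(idx, scores)
```

Because Kappa subtracts the agreement expected from the class marginals, it is lower than raw accuracy whenever some agreement could arise by chance, which is consistent with the abstract reporting a Kappa index (0.67) below the accuracy (78 %).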
id USP-18_40a82020c1ee62ba97049a80f3c85585
oai_identifier_str oai:scielo:S0103-90162019001500439
network_acronym_str USP-18
network_name_str Scientia Agrícola (Online)
spelling Preprocessing procedures and supervised classification applied to a database of systematic soil survey | Scientia Agricola v.76 n.5 2019 | 2019-05-21T00:00:00Z | http://revistas.usp.br/sa/index | https://old.scielo.br/oai/scielo-oai.php | ISSN 1678-992X, 0103-9016 | opendoar:2019-05-21T00:00
dc.subject.por.fl_str_mv machine learning algorithms
random forest
tacit soil-landscape relationships
digital soil mapping
dc.date.none.fl_str_mv 2019-10-01
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.relation.none.fl_str_mv 10.1590/1678-992x-2017-0171
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv text/html
dc.publisher.none.fl_str_mv Escola Superior de Agricultura "Luiz de Queiroz"
dc.source.none.fl_str_mv Scientia Agricola v.76 n.5 2019
reponame:Scientia Agrícola (Online)
instname:Universidade de São Paulo (USP)
instacron:USP
repository.mail.fl_str_mv scientia@usp.br||alleoni@usp.br