Improving imbalanced land cover classification with k-means smote

Bibliographic details
Main author: Fonseca, Joao
Publication date: 2021
Other authors: Douzas, Georgios; Bacao, Fernando
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/121176
Abstract: Fonseca, J., Douzas, G., & Bacao, F. (2021). Improving imbalanced land cover classification with k-means smote: Detecting and oversampling distinctive minority spectral signatures. Information (Switzerland), 12(7), 1-20. [266]. https://doi.org/10.3390/info12070266
id RCAP_f99c591312bd914513c87c62ac5d8eaa
oai_identifier_str oai:run.unl.pt:10362/121176
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Improving imbalanced land cover classification with k-means smote
Detecting and oversampling distinctive minority spectral signatures
Clustering
Data augmentation
Imbalanced learning
LULC classification
Oversampling
Information Systems
SDG 15 - Life on Land
Fonseca, J., Douzas, G., & Bacao, F. (2021). Improving imbalanced land cover classification with k-means smote: Detecting and oversampling distinctive minority spectral signatures. Information (Switzerland), 12(7), 1-20. [266]. https://doi.org/10.3390/info12070266
Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. With significant upsides, the automatic production of Land Use/Land Cover (LULC) maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge is the imbalanced nature of most remotely sensed data. The asymmetric class distribution negatively impacts the performance of classifiers and adds a new source of error to the production of these maps. In this paper, we address the imbalanced learning problem by combining K-means and the Synthetic Minority Oversampling Technique (SMOTE) into an improved oversampling algorithm. K-means SMOTE improves the quality of newly created artificial data by addressing not only the between-class imbalance, as traditional oversamplers do, but also the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance. The performance of K-means SMOTE is compared to three popular oversampling methods (Random Oversampling, SMOTE and Borderline-SMOTE) using seven remote sensing benchmark datasets, three classifiers (Logistic Regression, K-Nearest Neighbors and Random Forest Classifier) and three evaluation metrics, using a five-fold cross-validation approach with three different initialization seeds. The statistical analysis of the results shows that the proposed method consistently outperforms the remaining oversamplers, producing higher quality land cover classifications. These results suggest that LULC data can benefit significantly from the use of more sophisticated oversamplers, as spectral signatures for the same class can vary according to geographical distribution.
NOVA Information Management School (NOVA IMS)
Information Management Research Center (MagIC) - NOVA Information Management School
RUN
Fonseca, Joao
Douzas, Georgios
Bacao, Fernando
2021-07-16T22:21:57Z
2021-07-01
2021-07-01T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
20
application/pdf
http://hdl.handle.net/10362/121176
eng
2078-2489
PURE: 32603681
https://doi.org/10.3390/info12070266
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2024-03-11T05:03:30Z
oai:run.unl.pt:10362/121176
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-20T03:44:32.896976
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
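Illustrative note (not part of the original record): the abstract in the spelling field says K-means SMOTE handles within-class imbalance by clustering before oversampling, so that synthetic samples are not placed in noisy regions. The record gives no algorithmic detail, so the following is only a minimal sketch under stated assumptions. The function names (`kmeans_smote`, `kmeans`, `dist2`), the `min_share` threshold, the farthest-point initialisation, the round-robin allocation across clusters, and the random pairing inside a cluster are all hypothetical simplifications chosen for illustration. They are not the authors' implementation, which may weight clusters and pick neighbours differently.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans(points, k, iters=10):
    """Lloyd's k-means; returns one cluster index per point."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def kmeans_smote(X, y, minority, n_new, k=3, min_share=0.5, seed=0):
    """Simplified sketch: cluster, filter clusters, interpolate within clusters."""
    rng = random.Random(seed)
    labels = kmeans(X, k)
    # Keep only clusters dominated by the minority class (assumed threshold).
    pools = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        mino = [X[i] for i in idx if y[i] == minority]
        if len(mino) >= 2 and len(mino) / len(idx) >= min_share:
            pools.append(mino)
    if not pools:
        return []
    # SMOTE-style step: interpolate between two minority points of one cluster.
    # Round-robin allocation and random pairing are simplifications.
    synthetic = []
    for j in range(n_new):
        a, b = rng.sample(pools[j % len(pools)], 2)
        t = rng.random()
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

# Toy data (made up): one majority blob and two distant minority groups.
majority = [(0.5 * i, 0.5 * j) for i in range(4) for j in range(3)]
group_a = [(10.0 + 0.5 * i, 10.0 + 0.3 * i) for i in range(4)]
group_b = [(-10.0 - 0.5 * i, 10.0 + 0.3 * i) for i in range(3)]
X = majority + group_a + group_b
y = [0] * len(majority) + [1] * (len(group_a) + len(group_b))

synthetic = kmeans_smote(X, y, minority=1, n_new=6)
```

In this toy setup every synthetic sample is interpolated inside a single minority group. Interpolating across the whole minority class, as a plain SMOTE-style step could, might pair a point from one group with a point from the other and place a sample in the gap between them.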
dc.title.none.fl_str_mv Improving imbalanced land cover classification with k-means smote
Detecting and oversampling distinctive minority spectral signatures
title Improving imbalanced land cover classification with k-means smote
spellingShingle Improving imbalanced land cover classification with k-means smote
Fonseca, Joao
Clustering
Data augmentation
Imbalanced learning
LULC classification
Oversampling
Information Systems
SDG 15 - Life on Land
title_short Improving imbalanced land cover classification with k-means smote
title_full Improving imbalanced land cover classification with k-means smote
title_fullStr Improving imbalanced land cover classification with k-means smote
title_full_unstemmed Improving imbalanced land cover classification with k-means smote
title_sort Improving imbalanced land cover classification with k-means smote
author Fonseca, Joao
author_facet Fonseca, Joao
Douzas, Georgios
Bacao, Fernando
author_role author
author2 Douzas, Georgios
Bacao, Fernando
author2_role author
author
dc.contributor.none.fl_str_mv NOVA Information Management School (NOVA IMS)
Information Management Research Center (MagIC) - NOVA Information Management School
RUN
dc.contributor.author.fl_str_mv Fonseca, Joao
Douzas, Georgios
Bacao, Fernando
dc.subject.por.fl_str_mv Clustering
Data augmentation
Imbalanced learning
LULC classification
Oversampling
Information Systems
SDG 15 - Life on Land
topic Clustering
Data augmentation
Imbalanced learning
LULC classification
Oversampling
Information Systems
SDG 15 - Life on Land
description Fonseca, J., Douzas, G., & Bacao, F. (2021). Improving imbalanced land cover classification with k-means smote: Detecting and oversampling distinctive minority spectral signatures. Information (Switzerland), 12(7), 1-20. [266]. https://doi.org/10.3390/info12070266
publishDate 2021
dc.date.none.fl_str_mv 2021-07-16T22:21:57Z
2021-07-01
2021-07-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/121176
url http://hdl.handle.net/10362/121176
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 2078-2489
PURE: 32603681
https://doi.org/10.3390/info12070266
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 20
application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138053252448256