Regularization Methods for High-Dimensional Data as a Tool for Seafood Traceability

Bibliographic details
Main author: Yokochi, Clara
Publication date: 2023
Other authors: Bispo, Regina; Ricardo, Fernando; Calado, Ricardo
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text: http://hdl.handle.net/10362/164075
Funding information: Open access funding provided by FCT|FCCN (b-on). This work is funded by national funds through the FCT - Fundação para a Ciência e a Tecnologia, I.P., under the scope of the projects UIDB/00297/2020 and UIDP/00297/2020 (Center for Mathematics and Applications). Publisher copyright: © 2023, The Author(s).
Keywords: Elastic net; LASSO; Regularization; Ridge regression; Traceability; Statistics and Probability; SDG 3 - Good Health and Well-being; SDG 14 - Life Below Water
Contributors: CMA - Centro de Matemática e Aplicações; DM - Departamento de Matemática; RUN
Date issued: 2023-09
Version: published version
Format: 21; application/pdf
Related identifiers: 1559-8608; PURE: 83886940; https://doi.org/10.1007/s42519-023-00341-8
Rights: open access
Abstract: Seafood traceability is needed to regulate food safety, control fisheries, combat fraud, and protect public health from seafood harvested in polluted locations, and it depends heavily on predicting the geographic origin of seafood. When the datasets available to study traceability are high-dimensional, standard classical statistical models fail. Under these circumstances, alternative methods are needed to predict the geographic origin of seafood accurately. In this study, we propose an analytical approach that combines regularization methods and resampling techniques to overcome the high-dimensionality problem. In particular, we compare the Ridge regression, LASSO and Elastic net penalty-based approaches. These methods were applied to predict the origin of the saltwater clam Ruditapes philippinarum, a non-indigenous and commercially very relevant marine bivalve species that occurs commonly in European estuaries. Further, Monte Carlo Cross-Validation was used as the resampling method to overcome challenges related to the small sample size. The results of the three methods were compared. For full reproducibility, an R Markdown file and the dataset used are provided. We conclude by highlighting the insights that this methodology may bring to modeling a multi-categorical response based on a high-dimensional dataset with highly correlated explanatory variables, and to combating the mislabeling of the geographic origin of seafood.
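The two ingredients named in the abstract, penalty-based regularization and Monte Carlo Cross-Validation, can be illustrated with a minimal sketch. It is not the authors' R Markdown code. The function names, the glmnet-style parameterization of the penalty (mixing parameter `alpha`, strength `lam`), and the split sizes are all assumptions made for illustration.

```python
import random


def elastic_net_penalty(coefs, lam, alpha):
    """Elastic net penalty, glmnet-style parameterization (assumed here).

    alpha = 0 reduces to the ridge penalty (half the squared L2 norm),
    alpha = 1 reduces to the LASSO penalty (L1 norm).
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def monte_carlo_cv_splits(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_idx, test_idx) from repeated, independent random splits.

    Unlike k-fold cross-validation, the test sets of different repeats
    may overlap, because every repeat reshuffles all observations.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


if __name__ == "__main__":
    beta = [0.5, -2.0, 0.0]  # hypothetical coefficient vector
    print("LASSO penalty:", elastic_net_penalty(beta, lam=1.0, alpha=1.0))
    print("Ridge penalty:", elastic_net_penalty(beta, lam=1.0, alpha=0.0))
    for train_idx, test_idx in monte_carlo_cv_splits(10, 3, 0.3, seed=42):
        print("train:", train_idx, "test:", test_idx)
```

In a full analysis, each split would be used to fit a penalized multinomial model on the training indices and score it on the test indices, and the accuracies would then be averaged over the repeats.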