Machine learning approach to support taxonomic species discrimination based on helminth collections data

Bibliographic details
Main author: Borba, Victor Hugo
Publication date: 2021
Other authors: Martin, Coralie, Silva, José Roberto Machado, Xavier, Samanta C. C., Mello, Flávio L. de, Iñiguez, Alena Mayo
Document type: Article
Language: eng
Source title: Repositório Institucional da FIOCRUZ (ARCA)
Full text: https://www.arca.fiocruz.br/handle/icict/47869
Abstract: Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Biologia de Tripanosomatídeos. Rio de Janeiro, RJ, Brasil / Universidade do Estado do Rio de Janeiro. Faculdade de Ciências Médicas. Laboratório de Helmintologia Romero Lascasas Porto. Rio de Janeiro, RJ, Brasil.
id CRUZ_1d45cf82f262f0aa10ec7bd6da60c117
oai_identifier_str oai:www.arca.fiocruz.br:icict/47869
network_acronym_str CRUZ
network_name_str Repositório Institucional da FIOCRUZ (ARCA)
repository_id_str 2135
spelling Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Biologia de Tripanosomatídeos. Rio de Janeiro, RJ, Brasil / Universidade do Estado do Rio de Janeiro. Faculdade de Ciências Médicas. Laboratório de Helmintologia Romero Lascasas Porto. Rio de Janeiro, RJ, Brasil.
Unité Molécules de Communication et Adaptation des Microorganismes (MCAM, UMR 7245), Muséum National d’Histoire Naturelle, CNRS, CP52, Paris, France.
Universidade do Estado do Rio de Janeiro. Faculdade de Ciências Médicas. Laboratório de Helmintologia Romero Lascasas Porto. Rio de Janeiro, RJ, Brasil.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Biologia de Tripanosomatídeos. Rio de Janeiro, RJ, Brasil.
Universidade Federal do Rio de Janeiro. Departamento de Engenheira Eletrônica e Computação. Rio de Janeiro, RJ, Brasil.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Biologia de Tripanosomatídeos. Rio de Janeiro, RJ, Brasil.
Background: There are more than 300 species of capillariids that parasitize various vertebrate groups worldwide. Species identification is hindered because of the few taxonomically informative structures available, making the task laborious and genus definition controversial. Thus, its taxonomy is one of the most complex among Nematoda. Eggs are the parasitic structures most viewed in coprological analysis in both modern and ancient samples; consequently, their presence is indicative of positive diagnosis for infection. The structure of the egg could play a role in genera or species discrimination. Institutional biological collections are taxonomic repositories of specimens described and strictly identified by systematics specialists.
Methods: The present work aims to characterize eggs of capillariid species deposited in institutional helminth collections and to process the morphological, morphometric and ecological data using machine learning (ML) as a new approach for taxonomic identification. Specimens of 28 species and 8 genera deposited at Coleção Helmintológica do Instituto Oswaldo Cruz (CHIOC, IOC/FIOCRUZ/Brazil) and Collection de Nématodes Zooparasites du Muséum National d’Histoire Naturelle de Paris (MNHN/France) were examined under light microscopy. In the morphological and morphometric analyses (MM), the total length and width of eggs as well as plugs and shell thickness were considered. In addition, eggshell ornamentations and ecological parameters of the geographical location (GL) and host (H) were included.
Results: The performance of the logistic model tree (LMT) algorithm showed the highest values in all metrics compared with the other algorithms. Algorithm J48 produced the most reliable decision tree for species identification alongside REPTree. The Majority Voting algorithm showed high metric values, but the combined classifiers did not attenuate the errors revealed in each algorithm alone. The statistical evaluation of the dataset indicated a significant difference between trees, with GL+H+MM and MM only with the best scores.
Conclusions: The present research proposed a novel procedure for taxonomic species identification, integrating data from centenary biological collections and the logic of artificial intelligence techniques. This study will support future research on taxonomic identification and diagnosis of both modern and archaeological capillariids.
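The Majority Voting step named in the Results of the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the tie-breaking rule and the species labels are assumptions chosen for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    Ties are broken in favour of the label that appears first in
    `predictions` (an arbitrary rule assumed for this sketch).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # Walk the predictions in order so the tie-break is deterministic.
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical predictions from three classifiers for a single egg record.
votes = {"LMT": "species_A", "J48": "species_A", "REPTree": "species_B"}
print(majority_vote(list(votes.values())))
```

A vote of this kind can only correct an error when most classifiers get the record right; if several classifiers misclassify the same record in the same way, the combined result repeats the error.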
dc.title.pt_BR.fl_str_mv Machine learning approach to support taxonomic species discrimination based on helminth collections data
title Machine learning approach to support taxonomic species discrimination based on helminth collections data
spellingShingle Machine learning approach to support taxonomic species discrimination based on helminth collections data
Borba, Victor Hugo
Taxonomia
Inteligência Artificial
Identificação de espécies
Capillaridae
Ovos parasitas
Taxonomy
Artificial intelligence
Species identification
Capillaridae
Parasite eggs
title_short Machine learning approach to support taxonomic species discrimination based on helminth collections data
title_full Machine learning approach to support taxonomic species discrimination based on helminth collections data
title_fullStr Machine learning approach to support taxonomic species discrimination based on helminth collections data
title_full_unstemmed Machine learning approach to support taxonomic species discrimination based on helminth collections data
title_sort Machine learning approach to support taxonomic species discrimination based on helminth collections data
author Borba, Victor Hugo
author_facet Borba, Victor Hugo
Martin, Coralie
Silva, José Roberto Machado
Xavier, Samanta C. C.
Mello, Flávio L. de
Iñiguez, Alena Mayo
author_role author
author2 Martin, Coralie
Silva, José Roberto Machado
Xavier, Samanta C. C.
Mello, Flávio L. de
Iñiguez, Alena Mayo
author2_role author
author
author
author
author
dc.contributor.author.fl_str_mv Borba, Victor Hugo
Martin, Coralie
Silva, José Roberto Machado
Xavier, Samanta C. C.
Mello, Flávio L. de
Iñiguez, Alena Mayo
dc.subject.other.pt_BR.fl_str_mv Taxonomia
Inteligência Artificial
Identificação de espécies
Capillaridae
Ovos parasitas
topic Taxonomia
Inteligência Artificial
Identificação de espécies
Capillaridae
Ovos parasitas
Taxonomy
Artificial intelligence
Species identification
Capillaridae
Parasite eggs
dc.subject.en.pt_BR.fl_str_mv Taxonomy
Artificial intelligence
Species identification
Capillaridae
Parasite eggs
description Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. Laboratório de Biologia de Tripanosomatídeos. Rio de Janeiro, RJ, Brasil / Universidade do Estado do Rio de Janeiro. Faculdade de Ciências Médicas. Laboratório de Helmintologia Romero Lascasas Porto. Rio de Janeiro, RJ, Brasil.
publishDate 2021
dc.date.accessioned.fl_str_mv 2021-06-25T15:14:43Z
dc.date.available.fl_str_mv 2021-06-25T15:14:43Z
dc.date.issued.fl_str_mv 2021
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.citation.fl_str_mv BORBA, Victor Hugo et al. Machine learning approach to support taxonomic species discrimination based on helminth collections data. Parasites & Vectors, v. 14, n. 230, 15 p, 2021.
dc.identifier.uri.fl_str_mv https://www.arca.fiocruz.br/handle/icict/47869
dc.identifier.issn.pt_BR.fl_str_mv 1756-3305
dc.identifier.doi.none.fl_str_mv 10.1186/s13071-021-04721-6
identifier_str_mv BORBA, Victor Hugo et al. Machine learning approach to support taxonomic species discrimination based on helminth collections data. Parasites & Vectors, v. 14, n. 230, 15 p, 2021.
1756-3305
10.1186/s13071-021-04721-6
url https://www.arca.fiocruz.br/handle/icict/47869
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv BMC
publisher.none.fl_str_mv BMC
dc.source.none.fl_str_mv reponame:Repositório Institucional da FIOCRUZ (ARCA)
instname:Fundação Oswaldo Cruz (FIOCRUZ)
instacron:FIOCRUZ
instname_str Fundação Oswaldo Cruz (FIOCRUZ)
instacron_str FIOCRUZ
institution FIOCRUZ
reponame_str Repositório Institucional da FIOCRUZ (ARCA)
collection Repositório Institucional da FIOCRUZ (ARCA)
bitstream.url.fl_str_mv https://www.arca.fiocruz.br/bitstream/icict/47869/1/license.txt
https://www.arca.fiocruz.br/bitstream/icict/47869/2/AlenaMayoIniguez_VictorBorba_etal_IOC_2021.pdf
https://www.arca.fiocruz.br/bitstream/icict/47869/3/AlenaMayoIniguez_VictorBorba_etal_IOC_2021.pdf.txt
bitstream.checksum.fl_str_mv 5a560609d32a3863062d77ff32785d58
70735b0b75877a047aa6b58022ab5ab0
3ca6135d4759a572b347018fbdfe802b
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da FIOCRUZ (ARCA) - Fundação Oswaldo Cruz (FIOCRUZ)
repository.mail.fl_str_mv repositorio.arca@fiocruz.br
_version_ 1798324971597987840
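The bitstream.checksum values listed in this record are MD5 digests, so a downloaded bitstream can be checked against them. The following is a minimal sketch using Python's standard hashlib module; the function name and the local file name in the comment are assumptions, not part of the repository's tooling.

```python
import hashlib


def md5_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the MD5 digest of `data` equals `expected_hex`."""
    return hashlib.md5(data).hexdigest() == expected_hex.strip().lower()


# Self-contained check with a well-known digest (MD5 of empty input).
print(md5_matches(b"", "d41d8cd98f00b204e9800998ecf8427e"))

# Usage against this record (assumes license.txt was downloaded locally):
# with open("license.txt", "rb") as fh:
#     print(md5_matches(fh.read(), "5a560609d32a3863062d77ff32785d58"))
```

MD5 is adequate here for detecting accidental corruption during transfer; it should not be relied on as protection against deliberate tampering.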