Machine learning approaches outperform distance- and tree-based methods for DNA barcoding of Pterocarpus wood
Main author: | He, Tuo |
---|---|
Publication date: | 2019 |
Other authors: | Jiao, Lichao; Wiedenhoeft, Alex C.; Yin, Yafang |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.1007/s00425-019-03116-3 http://hdl.handle.net/11449/184449 |
Abstract: | Main conclusion: Machine-learning approaches (MLAs) for DNA barcoding outperform distance- and tree-based methods in identification accuracy and cost-effectiveness for species-level identification of wood. DNA barcoding is a promising tool to combat illegal logging and associated trade, and the development of reliable and efficient analytical methods is essential for its extensive application in the trade of wood and in the forensics of natural materials more broadly. In this study, 120 DNA sequences of four barcodes (ITS2, matK, ndhF-rpl32, and rbcL) generated in our previous study and 85 downloaded from the National Center for Biotechnology Information (NCBI) were collected to establish a reference data set for six commercial Pterocarpus woods. MLAs (BLOG, BP-neural network, SMO, and J48) were compared with distance-based (TaxonDNA) and tree-based (NJ tree) methods on identification accuracy and cost-effectiveness across these six species, and were also applied to discriminate the CITES-listed species Pterocarpus santalinus from the anatomically similar P. tinctorius for forensic identification. MLAs provided higher identification accuracy (30.8-100%) than distance-based (15.1-97.4%) and tree-based methods (11.1-87.5%), with SMO performing best among the machine-learning classifiers. The two-locus combination ITS2 + matK with the SMO classifier exhibited the highest resolution (100%) with the fewest barcodes for discriminating the six Pterocarpus species. The CITES-listed species P. santalinus was discriminated successfully from P. tinctorius using MLAs with a single barcode, ndhF-rpl32. This study shows that MLAs provide higher identification accuracy and cost-effectiveness for forensic application than other analytical methods in DNA barcoding of Pterocarpus wood. |
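The abstract compares methods by identification accuracy. As a rough illustration of how such a figure could be computed for a distance-based approach, the sketch below scores a simple best-match rule with leave-one-out testing. It is not the study's pipeline and not TaxonDNA's implementation: the function names, the tie-handling rule, and the toy aligned sequences are all hypothetical and were chosen only to make the example self-contained.

```python
from typing import List, Optional, Tuple

Record = Tuple[str, str]  # (species label, aligned sequence); hypothetical layout


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap.
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 1.0
    return sum(x != y for x, y in compared) / len(compared)


def best_match(query: str, references: List[Record]) -> Optional[str]:
    """Species of the closest reference, or None if the closest hits span several species."""
    scored = [(p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in scored)
    species = {s for d, s in scored if d == best}
    return species.pop() if len(species) == 1 else None


def leave_one_out_accuracy(records: List[Record]) -> float:
    """Fraction of records identified correctly when each is queried against the rest."""
    correct = 0
    for i, (species, seq) in enumerate(records):
        references = records[:i] + records[i + 1:]
        if best_match(seq, references) == species:
            correct += 1
    return correct / len(records)


# Toy, made-up aligned sequences (NOT data from the study).
TOY_RECORDS: List[Record] = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACGGAC"),
    ("species_B", "ACGAACGGAT"),
    ("species_C", "TCGTACCTAC"),
    ("species_C", "TCGTACCTAC"),
]

if __name__ == "__main__":
    print(f"leave-one-out accuracy: {leave_one_out_accuracy(TOY_RECORDS):.2f}")
```

A classifier-based approach would replace `best_match` with a trained model while keeping the same accuracy loop, which is what makes the two families of methods directly comparable.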
id | UNSP_13dbd55c6bf27f89574bb9876815ca8a |
---|---|
oai_identifier_str | oai:repositorio.unesp.br:11449/184449 |
network_acronym_str | UNSP |
network_name_str | Repositório Institucional da UNESP |
repository_id_str | 2946 |
spelling | DNA barcoding; Forensic wood identification; Identification accuracy; Machine learning approaches (MLAs); Pterocarpus; SMO classifier; National Natural Science Foundation of China; National High-level Talent for Special Support Program of China; China Scholarship Council; Chinese Acad Forestry, Chinese Res Inst Wood Ind, Dept Wood Anat & Utilizat, Beijing 100091, Peoples R China; Chinese Acad Forestry, Wood Collect WOODPEDIA, Beijing 100091, Peoples R China; US Forest Serv, Forest Prod Lab, Ctr Wood Anat Res, USDA, Madison, WI 53726 USA; Univ Wisconsin, Dept Bot, Madison, WI 53706 USA; Purdue Univ, Dept Forestry & Natl Resources, W Lafayette, IN 47907 USA; Univ Estadual Paulista, Ciencias Biol Bot, Botucatu, SP, Brazil; National Natural Science Foundation of China: 31600451; National High-level Talent for Special Support Program of China: W02020331; China Scholarship Council: 2017-3109; Springer; Chinese Acad Forestry; US Forest Serv; Univ Wisconsin; Purdue Univ; Universidade Estadual Paulista (Unesp); He, Tuo; Jiao, Lichao; Wiedenhoeft, Alex C.; Yin, Yafang; 2019-10-04T12:13:41Z; 2019-05-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; 1617-1625; http://dx.doi.org/10.1007/s00425-019-03116-3; Planta. New York: Springer, v. 249, n. 5, p. 1617-1625, 2019.; 0032-0935; http://hdl.handle.net/11449/184449; 10.1007/s00425-019-03116-3; WOS:000464898700025; Web of Science; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Planta; info:eu-repo/semantics/openAccess; 2021-10-23T16:09:06Z; oai:repositorio.unesp.br:11449/184449; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T13:54:46.197809; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
author | He, Tuo |
author_facet | He, Tuo; Jiao, Lichao; Wiedenhoeft, Alex C.; Yin, Yafang |
author_role | author |
author2 | Jiao, Lichao; Wiedenhoeft, Alex C.; Yin, Yafang |
author2_role | author; author; author |
dc.contributor.none.fl_str_mv | Chinese Acad Forestry; US Forest Serv; Univ Wisconsin; Purdue Univ; Universidade Estadual Paulista (Unesp) |
dc.contributor.author.fl_str_mv | He, Tuo; Jiao, Lichao; Wiedenhoeft, Alex C.; Yin, Yafang |
dc.subject.por.fl_str_mv | DNA barcoding; Forensic wood identification; Identification accuracy; Machine learning approaches (MLAs); Pterocarpus; SMO classifier |
topic | DNA barcoding; Forensic wood identification; Identification accuracy; Machine learning approaches (MLAs); Pterocarpus; SMO classifier |
publishDate | 2019 |
dc.date.none.fl_str_mv | 2019-10-04T12:13:41Z; 2019-10-04T12:13:41Z; 2019-05-01 |
dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv | info:eu-repo/semantics/article |
format | article |
status_str | publishedVersion |
dc.identifier.uri.fl_str_mv | http://dx.doi.org/10.1007/s00425-019-03116-3; Planta. New York: Springer, v. 249, n. 5, p. 1617-1625, 2019.; 0032-0935; http://hdl.handle.net/11449/184449; 10.1007/s00425-019-03116-3; WOS:000464898700025 |
url | http://dx.doi.org/10.1007/s00425-019-03116-3; http://hdl.handle.net/11449/184449 |
identifier_str_mv | Planta. New York: Springer, v. 249, n. 5, p. 1617-1625, 2019.; 0032-0935; 10.1007/s00425-019-03116-3; WOS:000464898700025 |
dc.language.iso.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | Planta |
dc.rights.driver.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | 1617-1625 |
dc.publisher.none.fl_str_mv | Springer |
publisher.none.fl_str_mv | Springer |
dc.source.none.fl_str_mv | Web of Science; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str | Universidade Estadual Paulista (UNESP) |
instacron_str | UNESP |
institution | UNESP |
reponame_str | Repositório Institucional da UNESP |
collection | Repositório Institucional da UNESP |
repository.name.fl_str_mv | Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv | |
_version_ | 1808128289262272512 |