On building a tool for finding datasets based on a list of researchers or publications
Main author: | Carvalho-Segundo, Washington |
---|---|
Publication date: | 2021 |
Other authors: | M. R. Dias, Thiago |
Document type: | Conference paper |
Language: | eng |
Source title: | Repositório Institucional do IBICT - RIDI |
Full text: | http://ridi.ibict.br/handle/123456789/1264 |
Abstract: | This proposal presents a tool, developed in Python, for finding datasets related to a list of researchers or publications. The tool was applied to a list of articles that a specific group of researchers had declared in their CVs. The target group was chosen based on the highest level that these researchers had obtained in a research productivity grant (1A). As a result, from a list of 1,227 researchers and more than 225 thousand deduplicated publications, it was possible to find 12,030 related datasets, where the most frequent access type is OPEN and the five most frequent related areas of research are Zoology, Chemistry, Genetics, Physics, and Agronomy. The proposed tool will be applied to facilitate populating the research data repository of the national funding agency in Brazil, but it can also be used in other, more general contexts, extracting information from open databases such as ORCID and Wikidata. |
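
The abstract mentions working from "deduplicated publications" but does not say how the deduplication was done. The following is only a minimal sketch of one common approach, not the authors' implementation: it assumes each publication is a dictionary with hypothetical `doi` and `title` keys, treats a normalized DOI as the primary key, and falls back to a normalized title when no DOI is present.

```python
import re
import unicodedata


def normalize_doi(doi):
    """Lower-case a DOI and strip common resolver prefixes; return None if empty."""
    if not doi:
        return None
    doi = doi.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)
    return doi or None


def normalize_title(title):
    """Fold accents, case, and punctuation so near-identical titles compare equal."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def deduplicate(publications):
    """Keep the first record seen for each DOI, or for each title when no DOI exists."""
    seen = set()
    unique = []
    for pub in publications:
        doi = normalize_doi(pub.get("doi"))
        # Assumption: records without a DOI are matched on title alone.
        key = ("doi", doi) if doi else ("title", normalize_title(pub.get("title", "")))
        if key not in seen:
            seen.add(key)
            unique.append(pub)
    return unique
```

A limitation of this sketch is that a record carrying a DOI and a second record with the same title but no DOI are kept as two entries; merging those would need an extra title-based pass.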
id |
IBICT_b526d0bdcd30f4fac521a69b51cd727d |
---|---|
oai_identifier_str |
oai:ridi.ibict.br:123456789/1264 |
network_acronym_str |
IBICT |
network_name_str |
Repositório Institucional do IBICT - RIDI |
repository_id_str |
2404 |
spelling |
Submitted by Washington Segundo (washingtonsegundo@ibict.br) on 2023-11-16T15:15:57Z. Approved for entry into archive by Washington Segundo (washingtonsegundo@ibict.br) on 2023-11-16T15:16:16Z (GMT). Made available in DSpace on 2023-11-16T15:16:16Z (GMT). No. of bitstreams: 1. OR2021_A_tool_for_finding_datasets_based.pdf: 222838 bytes, checksum: 6f79fa8ea2221dbd35d40c6c772bba3a (MD5). Previous issue date: 2021-06. |
dc.title.pt_BR.fl_str_mv |
On building a tool for finding datasets based on a list of researchers or publications |
title |
On building a tool for finding datasets based on a list of researchers or publications |
author |
Carvalho-Segundo, Washington |
author_facet |
Carvalho-Segundo, Washington; M. R. Dias, Thiago |
author_role |
author |
author2 |
M. R. Dias, Thiago |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Carvalho-Segundo, Washington; M. R. Dias, Thiago |
dc.subject.cnpq.fl_str_mv |
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO |
topic |
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO; Open Science; Scientific Data Repositories; Scientific Publications; Open Data |
dc.subject.por.fl_str_mv |
Open Science; Scientific Data Repositories; Scientific Publications; Open Data |
description |
This proposal presents a tool, developed in Python, for finding datasets related to a list of researchers or publications. The tool was applied to a list of articles that a specific group of researchers had declared in their CVs. The target group was chosen based on the highest level that these researchers had obtained in a research productivity grant (1A). As a result, from a list of 1,227 researchers and more than 225 thousand deduplicated publications, it was possible to find 12,030 related datasets, where the most frequent access type is OPEN and the five most frequent related areas of research are Zoology, Chemistry, Genetics, Physics, and Agronomy. The proposed tool will be applied to facilitate populating the research data repository of the national funding agency in Brazil, but it can also be used in other, more general contexts, extracting information from open databases such as ORCID and Wikidata. |
publishDate |
2021 |
dc.date.available.fl_str_mv |
2021-06; 2023-11-16T15:16:16Z |
dc.date.issued.fl_str_mv |
2021-06 |
dc.date.accessioned.fl_str_mv |
2023-11-16T15:16:16Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/conferenceObject |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://ridi.ibict.br/handle/123456789/1264 |
url |
http://ridi.ibict.br/handle/123456789/1264 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.ispartof.pt_BR.fl_str_mv |
International Open Repositories Conference |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Instituto Brasileiro de Informação em Ciência e Tecnologia |
dc.publisher.initials.fl_str_mv |
IBICT |
dc.publisher.country.fl_str_mv |
Brazil |
publisher.none.fl_str_mv |
Instituto Brasileiro de Informação em Ciência e Tecnologia |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional do IBICT - RIDI; instname:Instituto Brasileiro de Informação em Ciência e Tecnologia (Ibict); instacron:IBICT |
instname_str |
Instituto Brasileiro de Informação em Ciência e Tecnologia (Ibict) |
instacron_str |
IBICT |
institution |
IBICT |
reponame_str |
Repositório Institucional do IBICT - RIDI |
collection |
Repositório Institucional do IBICT - RIDI |
bitstream.url.fl_str_mv |
https://ridi.ibict.br/bitstream/123456789/1264/3/OR2021_A_tool_for_finding_datasets_based.pdf.txt; https://ridi.ibict.br/bitstream/123456789/1264/2/license.txt; https://ridi.ibict.br/bitstream/123456789/1264/1/OR2021_A_tool_for_finding_datasets_based.pdf |
bitstream.checksum.fl_str_mv |
f15eb776caf6166bdb3f130d8d8fc1d8; 6b42f084aa6b52acc41c67281d72287f; 6f79fa8ea2221dbd35d40c6c772bba3a |
bitstream.checksumAlgorithm.fl_str_mv |
MD5; MD5; MD5 |
repository.name.fl_str_mv |
Repositório Institucional do IBICT - RIDI - Instituto Brasileiro de Informação em Ciência e Tecnologia (Ibict) |
repository.mail.fl_str_mv |
rd@ibict.br |
_version_ |
1823424642655715328 |
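
The record lists an MD5 checksum for each bitstream, so a downloaded copy can be checked against it. The sketch below shows one way to do that with Python's standard `hashlib` module; the function names and the local file path are illustrative assumptions and are not part of the repository's own tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with the value published in the record."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, after saving the PDF locally under a name of your choosing, `matches_checksum(local_path, "6f79fa8ea2221dbd35d40c6c772bba3a")` should return `True` if the file is intact. MD5 is adequate for detecting accidental corruption but is not a defense against deliberate tampering.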