The nature of scientific datasets in South American repositories: a survey of formats and extensions
Main author: | Rodrigues, Marcello Mundim |
---|---|
Publication date: | 2022 |
Other authors: | Lourenco, Cintia de Azevedo; Dias, Guilherme Ataide |
Document type: | Article |
Language: | por |
Source title: | Repositório Institucional da UNESP |
Full text: | http://dx.doi.org/10.5007/1518-2924.2022.e85148 http://hdl.handle.net/11449/237571 |
Abstract: | Objective: to identify the scientific data repositories created and managed by Higher Education Institutions and/or South American research and funding agencies, and to identify and describe the formats and extensions of the files that make up the scientific datasets deposited in these repositories. Methods: eight repositories retrieved from RE3DATA were selected for investigation, yielding a population (N) of 1,115 scientific datasets. Stratified random sampling produced a sample (n) of 258 datasets, which corresponds to 23.15% of the population (N). Data surveyed from the sample were condensed into tables and charts. Results: the scientific datasets investigated are centered on textual and numerical data, saved in text files and tables, respectively. The datasets may be either homogeneous (one or more files saved in a single format and extension, e.g. images in .jpg) or heterogeneous (files saved in different formats and extensions, [...] content of the data, as observed in the .gpx and .gdb extensions, which refer to geospatial data and, therefore, alphanumeric data. Conclusions: there is a growing need to describe the nature of the data, as well as the formats and extensions of the files. This kind of descriptive metadata would be valuable to potential users, as it would give them a better understanding of the context of the data and so support data reuse. |
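The homogeneous/heterogeneous distinction drawn in the abstract can be illustrated with a minimal Python sketch. It assumes a dataset is described only by its list of file names and that "homogeneous" means exactly one distinct extension. The function names and the sample file lists are hypothetical and are not taken from the study or from the surveyed repositories.

```python
from pathlib import PurePath


def file_extensions(filenames):
    """Return the set of lowercase extensions found in a dataset's file list."""
    return {PurePath(name).suffix.lower() for name in filenames}


def classify_dataset(filenames):
    """Label a dataset by whether all of its files share one extension.

    Assumption: "homogeneous" means exactly one distinct extension and
    "heterogeneous" means more than one, following the abstract's wording.
    """
    if not filenames:
        raise ValueError("a dataset needs at least one file")
    distinct = file_extensions(filenames)
    return "homogeneous" if len(distinct) == 1 else "heterogeneous"


# Hypothetical file lists, not taken from the surveyed repositories.
images_only = ["plot_01.jpg", "plot_02.JPG"]
mixed = ["readme.txt", "measurements.csv", "route.gpx"]

print(classify_dataset(images_only))
print(classify_dataset(mixed))
```

Comparing extensions case-insensitively is a design choice of this sketch. As the abstract notes for .gpx and .gdb, an extension alone does not necessarily reveal the nature of the data a file holds.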
id |
UNSP_344b8409046d931f027109c67f727fa8 |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/237571 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
The nature of scientific datasets in South American repositories: a survey of formats and extensions; Scientific data; Datasets; Data repositories; Formats and extensions; Survey; Univ Fed Minas Gerais, Doutorando Gestao & Org Conhecimento, Belo Horizonte, MG, Brazil; Univ Fed Minas Gerais, Ciencia Informacao, Belo Horizonte, MG, Brazil; Univ Fed Minas Gerais, Escola Ciencia Informacao, Belo Horizonte, MG, Brazil; Univ Fed Paraiba, Dept Ciencia Informacao, Joao Pessoa, Paraiba, Brazil; Univ Estadual Paulista, Ciencia Informacao, Sao Paulo, Brazil; Univ Estadual Paulista, Ciencia Informacao, Sao Paulo, Brazil; Univ Federal Santa Catarina; Universidade Federal de Minas Gerais (UFMG); Univ Fed Paraiba; Universidade Estadual Paulista (UNESP); Rodrigues, Marcello Mundim; Lourenco, Cintia de Azevedo; Dias, Guilherme Ataide [UNESP]; 2022-11-30T13:38:57Z; 2022-11-30T13:38:57Z; 2022-01-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; 26; http://dx.doi.org/10.5007/1518-2924.2022.e85148; Encontros Bibli-revista Eletronica De Biblioteconomia E Ciencia Da Informacao. Florianopolis: Univ Federal Santa Catarina, v. 27, 26 p., 2022.; 1518-2924; http://hdl.handle.net/11449/237571; 10.5007/1518-2924.2022.e85148; WOS:000804414500004; Web of Science; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; por; Encontros Bibli-revista Eletronica De Biblioteconomia E Ciencia Da Informacao; info:eu-repo/semantics/openAccess; 2022-11-30T13:38:57Z; oai:repositorio.unesp.br:11449/237571; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; opendoar:2946; 2024-08-05T19:51:57.626580; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
The nature of scientific datasets in South American repositories: a survey of formats and extensions |
title |
The nature of scientific datasets in South American repositories: a survey of formats and extensions |
spellingShingle |
The nature of scientific datasets in South American repositories: a survey of formats and extensions; Rodrigues, Marcello Mundim; Scientific data; Datasets; Data repositories; Formats and extensions; Survey |
author |
Rodrigues, Marcello Mundim |
author_facet |
Rodrigues, Marcello Mundim; Lourenco, Cintia de Azevedo; Dias, Guilherme Ataide [UNESP] |
author_role |
author |
author2 |
Lourenco, Cintia de Azevedo; Dias, Guilherme Ataide [UNESP] |
author2_role |
author; author |
dc.contributor.none.fl_str_mv |
Universidade Federal de Minas Gerais (UFMG); Univ Fed Paraiba; Universidade Estadual Paulista (UNESP) |
dc.contributor.author.fl_str_mv |
Rodrigues, Marcello Mundim; Lourenco, Cintia de Azevedo; Dias, Guilherme Ataide [UNESP] |
dc.subject.por.fl_str_mv |
Scientific data; Datasets; Data repositories; Formats and extensions; Survey |
topic |
Scientific data; Datasets; Data repositories; Formats and extensions; Survey |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-11-30T13:38:57Z; 2022-11-30T13:38:57Z; 2022-01-01 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.5007/1518-2924.2022.e85148; Encontros Bibli-revista Eletronica De Biblioteconomia E Ciencia Da Informacao. Florianopolis: Univ Federal Santa Catarina, v. 27, 26 p., 2022.; 1518-2924; http://hdl.handle.net/11449/237571; 10.5007/1518-2924.2022.e85148; WOS:000804414500004 |
url |
http://dx.doi.org/10.5007/1518-2924.2022.e85148; http://hdl.handle.net/11449/237571 |
identifier_str_mv |
Encontros Bibli-revista Eletronica De Biblioteconomia E Ciencia Da Informacao. Florianopolis: Univ Federal Santa Catarina, v. 27, 26 p., 2022.; 1518-2924; 10.5007/1518-2924.2022.e85148; WOS:000804414500004 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
Encontros Bibli-revista Eletronica De Biblioteconomia E Ciencia Da Informacao |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
26 |
dc.publisher.none.fl_str_mv |
Univ Federal Santa Catarina |
publisher.none.fl_str_mv |
Univ Federal Santa Catarina |
dc.source.none.fl_str_mv |
Web of Science; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
|
_version_ |
1808129130455105536 |
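
The record above carries an OAI identifier and an OAI request endpoint, so a harvester could in principle ask for this single record with an OAI-PMH `GetRecord` request. The Python sketch below only builds and re-parses the request URL and makes no network call. It assumes the endpoint listed in the record is still reachable and serves the `oai_dc` metadata prefix, which the OAI-PMH specification requires of every repository; the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Values copied from this record; whether the endpoint is still live is not verified.
OAI_BASE_URL = "http://repositorio.unesp.br/oai/request"
RECORD_ID = "oai:repositorio.unesp.br:11449/237571"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


def request_arguments(url):
    """Decode the query string of a request URL back into a plain dict."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


url = get_record_url(OAI_BASE_URL, RECORD_ID)
print(url)
print(request_arguments(url))
```

`urlencode` percent-encodes the colons and the slash in the identifier, which keeps the query string unambiguous; `request_arguments` shows that the original values are recovered on decoding.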