BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language
Main author: | Consoli, Bernardo |
---|---|
Publication date: | 2022 |
Other authors: | Dias, Henrique; Vieira, Renata; Bordini, Rafael; Ulbrich, Ana |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10174/32260 |
Abstract: | Computational medicine research requires clinical data for training and testing purposes, so the development of datasets composed of real hospital data is of utmost importance in this field. Most such data collections are in the English language, were collected in anglophone countries, and do not reflect other clinical realities, which increases the importance of national datasets for projects that hope to positively impact public health. This paper presents a new Brazilian Clinical Dataset containing over 70,000 admissions from 10 hospitals in two Brazilian states, composed of a sum total of over 2.5 million free-text clinical notes alongside data pertaining to patient information, prescription information, and exam results. This data was collected, organized, deidentified, and is being distributed via credentialed access for the use of the research community. In the course of presenting the new dataset, this paper will explore the new dataset’s structure, population, and potential benefits of using this dataset in clinical AI tasks. |
id |
RCAP_a24668274a873ebec1f8c69e35df6ae5 |
---|---|
oai_identifier_str |
oai:dspace.uevora.pt:10174/32260 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
funding |
FCT UIDB/00057/2020, CEECIND/01997/2017 |
author |
Consoli, Bernardo |
author2 |
Dias, Henrique; Vieira, Renata; Bordini, Rafael; Ulbrich, Ana |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-07-05T11:06:34Z 2022-07-05 2022-06-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10174/32260 |
url |
http://hdl.handle.net/10174/32260 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Consoli, B., Dias, H., Ulbrich, A., Vieira, R., Bordini, R. (2022). BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pages 5609–5616, Marseille, 20-25 June 2022. © European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.602.pdf |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
LREC |
publisher.none.fl_str_mv |
LREC |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |