The C-ORAL-Brasil project for Brazilian Portuguese spoken corpora

Bibliographic details
Main author: Lucia de Almeida Ferrari
Publication date: 2020
Other authors: Giulia Bossaglia
Document type: Article
Language: eng
Source title: Repositório Institucional da UFMG
Full text: https://doi.org/10.7203/caplletra.69.17269
http://hdl.handle.net/1843/57812
https://orcid.org/0000-0002-9855-0646
https://orcid.org/0000-0001-8839-3088
Abstract: In this paper we present a specific subset of the spoken corpora of the C-ORAL family, namely the C-ORAL-BRASIL corpora of spontaneous Brazilian Portuguese (BP). Originating as the non-European branch of the C-ORAL-ROM project (Cresti & Moneglia 2005), the C-ORAL-BRASIL project has compiled third-generation corpora of spoken BP, which stand out not only as BP-specific corpora but also as a model tool for the study of spoken language in general, thanks in part to several methodological and technological improvements. Besides the resources for the study of spoken BP, we present a set of minicorpora compiled for specific studies on information structure (including in languages other than BP), together with other ongoing compilation efforts within the C-ORAL-BRASIL research group. All published resources are available for download at <www.c-oral-brasil.org>.
id UFMG_3154c2377b6c7e29c379fe0cc8e589ad
oai_identifier_str oai:repositorio.ufmg.br:1843/57812
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
dc.title.pt_BR.fl_str_mv The C-ORAL-Brasil project for Brazilian Portuguese spoken corpora
dc.title.alternative.pt_BR.fl_str_mv El projecte C-ORAL-Brasil per a corpus orals del portuguès brasiler
title The C-ORAL-Brasil project for Brazilian Portuguese spoken corpora
author Lucia de Almeida Ferrari
author_facet Lucia de Almeida Ferrari
Giulia Bossaglia
author_role author
author2 Giulia Bossaglia
author2_role author
dc.contributor.author.fl_str_mv Lucia de Almeida Ferrari
Giulia Bossaglia
dc.subject.por.fl_str_mv C-ORAL-BRASIL project
Spoken corpora
Compilation best practices
Brazilian Portuguese
topic C-ORAL-BRASIL project
Spoken corpora
Compilation best practices
Brazilian Portuguese
Análise linguística
Linguística de corpus
dc.subject.other.pt_BR.fl_str_mv Análise linguística
Linguística de corpus
publishDate 2020
dc.date.issued.fl_str_mv 2020
dc.date.accessioned.fl_str_mv 2023-08-14T20:42:52Z
dc.date.available.fl_str_mv 2023-08-14T20:42:52Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1843/57812
dc.identifier.doi.pt_BR.fl_str_mv https://doi.org/10.7203/caplletra.69.17269
dc.identifier.issn.pt_BR.fl_str_mv 23867159
dc.identifier.orcid.pt_BR.fl_str_mv https://orcid.org/0000-0002-9855-0646
https://orcid.org/0000-0001-8839-3088
url https://doi.org/10.7203/caplletra.69.17269
http://hdl.handle.net/1843/57812
https://orcid.org/0000-0002-9855-0646
https://orcid.org/0000-0001-8839-3088
identifier_str_mv 23867159
dc.language.iso.fl_str_mv eng
language eng
dc.relation.ispartof.pt_BR.fl_str_mv Caplletra- Revista Internacional de Filologia
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.publisher.initials.fl_str_mv UFMG
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv FALE - FACULDADE DE LETRAS
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
bitstream.url.fl_str_mv https://repositorio.ufmg.br/bitstream/1843/57812/2/The%20c-oral-brasil%20project%20for%20brazilian%20portuguese%20spoken%20corpora.pdf
https://repositorio.ufmg.br/bitstream/1843/57812/1/License.txt
bitstream.checksum.fl_str_mv 2a48e7abbeb16ea74cade058326abbc6
fa505098d172de0bc8864fc1287ffe22
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv
_version_ 1803589387287527424