The Reference Corpus of Contemporary Portuguese and related resources
Main author: | Nascimento, Maria Fernanda Bacelar do |
---|---|
Publication date: | 2014 |
Other authors: | Mendes, Amália; Antunes, Sandra; Pereira, Luísa |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10451/30786 |
Abstract: | The extraordinary growth of computer applications, particularly over the last two decades, has enabled the easy compilation and exploration of large corpora and lexica. These linguistic resources play a fundamental role in theoretical linguistics and natural language engineering. Combining these two areas of knowledge can result in a large number of applications, such as new and straightforward descriptions of languages based on real data; contrastive studies between varieties of a particular language, aimed at finding factors of unity and diversity; cross-linguistic contrastive studies; grammars; lexica and dictionaries; terminologies; assisted translation materials; language teaching materials; and computer tools and applications for processing natural language. With this principle in mind, and following the tradition at the Centre of Linguistics of the University of Lisbon (CLUL) of collecting and studying real language data, a large electronic corpus, the Corpus de Referência do Português Contemporâneo (Reference Corpus of Contemporary Portuguese, CRPC), has been under compilation at CLUL since 1988. The CRPC currently contains approximately 310 million words, searchable through a user-friendly interface, and it is envisaged as a monitor corpus (from which balanced subcorpora can be extracted) that can serve as a sample of the Portuguese language in both its written and spoken varieties. In the following sections, we describe the CRPC and how it forms the basis for important resources developed at CLUL. |
id |
RCAP_a1fc4256d6fba8084bd11f35f6c9fe7c |
---|---|
oai_identifier_str |
oai:repositorio.ul.pt:10451/30786 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
The Reference Corpus of Contemporary Portuguese and related resources |
author |
Nascimento, Maria Fernanda Bacelar do |
author_facet |
Nascimento, Maria Fernanda Bacelar do Mendes, Amália Antunes, Sandra Pereira, Luísa |
author_role |
author |
author2 |
Mendes, Amália Antunes, Sandra Pereira, Luísa |
author2_role |
author author author |
dc.contributor.none.fl_str_mv |
Repositório da Universidade de Lisboa |
dc.contributor.author.fl_str_mv |
Nascimento, Maria Fernanda Bacelar do Mendes, Amália Antunes, Sandra Pereira, Luísa |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014 2014-01-01T00:00:00Z 2018-01-22T10:10:48Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10451/30786 |
url |
http://hdl.handle.net/10451/30786 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Bacelar do Nascimento, Maria Fernanda, Amália Mendes, Sandra Antunes, Luísa Pereira (2014) “The Reference Corpus of Contemporary Portuguese and related resources”, in Berber Sardinha, Tony and Telma Ferreira (eds.) Working with Portuguese Corpora. London: Bloomsbury Publishing |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Bloomsbury Publishing |
publisher.none.fl_str_mv |
Bloomsbury Publishing |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
_version_ |
1799134387601670144 |