Introducing the Reference Corpus of Contemporary Portuguese On-Line
Main author: | Généreux, Michel |
---|---|
Publication date: | 2012 |
Other authors: | Hendrickx, Iris; Mendes, Amália |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10451/37429 |
Abstract: | We present our work in processing a Portuguese corpus and its publication online. After discussing how the corpus was built and our choice of meta-data, we turn to the processes and tools involved for the cleaning, preparation and annotation to make the corpus suitable for linguistic inquiries. The Web platform is described, and we show examples of linguistic resources that can be extracted from the platform for use in linguistic studies or in NLP. |
id |
RCAP_35fc0828b3bb872dbbb1a57d5322aaee |
---|---|
oai_identifier_str |
oai:repositorio.ul.pt:10451/37429 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Introducing the Reference Corpus of Contemporary Portuguese On-Line We present our work in processing a Portuguese corpus and its publication online. After discussing how the corpus was built and our choice of meta-data, we turn to the processes and tools involved for the cleaning, preparation and annotation to make the corpus suitable for linguistic inquiries. The Web platform is described, and we show examples of linguistic resources that can be extracted from the platform for use in linguistic studies or in NLP. European Language Resources Association Repositório da Universidade de Lisboa Généreux, Michel Hendrickx, Iris Mendes, Amália 2019-03-11T13:47:12Z 2012 2012-01-01T00:00:00Z info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/article application/pdf http://hdl.handle.net/10451/37429 eng Généreux, M., Hendrickx, I. & Mendes, A. (2012): "Introducing the Reference Corpus of Contemporary Portuguese On-Line", in Proceedings of the Eighth International Conference on Language Resources and Evaluation - LREC 2012, Istanbul, May 21-27, 2012, pp. 2237-2244. 978-2-9517408-7-7 info:eu-repo/semantics/openAccess reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP 2023-11-08T16:34:30Z oai:repositorio.ul.pt:10451/37429 Portal Agregador ONG https://www.rcaap.pt/oai/openaire opendoar:7160 2024-03-19T21:51:25.889506 Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação false |
dc.title.none.fl_str_mv |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
title |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
spellingShingle |
Introducing the Reference Corpus of Contemporary Portuguese On-Line Généreux, Michel |
title_short |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
title_full |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
title_fullStr |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
title_full_unstemmed |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
title_sort |
Introducing the Reference Corpus of Contemporary Portuguese On-Line |
author |
Généreux, Michel |
author_facet |
Généreux, Michel Hendrickx, Iris Mendes, Amália |
author_role |
author |
author2 |
Hendrickx, Iris Mendes, Amália |
author2_role |
author author |
dc.contributor.none.fl_str_mv |
Repositório da Universidade de Lisboa |
dc.contributor.author.fl_str_mv |
Généreux, Michel Hendrickx, Iris Mendes, Amália |
description |
We present our work in processing a Portuguese corpus and its publication online. After discussing how the corpus was built and our choice of meta-data, we turn to the processes and tools involved for the cleaning, preparation and annotation to make the corpus suitable for linguistic inquiries. The Web platform is described, and we show examples of linguistic resources that can be extracted from the platform for use in linguistic studies or in NLP. |
publishDate |
2012 |
dc.date.none.fl_str_mv |
2012 2012-01-01T00:00:00Z 2019-03-11T13:47:12Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10451/37429 |
url |
http://hdl.handle.net/10451/37429 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Généreux, M., Hendrickx, I. & Mendes, A. (2012): "Introducing the Reference Corpus of Contemporary Portuguese On-Line", in Proceedings of the Eighth International Conference on Language Resources and Evaluation - LREC 2012, Istanbul, May 21-27, 2012, pp. 2237-2244. 978-2-9517408-7-7 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
European Language Resources Association |
publisher.none.fl_str_mv |
European Language Resources Association |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134450099945472 |