Manuscripts and machines: the automatic replacement of spelling variants in a Portuguese historical corpus
Main author: | Marquilhas, Rita |
---|---|
Publication date: | 2014 |
Other authors: | Hendrickx, Iris |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10451/30980 |
Abstract: | The CARDS-FLY project aims to collect and transcribe, in digital format, a diverse sample of historical personal letters from the 16th to the 20th century, creating a linguistic resource for the historical study of the Portuguese language and society. The letters were written by people from all layers of society, and their historical, social and pragmatic contexts are documented in the digital format. Here we study one particular aspect of this collection, namely its spelling variation. On the basis of this analysis, we improved a statistical spelling normalisation tool that we aim to use to automatically normalise the spelling in the full collection of digitised letters. |
id |
RCAP_cc9439e51f5811178a19afab11087d3c |
---|---|
oai_identifier_str |
oai:repositorio.ul.pt:10451/30980 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Manuscripts and machines: the automatic replacement of spelling variants in a Portuguese historical corpus |
author |
Marquilhas, Rita |
author_facet |
Marquilhas, Rita; Hendrickx, Iris |
author_role |
author |
author2 |
Hendrickx, Iris |
author2_role |
author |
dc.contributor.none.fl_str_mv |
Repositório da Universidade de Lisboa |
dc.contributor.author.fl_str_mv |
Marquilhas, Rita; Hendrickx, Iris |
dc.subject.por.fl_str_mv |
Historical linguistics; Spelling variation; Automatic normalization; Portuguese |
topic |
Historical linguistics; Spelling variation; Automatic normalization; Portuguese |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014; 2014-01-01T00:00:00Z; 2018-01-25T18:11:39Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10451/30980 |
url |
http://hdl.handle.net/10451/30980 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
R. Marquilhas & I. Hendrickx. Manuscripts and machines: the automatic replacement of spelling variants in a Portuguese historical corpus. International Journal of Humanities and Arts Computing, 8(1):65-80, 2014. Edinburgh University Press; 1753-8548; https://doi.org/10.3366/ijhac.2014.0120 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Edinburgh University Press |
publisher.none.fl_str_mv |
Edinburgh University Press |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799134389248983040 |