Manuscripts and machines: the automatic replacement of spelling variants in a Portuguese historical corpus

Bibliographic details
Main author: Marquilhas, Rita
Publication date: 2014
Other authors: Hendrickx, Iris
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10451/30980
Abstract: The CARDS-FLY project aims to collect and transcribe, in a digital format, a diverse sample of historical personal letters from the 16th to the 20th century, creating a linguistic resource for the historical study of the Portuguese language and society. The letters were written by people from all layers of society, and their historical, social and pragmatic contexts are documented in the digital format. Here we study one particular aspect of this collection, namely the spelling variation. Furthermore, on the basis of this analysis, we improved a statistical spelling normalisation tool that we aim to use to automatically normalise the spelling in the full collection of digitised letters.
id RCAP_cc9439e51f5811178a19afab11087d3c
oai_identifier_str oai:repositorio.ul.pt:10451/30980
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
author Marquilhas, Rita
author_facet Marquilhas, Rita
Hendrickx, Iris
author_role author
author2 Hendrickx, Iris
author2_role author
dc.contributor.none.fl_str_mv Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Marquilhas, Rita
Hendrickx, Iris
dc.subject.por.fl_str_mv Historical linguistics
Spelling variation
Automatic normalization
Portuguese
publishDate 2014
dc.date.none.fl_str_mv 2014
2014-01-01T00:00:00Z
2018-01-25T18:11:39Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10451/30980
url http://hdl.handle.net/10451/30980
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv R. Marquilhas & I. Hendrickx. Manuscripts and machines: the automatic replacement of spelling variants in a Portuguese historical corpus. International Journal of Humanities and Arts Computing, 8(1):65-80, 2014. Edinburgh University Press
1753-8548
https://doi.org/10.3366/ijhac.2014.0120
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Edinburgh University Press
publisher.none.fl_str_mv Edinburgh University Press
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
_version_ 1799134389248983040