Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation

Mazza, Luciene Novais

Bibliographic details
Main author:	Mazza, Luciene Novais
Publication date:	2015
Document type:	Article
Language:	por
Source title:	Calidoscópio (Online)
Full text:	https://revistas.unisinos.br/index.php/calidoscopio/article/view/cld.2015.133.13
Abstract:	This paper presents a computational tool developed to extract three-word lexical bundles and uses it to demonstrate the automatic recognition of recurring lexical items across regulatory documents. The quantitative analysis examines a specific type of document prepared by pharmaceutical companies, one directly related to regulatory matters handled by public health protection agencies. The same quantitative procedures can also be used to investigate other linguistic features across a variety of genres and document types, and they allow the researcher to identify which terms belong to a specialized domain. The study focuses on the frequency of lexical patterns in language use, particularly in business contexts, and seeks to extend Douglas Biber's work on recurrent word combinations, which relies on tools and techniques from corpus-based linguistics. The theoretical framework is Corpus Linguistics, which connects with Computational Linguistics and supports both the design of the tool for end users and the extraction of the lexical bundles. The corpus comprises English-language documents from fifteen manufacturing sites of a multinational pharmaceutical company, totaling about 110,000 words and written by different authors in different parts of the world. The investigation shows that text-internal features can be examined by extracting lexical bundles and cross-comparing documents from the same specialized domain. Keywords: lexical bundles, domain-specific corpus, linguistic-computational tool.
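
The extraction step described in the abstract can be illustrated with a minimal Python sketch. This is not the author's tool: the function names, the tokenization rule, and the `min_freq` / `min_docs` thresholds are all assumptions chosen for illustration. It counts every three-word sequence and keeps those that recur across more than one document. Note that this simplified version lets sequences run across sentence boundaries.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return word tokens (assumed rule: letters,
    with internal apostrophes or hyphens kept inside a word)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def three_word_sequences(tokens):
    """Yield every run of three consecutive tokens as one string."""
    for triple in zip(tokens, tokens[1:], tokens[2:]):
        yield " ".join(triple)


def lexical_bundles(documents, min_freq=2, min_docs=2):
    """Return {bundle: (total frequency, number of documents containing it)}.

    min_freq and min_docs are hypothetical cut-offs, not values taken
    from the study.
    """
    freq = Counter()       # occurrences over the whole corpus
    doc_range = Counter()  # number of documents in which a sequence appears
    for text in documents:
        sequences = list(three_word_sequences(tokenize(text)))
        freq.update(sequences)
        doc_range.update(set(sequences))
    return {
        seq: (freq[seq], doc_range[seq])
        for seq in freq
        if freq[seq] >= min_freq and doc_range[seq] >= min_docs
    }


# Invented sample sentences standing in for regulatory documents.
docs = [
    "The product is manufactured in accordance with the approved procedure.",
    "Each batch is tested in accordance with the specification.",
    "Cleaning is performed in accordance with the validated method.",
]
bundles = lexical_bundles(docs)
for bundle, (count, spread) in sorted(bundles.items()):
    print(f"{bundle}\t{count}\t{spread}")
```

Tracking the number of documents separately from the raw frequency keeps a sequence that is repeated many times in a single document from being treated as a recurring pattern of the whole corpus.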

Item metadata

id	Unisinos-3_7c00b2f7e28d5f1acb39b5c76ac580b0
oai_identifier_str	oai:ojs2.revistas.unisinos.br:article/9837
network_acronym_str	Unisinos-3
network_name_str	Calidoscópio (Online)
repository_id_str
dc.title.none.fl_str_mv	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation / Processamento linguístico-computacional de pacotes lexicais: um estudo de corpus na área de Regulamentação Farmacêutica
title	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation
spellingShingle	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation Mazza, Luciene Novais
title_short	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation
title_full	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation
title_fullStr	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation
title_full_unstemmed	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation
title_sort	Computational-linguistic processing of lexical bundles: A corpus-based study in the area of Pharmaceutical Regulation
author	Mazza, Luciene Novais
author_facet	Mazza, Luciene Novais
author_role	author
dc.contributor.author.fl_str_mv	Mazza, Luciene Novais
publishDate	2015
dc.date.none.fl_str_mv	2015-12-21
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://revistas.unisinos.br/index.php/calidoscopio/article/view/cld.2015.133.13
url	https://revistas.unisinos.br/index.php/calidoscopio/article/view/cld.2015.133.13
dc.language.iso.fl_str_mv	por
language	por
dc.relation.none.fl_str_mv	https://revistas.unisinos.br/index.php/calidoscopio/article/view/cld.2015.133.13/5071
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Unisinos
publisher.none.fl_str_mv	Unisinos
dc.source.none.fl_str_mv	Calidoscópio; Vol. 13 No. 3 (2015): September/December; 424-439; 2177-6202; reponame:Calidoscópio (Online); instname:Universidade do Vale do Rio dos Sinos (UNISINOS); instacron:Unisinos
instname_str	Universidade do Vale do Rio dos Sinos (UNISINOS)
instacron_str	Unisinos
institution	Unisinos
reponame_str	Calidoscópio (Online)
collection	Calidoscópio (Online)
repository.name.fl_str_mv	Calidoscópio (Online) - Universidade do Vale do Rio dos Sinos (UNISINOS)
repository.mail.fl_str_mv	cmira@unisinos.br || cmira@unisinos.br
_version_	1792203886641020928
