Um processo para a geração de recursos lingüísticos aplicáveis em ferramentas de auxílio à escrita científica [A process for generating linguistic resources applicable to scientific writing support tools]

Marquiafável, Vanessa Silva

Bibliographic details
Main author:	Marquiafável, Vanessa Silva
Publication date:	2007
Document type:	Master's dissertation
Language:	Portuguese (por)
Source title:	Repositório Institucional da UFSCAR
Full text:	https://repositorio.ufscar.br/handle/ufscar/5647
Abstract:	Within the context of academic research, English is the lingua franca for various scientific disciplines. It is also widely acknowledged that producing an acceptable academic text is anything but a simple and easy task. The difficulty is particularly acute if the author is a novice researcher and English is not his/her first language. One possible solution to minimize this difficulty is the use of writing tools to assist novice researchers during different stages of the writing process. This could involve, for instance, quick and easy access to a collection of authentic linguistic resources extracted from published scientific papers. AMADEUS (Amiable Article Development for User Support) and SciPo (Scientific Portuguese) are good examples of this type of writing tool. AMADEUS is a resource designed to help non-native English users write academic texts. It focuses specifically on the fields of Physics and Computer Science. SciPo is a Web critiquing system for writing theses in Portuguese and focuses on the discipline of Computer Science. A variation of SciPo is SciPo-Farmácia, a web-based tool that assists non-native speakers of English in writing scientific papers in the field of Pharmaceutical Sciences. The main purpose of this dissertation is to elaborate a semi-automatic process to generate the English linguistic resources required by writing support tools such as the ones mentioned above. The primary aim is to enable researchers from various disciplines to develop their own writing support tool, customized to their specific field, with no need to refer to linguists, computer scientists and/or academic writing specialists for help. The semi-automatic process proposed here has been designed to include the knowledge which would be provided by these specialists. The main methodology adopted in this research derives from the discipline of Corpus Linguistics (we have used both corpus-based and corpus-driven approaches). This choice relies on the assumption that the success of such tools is strongly related to the corpus from which users collect well-written text extracts so that they can be recycled and reused in the text being produced. The semi-automatic process was evaluated in two ways: i) clarity and completeness of the manuals describing the linguistic resources and ii) quality of the linguistic resources generated and estimated time for developing all the necessary linguistic resources. To measure quality in the two evaluation stages, we used the Kappa statistic. The results ranged from k=0.72 to k=1.0. These figures can be interpreted as indicating a good understanding of the tasks described in the manuals evaluated. The present research proves relevant in a number of aspects. It opens up the possibility of generating a computational tool to assist non-native English speakers in writing academic texts in any experimental field, by using the knowledge from the semi-automatic process only. It also promotes the use of writing support tools as a didactic resource for teaching-learning scientific English and the use of metrics to evaluate rhetorical structure models. Last but not least, it produces a rhetorically annotated corpus which may be used for teaching-learning purposes or in natural language processing.
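As a reading aid for the Kappa values reported in the abstract, here is a minimal sketch of how an agreement score of this kind can be computed. The abstract does not say which Kappa variant was used, so the two-annotator Cohen's kappa below is an assumption, and the function name, label names and data are hypothetical, invented purely for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical rhetorical labels for ten sentences (not data from the thesis).
annotator_1 = ["background", "gap", "purpose", "method", "result",
               "result", "gap", "purpose", "method", "background"]
annotator_2 = ["background", "gap", "purpose", "method", "result",
               "result", "gap", "purpose", "method", "gap"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

In this invented example the annotators agree on nine of ten items (observed agreement 0.9) and the chance agreement is 0.2, so kappa is (0.9 - 0.2) / (1 - 0.2). A value of 1.0 means perfect agreement, and 0 means agreement no better than chance.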

Item metadata

id	SCAR_0f5c9bdfd9bd2194dd84509f300939a5
oai_identifier_str	oai:repositorio.ufscar.br:ufscar/5647
network_acronym_str	SCAR
network_name_str	Repositório Institucional da UFSCAR
repository_id_str	4322
spelling	Marquiafável, Vanessa Silva | Aluisio, Sandra Maria | http://lattes.cnpq.br/4793072701914550 | http://lattes.cnpq.br/7084417567970607 | a9e1b556-7d86-4174-9552-41516db1afbe | 2016-06-02T20:24:59Z | 2007-05-14 | 2016-06-02T20:24:59Z | 2007-03-29 | MARQUIAFÁVEL, Vanessa Silva. Um processo para a geração de recursos lingüísticos aplicáveis em ferramentas de auxílio à escrita científica. 2007. 276 f. Dissertação (Mestrado em Ciências Humanas) - Universidade Federal de São Carlos, São Carlos, 2007. | https://repositorio.ufscar.br/handle/ufscar/5647 | Abstract (translated from the Portuguese): In today's academic environment, English has been adopted as the lingua franca of science across a wide range of fields. However, producing an adequate scientific text, in this case the scientific article, is known to be difficult, especially when the writer is a novice and is not a native speaker of English. One alternative is to use computational tools that support the different stages of writing a scientific text and that are built on authentic linguistic material collected from published scientific articles and indexed for quick access. Three such tools stand out: AMADEUS (Amiable Article Development for User Support), which supports the writing of scientific articles in English in Physics and Computer Science; SciPo, inspired by AMADEUS, which supports the writing of theses and dissertations in Portuguese in Computer Science; and SciPo-Farmácia, which supports the writing of scientific articles in English in Pharmaceutical Sciences. The main objective of this research project was to formalize a process for building English linguistic resources to be used in scientific writing support tools similar to SciPo-Farmácia. The main methodology derived from Corpus Linguistics (we used both the corpus-driven and the corpus-based approach), because the effectiveness of these tools, according to the experience reported by their developers, is strongly linked to their having a corpus of texts from the writer-researcher's own research area, from which well-written passages are reused when writing a new article. The proposed process was evaluated in two stages: i) the clarity and completeness of the manuals for building linguistic resources, and ii) the quality of the linguistic resources produced, together with an estimate of the time spent building the linguistic resources described by these modules. The Kappa statistic was chosen to measure the quality of the material produced in both stages, and it yielded values between k=0.72 and k=1.0. These good results can be attributed to the understanding of the content of the manuals used to evaluate the tasks in the proposed process. The contributions of this research include: the possibility of building linguistic resources to generate a tool supporting scientific writing in English for various fields focused on experimental research, using only the information contained in the proposed process; support for disseminating, via the Web, computational writing support tools as a didactic resource for teaching and learning scientific English; the dissemination of metrics for evaluating proposed schematic structure models; and the availability of rhetorically annotated corpora for use in natural language processing tools or in teaching. | Financiadora de Estudos e Projetos | application/pdf | por | Universidade Federal de São Carlos | Programa de Pós-Graduação em Linguística - PPGL | UFSCar | BR | Lingüística processamento de dados | Lingüística de corpus | Língua inglesa - ensino | Ferramenta de apoio à escrita científica | Gênero textual | LINGUISTICA, LETRAS E ARTES::LINGUISTICA | Um processo para a geração de recursos lingüísticos aplicáveis em ferramentas de auxílio à escrita científica | info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/masterThesis | -1 | -1 | d495e8f0-bd4c-4b31-a4e1-23d149ea3fc9 | info:eu-repo/semantics/openAccess | reponame:Repositório Institucional da UFSCAR | instname:Universidade Federal de São Carlos (UFSCAR) | instacron:UFSCAR | ORIGINAL | DissVSM.pdf | application/pdf | 4419849 | https://repositorio.ufscar.br/bitstream/ufscar/5647/1/DissVSM.pdf | a1cef53968c5829a753427d39a957525 | MD5 | 1 | THUMBNAIL | DissVSM.pdf.jpg | DissVSM.pdf.jpg | IM Thumbnail | image/jpeg | 12028 | https://repositorio.ufscar.br/bitstream/ufscar/5647/2/DissVSM.pdf.jpg | 379401698080467433a235d447081911 | MD5 | 2 | ufscar/5647 | 2023-09-18 18:31:08.6 | oai:repositorio.ufscar.br:ufscar/5647 | Repositório Institucional | PUB | https://repositorio.ufscar.br/oai/request | opendoar:4322 | 2023-09-18T18:31:08 | Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR) | false
dc.title.por.fl_str_mv	Um processo para a geração de recursos lingüísticos aplicáveis em ferramentas de auxílio à escrita científica
author	Marquiafável, Vanessa Silva
author_facet	Marquiafável, Vanessa Silva
author_role	author
dc.contributor.authorlattes.por.fl_str_mv	http://lattes.cnpq.br/7084417567970607
dc.contributor.author.fl_str_mv	Marquiafável, Vanessa Silva
dc.contributor.advisor1.fl_str_mv	Aluisio, Sandra Maria
dc.contributor.advisor1Lattes.fl_str_mv	http://lattes.cnpq.br/4793072701914550
dc.contributor.authorID.fl_str_mv	a9e1b556-7d86-4174-9552-41516db1afbe
contributor_str_mv	Aluisio, Sandra Maria
dc.subject.por.fl_str_mv	Lingüística processamento de dados; Lingüística de corpus; Língua inglesa - ensino; Ferramenta de apoio à escrita científica; Gênero textual
topic	Lingüística processamento de dados; Lingüística de corpus; Língua inglesa - ensino; Ferramenta de apoio à escrita científica; Gênero textual; LINGUISTICA, LETRAS E ARTES::LINGUISTICA
dc.subject.cnpq.fl_str_mv	LINGUISTICA, LETRAS E ARTES::LINGUISTICA
publishDate	2007
dc.date.available.fl_str_mv	2007-05-14 2016-06-02T20:24:59Z
dc.date.issued.fl_str_mv	2007-03-29
dc.date.accessioned.fl_str_mv	2016-06-02T20:24:59Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	MARQUIAFÁVEL, Vanessa Silva. Um processo para a geração de recursos lingüísticos aplicáveis em ferramentas de auxílio à escrita científica. 2007. 276 f. Dissertação (Mestrado em Ciências Humanas) - Universidade Federal de São Carlos, São Carlos, 2007.
dc.identifier.uri.fl_str_mv	https://repositorio.ufscar.br/handle/ufscar/5647
identifier_str_mv	MARQUIAFÁVEL, Vanessa Silva. Um processo para a geração de recursos lingüísticos aplicáveis em ferramentas de auxílio à escrita científica. 2007. 276 f. Dissertação (Mestrado em Ciências Humanas) - Universidade Federal de São Carlos, São Carlos, 2007.
url	https://repositorio.ufscar.br/handle/ufscar/5647
dc.language.iso.fl_str_mv	por
language	por
dc.relation.confidence.fl_str_mv	-1 -1
dc.relation.authority.fl_str_mv	d495e8f0-bd4c-4b31-a4e1-23d149ea3fc9
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidade Federal de São Carlos
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Linguística - PPGL
dc.publisher.initials.fl_str_mv	UFSCar
dc.publisher.country.fl_str_mv	BR
publisher.none.fl_str_mv	Universidade Federal de São Carlos
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UFSCAR instname:Universidade Federal de São Carlos (UFSCAR) instacron:UFSCAR
instname_str	Universidade Federal de São Carlos (UFSCAR)
instacron_str	UFSCAR
institution	UFSCAR
reponame_str	Repositório Institucional da UFSCAR
collection	Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv	https://repositorio.ufscar.br/bitstream/ufscar/5647/1/DissVSM.pdf https://repositorio.ufscar.br/bitstream/ufscar/5647/2/DissVSM.pdf.jpg
bitstream.checksum.fl_str_mv	a1cef53968c5829a753427d39a957525 379401698080467433a235d447081911
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5
repository.name.fl_str_mv	Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
_version_	1813715545473679360
