Uso de metáforas por aprendizes brasileiros bilíngues de inglês

Bibliographic details
Main author: Boldarine, Amanda Chiarelo
Publication date: 2024
Document type: Master's thesis
Language: Portuguese (por)
Source title: Biblioteca Digital de Teses e Dissertações da PUC_SP
Full text: https://repositorio.pucsp.br/jspui/handle/handle/42184
Abstract: Research in Learner Corpus Linguistics has traditionally focused mostly on grammatical (e.g., Rankin, 2015) or lexical (e.g., Cobb and Horst, 2015) issues. Consequently, much less is known about how students construct discourse in school tasks. This study aimed to address this gap by investigating the discourses present in texts written by Middle School and First-year High School students in Brazil. To achieve this goal, Lexical Multidimensional Analysis (Berber Sardinha and Fitzsimmons-Doolan, 2024) was employed to identify the main discourses in a corpus of 450 texts, amounting to 63,297 words. Lexical Multidimensional Analysis is an offshoot of Multidimensional Analysis (Biber, 1988) that detects the underlying discourses in corpora by applying multivariate statistical procedures to lexical data extracted from the corpus. However, little is known in the Lexical Multidimensional Analysis literature about whether metaphorical language is manifested in the dimensions, that is, the extent to which metaphorical language contributes to the dimensionality of texts. Thus, this study aimed to fill this second gap by determining whether the lexical items forming the dimensions are used metaphorically. Moreover, in the corpus-based metaphor literature, there has long been a desire to automate metaphor detection so as to enable the analysis of larger corpora, since metaphor analysis in corpora is generally done manually, which restricts the amount of data that can be analyzed. Given this limitation and the recent availability of Artificial Intelligence through chatbots such as ChatGPT, Gemini, and Llama, this study also aimed to contribute to testing the metaphor detection capabilities of ChatGPT. Eight lexical discursive dimensions were identified through Lexical Multidimensional Analysis. For example, in Dimension 1, "Abstract, theoretical, and scientific knowledge versus Family dynamics, personal resilience, and emotional growth", the discourse in the positive pole emphasizes structured research and scientific analysis, common in academic contexts, whereas the negative pole focuses on personal narratives, family relationships, and overcoming personal problems. For the manual annotation of metaphors, the MIP (Pragglejaz Group, 2007) was used, and metaphor candidates were extracted from the variable list of each dimension. At the end of the manual annotation, 245 occurrences (tokens) of candidates were annotated as metaphorical, corresponding to 37% of the candidates (types). Regarding the automatic annotation of metaphors via ChatGPT4, at best one in every four (25%) of the metaphors identified by the human analysts using the MIP protocol (Pragglejaz Group, 2007) was identified automatically through AI. In conclusion, this study contributed to understanding the discourses mobilized by students in doing the tasks and how tasks influence the use of linguistic metaphors, as well as to describing the relationship between students' age or proficiency and the incidence of metaphors. Regarding the automation of metaphor identification, the research tried out a range of prompts and charted the process of prompt development, including strategies such as restricting the context in which a linguistic metaphor occurs. However, the results suggest that current LLMs are unable to detect the majority of the metaphors in the corpus.
id PUC_SP-1_5b9dad2c7198e2ca74e95902b2c7b9f6
oai_identifier_str oai:repositorio.pucsp.br:handle/42184
network_acronym_str PUC_SP-1
network_name_str Biblioteca Digital de Teses e Dissertações da PUC_SP
repository_id_str
spelling Sardinha, Antonio Paulo Berber
http://lattes.cnpq.br/6940454346543706
http://lattes.cnpq.br/4114529886967103
Boldarine, Amanda Chiarelo
2024-07-12T14:01:48Z
2024-07-12T14:01:48Z
2024-06-18
Boldarine, Amanda Chiarelo. Uso de metáforas por aprendizes brasileiros bilíngues de inglês. 2024. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2024.
https://repositorio.pucsp.br/jspui/handle/handle/42184
Research in Learner Corpus Linguistics has traditionally focused mostly on grammatical (e.g., Rankin, 2015) or lexical (e.g., Cobb and Horst, 2015) issues. As a result, questions about how students construct discourses in school tasks have not received due attention. The present study therefore sought to help fill this gap by investigating the discourses present in school compositions written by students in Ensino Fundamental II (middle school) and the first year of Ensino Médio (high school). To that end, Lexical Multidimensional Analysis (Berber Sardinha and Fitzsimmons-Doolan, 2024) was employed to identify the main discourses in a corpus of 63,297 words. Lexical Multidimensional Analysis is an approach based on Multidimensional Analysis (Biber, 1988) that makes it possible to detect the prevailing discourses in corpora by applying multivariate statistical procedures to lexical data extracted from the corpus. In Lexical Multidimensional Analysis research, there is a gap concerning the presence of metaphorical language in the dimensions. More specifically, it is not known to what extent metaphorical language contributes to the dimensionality of texts. This research therefore sought to fill this second gap, concerning the relationship between lexical dimensions and metaphoricity. Finally, in corpus-based metaphor studies, there have long been attempts to detect metaphors automatically in order to expand the size of the corpora normally used in research, since metaphor analysis in corpora is generally done manually, which restricts the amount of data that can be analyzed. Given this restriction and the recent availability of Artificial Intelligence through chatbots such as ChatGPT, Gemini, and Llama, this study aimed to help assess the capacity for automatic metaphor identification with the help of ChatGPT. Lexical Multidimensional Analysis identified eight lexical discursive dimensions. For example, in Dimension 1, "Abstract, theoretical, and scientific knowledge versus Family dynamics, personal resilience, and emotional growth", the discourse of the positive pole emphasizes structured research and scientific analysis, common in academic contexts. The negative pole, in turn, focuses on personal narratives, family relationships, and overcoming personal problems. For the manual annotation of metaphors, the MIP (Pragglejaz Group, 2007) was used, and the metaphor candidates were extracted from the variable list of each dimension. At the end of the manual annotation, 245 occurrences (tokens) of the candidates had been annotated as metaphorical, which corresponds to 37% of the candidates (types). As for the automatic annotation of metaphors via ChatGPT4, one in every four or six of the metaphors identified by the analysts using the MIP protocol (Pragglejaz Group, 2007) was identified. In conclusion, the present study contributed to the understanding of the discourses mobilized by students in carrying out the tasks, of how the task influences the use of linguistic metaphors, and of the relationship between students' age or proficiency and the incidence of metaphors. Regarding the automation of metaphor identification, the research pointed to possible principles for prompt development, such as restricting the context in which the linguistic metaphor occurs. However, inconsistency in annotation remains the main limitation.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – CAPES
por
Pontifícia Universidade Católica de São Paulo
Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem
PUC-SP
Brasil
Faculdade de Filosofia, Comunicação, Letras e Artes
CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA
Linguística de corpus
Corpus de aprendiz
Análise multidimensional lexical
Metáfora
Inteligência artificial
Corpus linguistics
Learner corpora
Lexical multidimensional analysis
Metaphor
Artificial intelligence
Uso de metáforas por aprendizes brasileiros bilíngues de inglês
The use of metaphors by Brazilian bilingual English learners
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
info:eu-repo/semantics/openAccess
reponame:Biblioteca Digital de Teses e Dissertações da PUC_SP
instname:Pontifícia Universidade Católica de São Paulo (PUC-SP)
instacron:PUC_SP
ORIGINAL
Amanda Chiarelo Boldarine.pdf
application/pdf
1114816
https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/1/Amanda%20Chiarelo%20Boldarine.pdf
cad676dde648d1d6e9b3e762e55c277e
MD5
1
TEXT
Amanda Chiarelo Boldarine.pdf.txt
Amanda Chiarelo Boldarine.pdf.txt
Extracted text
text/plain
310916
https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/2/Amanda%20Chiarelo%20Boldarine.pdf.txt
e93403eb70bd18fb67b2382e6f3e61c0
MD5
2
THUMBNAIL
Amanda Chiarelo Boldarine.pdf.jpg
Amanda Chiarelo Boldarine.pdf.jpg
Generated Thumbnail
image/jpeg
1247
https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/3/Amanda%20Chiarelo%20Boldarine.pdf.jpg
ec3562b4f2b8fb07ecd006dd847c938d
MD5
3
handle/42184
2024-07-13 01:02:33.515
oai:repositorio.pucsp.br:handle/42184
Biblioteca Digital de Teses e Dissertações
https://sapientia.pucsp.br/
https://sapientia.pucsp.br/oai/request
bngkatende@pucsp.br||rapassi@pucsp.br
opendoar:2024-07-13T04:02:33
Biblioteca Digital de Teses e Dissertações da PUC_SP - Pontifícia Universidade Católica de São Paulo (PUC-SP)
false
dc.title.pt_BR.fl_str_mv Uso de metáforas por aprendizes brasileiros bilíngues de inglês
dc.title.alternative.en_US.fl_str_mv The use of metaphors by Brazilian bilingual English learners
title Uso de metáforas por aprendizes brasileiros bilíngues de inglês
author Boldarine, Amanda Chiarelo
author_facet Boldarine, Amanda Chiarelo
author_role author
dc.contributor.advisor1.fl_str_mv Sardinha, Antonio Paulo Berber
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/6940454346543706
dc.contributor.authorLattes.fl_str_mv http://lattes.cnpq.br/4114529886967103
dc.contributor.author.fl_str_mv Boldarine, Amanda Chiarelo
contributor_str_mv Sardinha, Antonio Paulo Berber
dc.subject.cnpq.fl_str_mv CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA
dc.subject.por.fl_str_mv Linguística de corpus
Corpus de aprendiz
Análise multidimensional lexical
Metáfora
Inteligência artificial
dc.subject.eng.fl_str_mv Corpus linguistics
Learner corpora
Lexical multidimensional analysis
Metaphor
Artificial intelligence
publishDate 2024
dc.date.accessioned.fl_str_mv 2024-07-12T14:01:48Z
dc.date.available.fl_str_mv 2024-07-12T14:01:48Z
dc.date.issued.fl_str_mv 2024-06-18
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv Boldarine, Amanda Chiarelo. Uso de metáforas por aprendizes brasileiros bilíngues de inglês. 2024. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2024.
dc.identifier.uri.fl_str_mv https://repositorio.pucsp.br/jspui/handle/handle/42184
url https://repositorio.pucsp.br/jspui/handle/handle/42184
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Pontifícia Universidade Católica de São Paulo
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem
dc.publisher.initials.fl_str_mv PUC-SP
dc.publisher.country.fl_str_mv Brasil
dc.publisher.department.fl_str_mv Faculdade de Filosofia, Comunicação, Letras e Artes
publisher.none.fl_str_mv Pontifícia Universidade Católica de São Paulo
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da PUC_SP
instname:Pontifícia Universidade Católica de São Paulo (PUC-SP)
instacron:PUC_SP
instname_str Pontifícia Universidade Católica de São Paulo (PUC-SP)
instacron_str PUC_SP
institution PUC_SP
reponame_str Biblioteca Digital de Teses e Dissertações da PUC_SP
collection Biblioteca Digital de Teses e Dissertações da PUC_SP
bitstream.url.fl_str_mv https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/1/Amanda%20Chiarelo%20Boldarine.pdf
https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/2/Amanda%20Chiarelo%20Boldarine.pdf.txt
https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/3/Amanda%20Chiarelo%20Boldarine.pdf.jpg
bitstream.checksum.fl_str_mv cad676dde648d1d6e9b3e762e55c277e
e93403eb70bd18fb67b2382e6f3e61c0
ec3562b4f2b8fb07ecd006dd847c938d
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da PUC_SP - Pontifícia Universidade Católica de São Paulo (PUC-SP)
repository.mail.fl_str_mv bngkatende@pucsp.br||rapassi@pucsp.br
_version_ 1809277908192067584
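The bitstream.checksum.fl_str_mv and bitstream.checksumAlgorithm.fl_str_mv fields in this record list one MD5 digest per bitstream. As a minimal sketch, a downloaded copy could be checked against those values as shown below. The pairing of file names with digests is an assumption inferred from the order in which the URLs and checksums are listed, and the function names md5_of_file and matches_record are hypothetical helpers, not part of any repository API.

```python
import hashlib
from pathlib import Path

# Digests copied from bitstream.checksum.fl_str_mv.
# Assumption: digests are paired with file names in listing order.
EXPECTED_MD5 = {
    "Amanda Chiarelo Boldarine.pdf": "cad676dde648d1d6e9b3e762e55c277e",
    "Amanda Chiarelo Boldarine.pdf.txt": "e93403eb70bd18fb67b2382e6f3e61c0",
    "Amanda Chiarelo Boldarine.pdf.jpg": "ec3562b4f2b8fb07ecd006dd847c938d",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """Return True if the file's MD5 equals the digest listed for its name."""
    expected = EXPECTED_MD5.get(Path(path).name)
    return expected is not None and md5_of_file(path) == expected
```

Calling matches_record on a local copy saved under its original file name returns False if the name is not listed or the digest differs, which would suggest an incomplete or altered download.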