Uso de metáforas por aprendizes brasileiros bilíngues de inglês
Main author: | Boldarine, Amanda Chiarelo |
---|---|
Publication date: | 2024 |
Document type: | Master's thesis |
Language: | por |
Source title: | Biblioteca Digital de Teses e Dissertações da PUC_SP |
Full text: | https://repositorio.pucsp.br/jspui/handle/handle/42184 |
Abstract: | Research in Learner Corpus Linguistics has traditionally focused mostly on grammatical (e.g., Rankin, 2015) or lexical (e.g., Cobb and Horst, 2015) issues. Consequently, much less is known about how students construct discourse in school tasks. This study aimed to address this gap by investigating the discourses present in written texts by Middle School and First-year High School students in Brazil. To achieve this goal, Lexical Multidimensional Analysis (Berber Sardinha and Fitzsimmons Doolan, 2024) was employed to identify the main discourses in a corpus of 450 texts, amounting to 63,297 words. Lexical Multidimensional Analysis is an offshoot of Multidimensional Analysis (Biber, 1988) that detects the underlying discourses in corpora by applying multivariate statistical procedures to lexical data extracted from the corpus. However, in the Lexical Multidimensional Analysis literature, little is known about whether metaphorical language is manifested in the dimensions, that is, the extent to which metaphorical language contributes to the dimensionality of texts. Thus, this study aimed to fill this second gap by determining whether the lexical items forming the dimensions are used metaphorically. Moreover, in the corpus-based metaphor literature, there has long been a desire to automate metaphor detection so as to enable the analysis of larger corpora, as metaphor analysis in corpora is generally done manually, which restricts the amount of data that can be analyzed. Given this limitation and the recent availability of Artificial Intelligence through chatbots like ChatGPT, Gemini, and Llama, this study aimed to test the metaphor detection capabilities of ChatGPT. Eight lexical discursive dimensions were identified through Lexical Multidimensional Analysis.
For example, in Dimension 1, "Abstract, theoretical, and scientific knowledge versus Family dynamics, personal resilience, and emotional growth", the discourse in the positive pole emphasizes structured research and scientific analysis, common in academic contexts. The negative pole, on the other hand, focuses on personal narratives, family relationships, and overcoming personal problems. For the manual annotation of metaphors, the Metaphor Identification Procedure (MIP; Pragglejaz Group, 2007) was used, and metaphor candidates were extracted from the variable list of each dimension. At the end of the manual annotation, 245 occurrences (tokens) of candidates were annotated as metaphorical, corresponding to 37% of the candidates (types). Regarding the automatic annotation of metaphors via ChatGPT4, at best one in every four (25%) of the metaphors identified by the human analysts using the MIP protocol (Pragglejaz Group, 2007) was identified automatically through AI. In conclusion, this study contributed to understanding the discourses mobilized by students in doing the tasks and how tasks influence the use of linguistic metaphors, as well as to describing the relationship between students' age or proficiency and the incidence of metaphors. Regarding the automation of metaphor identification, the research tried out a range of prompts and charted the process of prompt development, including strategies such as restricting the context in which the linguistic metaphor occurs. However, the results suggest that current LLMs are unable to detect the majority of the metaphors in the corpus. |
id |
PUC_SP-1_5b9dad2c7198e2ca74e95902b2c7b9f6 |
---|---|
oai_identifier_str |
oai:repositorio.pucsp.br:handle/42184 |
network_acronym_str |
PUC_SP-1 |
network_name_str |
Biblioteca Digital de Teses e Dissertações da PUC_SP |
repository_id_str |
|
spelling |
Sardinha, Antonio Paulo Berber; http://lattes.cnpq.br/6940454346543706; http://lattes.cnpq.br/4114529886967103; Boldarine, Amanda Chiarelo; 2024-07-12T14:01:48Z; 2024-07-12T14:01:48Z; 2024-06-18; Boldarine, Amanda Chiarelo. Uso de metáforas por aprendizes brasileiros bilíngues de inglês. 2024. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2024.; https://repositorio.pucsp.br/jspui/handle/handle/42184
Abstract (translated from the Portuguese): Research in Learner Corpus Linguistics has traditionally focused mostly on grammatical (e.g., Rankin, 2015) or lexical (e.g., Cobb and Horst, 2015) issues. As a result, questions about how students construct discourse in school tasks have not received due attention. The present study therefore sought to help fill this gap by investigating the discourses present in school compositions by Middle School and First-year High School students. To that end, Lexical Multidimensional Analysis (Berber Sardinha and Fitzsimmons Doolan, 2024) was used to identify the main discourses in a corpus of 63,297 words. Lexical Multidimensional Analysis is an approach based on Multidimensional Analysis (Biber, 1988) that detects the prevailing discourses in corpora by applying multivariate statistical procedures to lexical data extracted from the corpus. In Lexical Multidimensional Analysis research, there is a gap concerning the presence of metaphorical language in the dimensions. More specifically, it is not known to what extent metaphorical language contributes to the dimensionality of texts. This research thus sought to fill this second gap, concerning the relationship between lexical dimensions and metaphoricity. Finally, corpus-based metaphor studies have long attempted to detect metaphors automatically in order to expand the size of the corpora normally used in research, since metaphor analysis in corpora is generally done manually, which restricts the amount of data that can be analyzed. Given this restriction and the recent availability of Artificial Intelligence through chatbots such as ChatGPT, Gemini, and Llama, this study aimed to contribute to assessing the capacity for automatic metaphor identification with the help of ChatGPT. Lexical Multidimensional Analysis identified eight lexical discursive dimensions. For example, in Dimension 1, "Abstract, theoretical, and scientific knowledge versus Family dynamics, personal resilience, and emotional growth", the discourse of the positive pole emphasizes structured research and scientific analysis, common in academic contexts, whereas the negative pole focused on personal narratives, family relationships, and overcoming personal problems. For the manual annotation of metaphors, the MIP (Pragglejaz Group, 2007) was used, and metaphor candidates were extracted from the variable list of each dimension. At the end of the manual annotation, 245 occurrences (tokens) of the candidates were annotated as metaphorical, corresponding to 37% of the candidates (types). As for the automatic annotation of metaphors via ChatGPT4, one in every four or six of the metaphors identified by the analysts using the MIP protocol (Pragglejaz Group, 2007) was identified. In conclusion, the present study contributed to understanding the discourses mobilized by students in carrying out the tasks, how the task influences the use of linguistic metaphors, and the relationship between students' age or proficiency and the incidence of metaphors. Regarding the automation of metaphor identification, the research pointed to possible principles for prompt development, such as restricting the context in which the linguistic metaphor occurs. However, inconsistency in annotation remains the main limitation.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – CAPES; por; Pontifícia Universidade Católica de São Paulo; Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem; PUC-SP; Brasil; Faculdade de Filosofia, Comunicação, Letras e Artes; CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA; Linguística de corpus; Corpus de aprendiz; Análise multidimensional lexical; Metáfora; Inteligência artificial; Corpus linguistics; Learner corpora; Lexical multidimensional analysis; Metaphor; Artificial intelligence; Uso de metáforas por aprendizes brasileiros bilíngues de inglês; The use of metaphors by Brazilian bilingual English learners; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; info:eu-repo/semantics/openAccess; reponame:Biblioteca Digital de Teses e Dissertações da PUC_SP; instname:Pontifícia Universidade Católica de São Paulo (PUC-SP); instacron:PUC_SP
ORIGINAL; Amanda Chiarelo Boldarine.pdf; application/pdf; 1114816; https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/1/Amanda%20Chiarelo%20Boldarine.pdf; cad676dde648d1d6e9b3e762e55c277e; MD5; 1
TEXT; Amanda Chiarelo Boldarine.pdf.txt; Extracted text; text/plain; 310916; https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/2/Amanda%20Chiarelo%20Boldarine.pdf.txt; e93403eb70bd18fb67b2382e6f3e61c0; MD5; 2
THUMBNAIL; Amanda Chiarelo Boldarine.pdf.jpg; Generated Thumbnail; image/jpeg; 1247; https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/3/Amanda%20Chiarelo%20Boldarine.pdf.jpg; ec3562b4f2b8fb07ecd006dd847c938d; MD5; 3
handle/42184; 2024-07-13 01:02:33.515; oai:repositorio.pucsp.br:handle/42184; Biblioteca Digital de Teses e Dissertações; https://sapientia.pucsp.br/; https://sapientia.pucsp.br/oai/request; bngkatende@pucsp.br||rapassi@pucsp.br; opendoar:2024-07-13T04:02:33; Biblioteca Digital de Teses e Dissertações da PUC_SP - Pontifícia Universidade Católica de São Paulo (PUC-SP); false |
dc.title.pt_BR.fl_str_mv |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
dc.title.alternative.en_US.fl_str_mv |
The use of metaphors by Brazilian bilingual English learners |
title |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
spellingShingle |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês; Boldarine, Amanda Chiarelo; CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA; Linguística de corpus; Corpus de aprendiz; Análise multidimensional lexical; Metáfora; Inteligência artificial; Corpus linguistics; Learner corpora; Lexical multidimensional analysis; Metaphor; Artificial intelligence |
title_short |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
title_full |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
title_fullStr |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
title_full_unstemmed |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
title_sort |
Uso de metáforas por aprendizes brasileiros bilíngues de inglês |
author |
Boldarine, Amanda Chiarelo |
author_facet |
Boldarine, Amanda Chiarelo |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Sardinha, Antonio Paulo Berber |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/6940454346543706 |
dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/4114529886967103 |
dc.contributor.author.fl_str_mv |
Boldarine, Amanda Chiarelo |
contributor_str_mv |
Sardinha, Antonio Paulo Berber |
dc.subject.cnpq.fl_str_mv |
CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA |
topic |
CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA; Linguística de corpus; Corpus de aprendiz; Análise multidimensional lexical; Metáfora; Inteligência artificial; Corpus linguistics; Learner corpora; Lexical multidimensional analysis; Metaphor; Artificial intelligence |
dc.subject.por.fl_str_mv |
Linguística de corpus; Corpus de aprendiz; Análise multidimensional lexical; Metáfora; Inteligência artificial |
dc.subject.eng.fl_str_mv |
Corpus linguistics; Learner corpora; Lexical multidimensional analysis; Metaphor; Artificial intelligence |
description |
Research in Learner Corpus Linguistics has traditionally focused mostly on grammatical (e.g., Rankin, 2015) or lexical (e.g., Cobb and Horst, 2015) issues. Consequently, much less is known about how students construct discourse in school tasks. This study aimed to address this gap by investigating the discourses present in written texts by Middle School and First-year High School students in Brazil. To achieve this goal, Lexical Multidimensional Analysis (Berber Sardinha and Fitzsimmons Doolan, 2024) was employed to identify the main discourses in a corpus of 450 texts, amounting to 63,297 words. Lexical Multidimensional Analysis is an offshoot of Multidimensional Analysis (Biber, 1988) that detects the underlying discourses in corpora by applying multivariate statistical procedures to lexical data extracted from the corpus. However, in the Lexical Multidimensional Analysis literature, little is known about whether metaphorical language is manifested in the dimensions, that is, the extent to which metaphorical language contributes to the dimensionality of texts. Thus, this study aimed to fill this second gap by determining whether the lexical items forming the dimensions are used metaphorically. Moreover, in the corpus-based metaphor literature, there has long been a desire to automate metaphor detection so as to enable the analysis of larger corpora, as metaphor analysis in corpora is generally done manually, which restricts the amount of data that can be analyzed. Given this limitation and the recent availability of Artificial Intelligence through chatbots like ChatGPT, Gemini, and Llama, this study aimed to test the metaphor detection capabilities of ChatGPT. Eight lexical discursive dimensions were identified through Lexical Multidimensional Analysis.
For example, in Dimension 1, "Abstract, theoretical, and scientific knowledge versus Family dynamics, personal resilience, and emotional growth", the discourse in the positive pole emphasizes structured research and scientific analysis, common in academic contexts. The negative pole, on the other hand, focuses on personal narratives, family relationships, and overcoming personal problems. For the manual annotation of metaphors, the Metaphor Identification Procedure (MIP; Pragglejaz Group, 2007) was used, and metaphor candidates were extracted from the variable list of each dimension. At the end of the manual annotation, 245 occurrences (tokens) of candidates were annotated as metaphorical, corresponding to 37% of the candidates (types). Regarding the automatic annotation of metaphors via ChatGPT4, at best one in every four (25%) of the metaphors identified by the human analysts using the MIP protocol (Pragglejaz Group, 2007) was identified automatically through AI. In conclusion, this study contributed to understanding the discourses mobilized by students in doing the tasks and how tasks influence the use of linguistic metaphors, as well as to describing the relationship between students' age or proficiency and the incidence of metaphors. Regarding the automation of metaphor identification, the research tried out a range of prompts and charted the process of prompt development, including strategies such as restricting the context in which the linguistic metaphor occurs. However, the results suggest that current LLMs are unable to detect the majority of the metaphors in the corpus. |
publishDate |
2024 |
dc.date.accessioned.fl_str_mv |
2024-07-12T14:01:48Z |
dc.date.available.fl_str_mv |
2024-07-12T14:01:48Z |
dc.date.issued.fl_str_mv |
2024-06-18 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
Boldarine, Amanda Chiarelo. Uso de metáforas por aprendizes brasileiros bilíngues de inglês. 2024. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2024. |
dc.identifier.uri.fl_str_mv |
https://repositorio.pucsp.br/jspui/handle/handle/42184 |
identifier_str_mv |
Boldarine, Amanda Chiarelo. Uso de metáforas por aprendizes brasileiros bilíngues de inglês. 2024. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2024. |
url |
https://repositorio.pucsp.br/jspui/handle/handle/42184 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Pontifícia Universidade Católica de São Paulo |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Linguística Aplicada e Estudos da Linguagem |
dc.publisher.initials.fl_str_mv |
PUC-SP |
dc.publisher.country.fl_str_mv |
Brasil |
dc.publisher.department.fl_str_mv |
Faculdade de Filosofia, Comunicação, Letras e Artes |
publisher.none.fl_str_mv |
Pontifícia Universidade Católica de São Paulo |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da PUC_SP; instname:Pontifícia Universidade Católica de São Paulo (PUC-SP); instacron:PUC_SP |
instname_str |
Pontifícia Universidade Católica de São Paulo (PUC-SP) |
instacron_str |
PUC_SP |
institution |
PUC_SP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da PUC_SP |
collection |
Biblioteca Digital de Teses e Dissertações da PUC_SP |
bitstream.url.fl_str_mv |
https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/1/Amanda%20Chiarelo%20Boldarine.pdf https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/2/Amanda%20Chiarelo%20Boldarine.pdf.txt https://repositorio.pucsp.br/xmlui/bitstream/handle/42184/3/Amanda%20Chiarelo%20Boldarine.pdf.jpg |
bitstream.checksum.fl_str_mv |
cad676dde648d1d6e9b3e762e55c277e e93403eb70bd18fb67b2382e6f3e61c0 ec3562b4f2b8fb07ecd006dd847c938d |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da PUC_SP - Pontifícia Universidade Católica de São Paulo (PUC-SP) |
repository.mail.fl_str_mv |
bngkatende@pucsp.br||rapassi@pucsp.br |
_version_ |
1809277908192067584 |