As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus
Main author: | Souza, Rafael Webster Ferreira de |
---|---|
Publication date: | 2020 |
Document type: | Master's thesis |
Language: | por |
Source title: | Biblioteca Digital de Teses e Dissertações da PUC_SP |
Full text: | https://repositorio.pucsp.br/jspui/handle/handle/24088 |
Abstract: | Language serves as a means of social indexing: it varies between groups of speakers and allows social groups to distinguish insiders from outsiders, that is, individuals who belong to a particular group from those who do not (HALL, 2016; BAKER, 2010). This study has three aims: (a) to identify the subject matters, themes, or discourses produced by black and white speakers of Brazilian Portuguese, (b) to determine which of these subject matters, themes, or discourses are similar or different between black and white speakers, and (c) to analyze whether the speakers’ ethnic-racial group can be predicted from the lexical variables used in the texts. These goals were pursued from a Corpus Linguistics perspective (HUNSTON; ESIMAJE, 2019; BAKER; EGBERT, 2016; BIBER, 2009; BERBER SARDINHA, 2004). To that end, a corpus was collected to represent situations of language use involving black and white speakers of Brazilian Portuguese. The corpus, named the Corpus de Registros Étnico-Raciais (CRER, ‘Corpus of Ethnic-Racial Registers’), comprises 788 texts sampling language in use from different registers (popular songs, vlogs, and oral life histories). The texts in each register (and subregister) were balanced between black and white speakers. Once the corpus was built, the files were morphosyntactically tagged, and the words tagged as nouns, adjectives, verbs, and adverbs were selected from each text. The counts for those words were then normalized per 1,000 words. The resulting lexical variables were analyzed using two methods: Lexical Multidimensional Analysis (BERBER SARDINHA, 2014) and Discriminant Function Analysis (BERBER SARDINHA; VEIRANO PINTO, 2014, 2019; NORRIS, 2015). In the former, dimensions of lexical variation were statistically identified and communicatively interpreted to find the subject matters, themes, or discourses. The statistical comparisons showed large differences across registers on the dimensions, indicating that register is a powerful predictor of variation in language use, which corroborates previous multidimensional studies (BIBER, 1988; BERBER SARDINHA; VEIRANO PINTO, 2014, 2019). At the same time, the dimensions did not reveal variation between black and white speakers. In the latter, the selected lexis was entered into a Discriminant Function Analysis, which returned sets of words considered predictors of each ethnic group. The analysis predicted group membership with high accuracy for both groups (88.13% for black speakers and 72.29% for white speakers). The predictor words of each group were also interpreted in connection with the underlying subject matters, themes, and discourses, which surprisingly revealed subtle yet undeniable traces of discrimination, racism, and bias toward black people. In short, the two methods produced different outcomes: while the dimensions of lexical variation did not reveal differences between black and white speakers, the Discriminant Analysis pointed to clear differences between the two groups. We therefore conclude that the speakers’ ethnic group can be identified through the lexis employed in their subject matters, themes, and discourses and, unexpectedly, that there is hidden racial discrimination and prejudice in the topics black and white Brazilians talk about every day. |
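
The normalization step described in the abstract (content-word counts expressed per 1,000 words) can be sketched as follows. This is only a minimal illustration and not the author's pipeline: the tag names in `CONTENT_TAGS`, the function name `normalized_counts`, the sample sentence, and the choice to divide by the total number of tokens in each text are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical tag labels; the tagset actually used in the study is not given in this record.
CONTENT_TAGS = {"NOUN", "ADJ", "VERB", "ADV"}


def normalized_counts(tagged_tokens, per=1000):
    """Return the rate of each content word per `per` running words of one text.

    `tagged_tokens` is a list of (word, tag) pairs for a single text.
    Assumption: the denominator is the total token count of the text.
    """
    total = len(tagged_tokens)
    if total == 0:
        return {}
    # Count only words whose tag marks them as nouns, adjectives, verbs, or adverbs.
    counts = Counter(
        word.lower() for word, tag in tagged_tokens if tag in CONTENT_TAGS
    )
    return {word: n * per / total for word, n in counts.items()}


# Toy tagged text (invented for illustration, 8 tokens).
sample = [
    ("a", "DET"), ("casa", "NOUN"), ("é", "VERB"), ("bonita", "ADJ"),
    ("e", "CONJ"), ("a", "DET"), ("casa", "NOUN"), ("brilha", "VERB"),
]
rates = normalized_counts(sample)
```

Expressing counts as rates makes texts of different lengths comparable, which matters when song lyrics, vlog transcripts, and oral life histories are entered into the same statistical analysis.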
id |
PUC_SP-1_add2387fb0abc8a9e9058bb494fb01f7 |
---|---|
oai_identifier_str |
oai:repositorio.pucsp.br:handle/24088 |
network_acronym_str |
PUC_SP-1 |
network_name_str |
Biblioteca Digital de Teses e Dissertações da PUC_SP |
repository_id_str |
|
spelling |
Sardinha, Antonio Paulo Berber; http://lattes.cnpq.br/6940454346543706; Souza, Rafael Webster Ferreira de; 2021-11-26T19:59:30Z; 2021-11-26T19:59:30Z; 2020-03-12; Souza, Rafael Webster Ferreira de. As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus. 2020. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Estudos Pós-Graduados em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2020.; https://repositorio.pucsp.br/jspui/handle/handle/24088; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – CAPES; por; Pontifícia Universidade Católica de São Paulo; Programa de Estudos Pós-Graduados em Linguística Aplicada e Estudos da Linguagem; PUC-SP; Brasil; Faculdade de Filosofia, Comunicação, Letras e Artes; CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA; Discriminação racial; Negros; Brancos; Linguística de corpus; Grupos étnicos; Linguagem e línguas - Variação; Race discrimination; Blacks; Whites; Corpora (Linguistics); Ethnic groups; Language and languages - Variation; As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus; The lexical marks of the ethnic-racial discrimination between blacks and whites: a Corpus Linguistics study; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; info:eu-repo/semantics/openAccess; reponame:Biblioteca Digital de Teses e Dissertações da PUC_SP; instname:Pontifícia Universidade Católica de São Paulo (PUC-SP); instacron:PUC_SP; ORIGINAL; Rafael Webster F. Souza.pdf; application/pdf; 2574481; https://repositorio.pucsp.br/xmlui/bitstream/handle/24088/1/Rafael%20Webster%20F.%20Souza.pdf; 660fbaa108be4aa99432be973ace1a2e; MD5; 1; TEXT; Rafael Webster F. Souza.pdf.txt; Rafael Webster F. Souza.pdf.txt; Extracted text; text/plain; 360128; https://repositorio.pucsp.br/xmlui/bitstream/handle/24088/2/Rafael%20Webster%20F.%20Souza.pdf.txt; 16fb9d324d05dc425a8e3294c6b31c7d; MD5; 2; THUMBNAIL; Rafael Webster F. Souza.pdf.jpg; Rafael Webster F. Souza.pdf.jpg; Generated Thumbnail; image/jpeg; 1253; https://repositorio.pucsp.br/xmlui/bitstream/handle/24088/3/Rafael%20Webster%20F.%20Souza.pdf.jpg; f580bc456d722ad67fbfa33ba53547ce; MD5; 3; handle/24088; 2021-11-29 12:10:46.382; oai:repositorio.pucsp.br:handle/24088; Biblioteca Digital de Teses e Dissertações; https://sapientia.pucsp.br/; https://sapientia.pucsp.br/oai/request; bngkatende@pucsp.br||rapassi@pucsp.br; opendoar:2021-11-29T15:10:46; Biblioteca Digital de Teses e Dissertações da PUC_SP - Pontifícia Universidade Católica de São Paulo (PUC-SP); false |
dc.title.pt_BR.fl_str_mv |
As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus |
dc.title.alternative.en_US.fl_str_mv |
The lexical marks of the ethnic-racial discrimination between blacks and whites: a Corpus Linguistics study |
title |
As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus |
author |
Souza, Rafael Webster Ferreira de |
author_facet |
Souza, Rafael Webster Ferreira de |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Sardinha, Antonio Paulo Berber |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/6940454346543706 |
dc.contributor.author.fl_str_mv |
Souza, Rafael Webster Ferreira de |
contributor_str_mv |
Sardinha, Antonio Paulo Berber |
dc.subject.cnpq.fl_str_mv |
CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA |
topic |
CNPQ::LINGUISTICA, LETRAS E ARTES::LINGUISTICA::LINGUISTICA APLICADA; Discriminação racial; Negros; Brancos; Linguística de corpus; Grupos étnicos; Linguagem e línguas - Variação; Race discrimination; Blacks; Whites; Corpora (Linguistics); Ethnic groups; Language and languages - Variation |
dc.subject.por.fl_str_mv |
Discriminação racial; Negros; Brancos; Linguística de corpus; Grupos étnicos; Linguagem e línguas - Variação |
dc.subject.eng.fl_str_mv |
Race discrimination; Blacks; Whites; Corpora (Linguistics); Ethnic groups; Language and languages - Variation |
publishDate |
2020 |
dc.date.issued.fl_str_mv |
2020-03-12 |
dc.date.accessioned.fl_str_mv |
2021-11-26T19:59:30Z |
dc.date.available.fl_str_mv |
2021-11-26T19:59:30Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.citation.fl_str_mv |
Souza, Rafael Webster Ferreira de. As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus. 2020. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Estudos Pós-Graduados em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2020. |
dc.identifier.uri.fl_str_mv |
https://repositorio.pucsp.br/jspui/handle/handle/24088 |
identifier_str_mv |
Souza, Rafael Webster Ferreira de. As marcas lexicais da discriminação étnico-racial entre negros e brancos: um estudo da linguística de corpus. 2020. Dissertação (Mestrado em Linguística Aplicada e Estudos da Linguagem) - Programa de Estudos Pós-Graduados em Linguística Aplicada e Estudos da Linguagem da Pontifícia Universidade Católica de São Paulo, São Paulo, 2020. |
url |
https://repositorio.pucsp.br/jspui/handle/handle/24088 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Pontifícia Universidade Católica de São Paulo |
dc.publisher.program.fl_str_mv |
Programa de Estudos Pós-Graduados em Linguística Aplicada e Estudos da Linguagem |
dc.publisher.initials.fl_str_mv |
PUC-SP |
dc.publisher.country.fl_str_mv |
Brasil |
dc.publisher.department.fl_str_mv |
Faculdade de Filosofia, Comunicação, Letras e Artes |
publisher.none.fl_str_mv |
Pontifícia Universidade Católica de São Paulo |
dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da PUC_SP instname:Pontifícia Universidade Católica de São Paulo (PUC-SP) instacron:PUC_SP |
instname_str |
Pontifícia Universidade Católica de São Paulo (PUC-SP) |
instacron_str |
PUC_SP |
institution |
PUC_SP |
reponame_str |
Biblioteca Digital de Teses e Dissertações da PUC_SP |
collection |
Biblioteca Digital de Teses e Dissertações da PUC_SP |
bitstream.url.fl_str_mv |
https://repositorio.pucsp.br/xmlui/bitstream/handle/24088/1/Rafael%20Webster%20F.%20Souza.pdf https://repositorio.pucsp.br/xmlui/bitstream/handle/24088/2/Rafael%20Webster%20F.%20Souza.pdf.txt https://repositorio.pucsp.br/xmlui/bitstream/handle/24088/3/Rafael%20Webster%20F.%20Souza.pdf.jpg |
bitstream.checksum.fl_str_mv |
660fbaa108be4aa99432be973ace1a2e 16fb9d324d05dc425a8e3294c6b31c7d f580bc456d722ad67fbfa33ba53547ce |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da PUC_SP - Pontifícia Universidade Católica de São Paulo (PUC-SP) |
repository.mail.fl_str_mv |
bngkatende@pucsp.br||rapassi@pucsp.br |
_version_ |
1809278002666668032 |