DO WE NEED STATISTICS WHEN WE HAVE LINGUISTICS?

Bibliographic details
Main author: Cantos Gómez, Pascual
Publication date: 2018
Document type: Article
Language: eng
Source title: DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
Full text: https://revistas.pucsp.br/index.php/delta/article/view/38792
Abstract: Statistics is known to be a quantitative approach to research. However, most of the research done in the fields of language and linguistics is of a different kind, namely qualitative. Succinctly, qualitative analysis differs from quantitative analysis in that in the former no attempt is made to assign frequencies, percentages and the like to the linguistic features found or identified in the data. In quantitative research, linguistic features are classified and counted, and even more complex statistical models are constructed in order to explain these observed facts. In qualitative research, however, we use the data only for identifying and describing features of language usage and for providing real occurrences/examples of particular phenomena. In this paper, we shall try to show how quantitative methods and statistical techniques can supplement qualitative analyses of language. We shall attempt to present some mathematical and statistical properties of natural languages, and introduce some of the quantitative methods which are of the most value in working empirically with texts and corpora, illustrating the various issues with numerous examples and moving from the most basic descriptive techniques (frequency counts and percentages) to decision-taking techniques (chi-square and z-score) and to more sophisticated statistical language models (Type-Token/Lemma-Token/Lemma-Type formulae, cluster analysis and discriminant function analysis).
id PUC_SP-4_5f4e24d58cb775137fb85758fb6b2ca1
oai_identifier_str oai:ojs.pkp.sfu.ca:article/38792
network_acronym_str PUC_SP-4
author Cantos Gómez, Pascual
dc.subject.por.fl_str_mv Quantitative analysis
Statistics
Language modelling
Linguistic corpora
publishDate 2018
dc.date.none.fl_str_mv 2018-08-08
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
dc.identifier.uri.fl_str_mv https://revistas.pucsp.br/index.php/delta/article/view/38792
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv https://revistas.pucsp.br/index.php/delta/article/view/38792/26326
dc.rights.driver.fl_str_mv Copyright (c) 2018 DELTA: Documentação e Estudos em Linguística Teórica e Aplicada
info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Pontifícia Universidade Católica de São Paulo
dc.source.none.fl_str_mv DELTA: Documentação e Estudos em Linguística Teórica e Aplicada; v. 18 n. 2 (2002)
1678-460X
0102-4450
reponame:DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
instname:Pontifícia Universidade Católica de São Paulo (PUC-SP)
instacron:PUC_SP
repository.mail.fl_str_mv delta@pucsp.br