DO WE NEED STATISTICS WHEN WE HAVE LINGUISTICS?
Main author: | Cantos Gómez, Pascual |
---|---|
Publication date: | 2018 |
Document type: | Article |
Language: | eng |
Source title: | DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada |
Full text: | https://revistas.pucsp.br/index.php/delta/article/view/38792 |
Abstract: | Statistics is known to be a quantitative approach to research. However, most of the research done in the fields of language and linguistics is of a different kind, namely qualitative. Succinctly, qualitative analysis differs from quantitative analysis in that in the former no attempt is made to assign frequencies, percentages and the like to the linguistic features found or identified in the data. In quantitative research, linguistic features are classified and counted, and even more complex statistical models are constructed in order to explain the observed facts. In qualitative research, however, we use the data only for identifying and describing features of language usage and for providing real occurrences/examples of particular phenomena. In this paper, we shall try to show how quantitative methods and statistical techniques can supplement qualitative analyses of language. We shall attempt to present some mathematical and statistical properties of natural languages, and introduce some of the quantitative methods which are of the most value in working empirically with texts and corpora, illustrating the various issues with numerous examples and moving from the most basic descriptive techniques (frequency counts and percentages) to decision-making techniques (chi-square and z-score) and to more sophisticated statistical language models (Type-Token/Lemma-Token/Lemma-Type formulae, cluster analysis and discriminant function analysis). |
id |
PUC_SP-4_5f4e24d58cb775137fb85758fb6b2ca1 |
---|---|
oai_identifier_str |
oai:ojs.pkp.sfu.ca:article/38792 |
network_acronym_str |
PUC_SP-4 |
network_name_str |
DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada |
author |
Cantos Gómez, Pascual |
dc.subject.por.fl_str_mv |
Quantitative analysis; Statistics; Language modelling; Linguistic corpora |
dc.date.none.fl_str_mv |
2018-08-08 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion |
dc.relation.none.fl_str_mv |
https://revistas.pucsp.br/index.php/delta/article/view/38792/26326 |
dc.rights.driver.fl_str_mv |
Copyright (c) 2018 DELTA: Documentação e Estudos em Linguística Teórica e Aplicada; info:eu-repo/semantics/openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Pontifícia Universidade Católica de São Paulo |
dc.source.none.fl_str_mv |
DELTA: Documentação e Estudos em Linguística Teórica e Aplicada; v. 18 n. 2 (2002); 1678-460X; 0102-4450; reponame:DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada; instname:Pontifícia Universidade Católica de São Paulo (PUC-SP); instacron:PUC_SP |
repository.mail.fl_str_mv |
delta@pucsp.br |