Anaphora resolution without world knowledge

Bibliographic details
Main author: Leffa, Vilson J.
Publication date: 2018
Document type: Article
Language: eng
Source title: DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
Full text: https://revistas.pucsp.br/index.php/delta/article/view/38224
Abstract: A typical problem in the resolution of pronominal anaphora is the presence of more than one candidate for the antecedent of the pronoun. Consider two English sentences: (1) "People buy expensive cars because they offer more status" and (2) "People buy expensive cars because they want more status". From a purely syntactic perspective, the two NPs "people" and "expensive cars" are both legitimate candidate antecedents for the pronoun "they". This problem has traditionally been solved by using world knowledge (e.g. schema theory), where, through an internal representation of the world, we "know" that cars "offer" status and people "want" status. The assumption in this paper is that the use of world knowledge does not explain how the disambiguation process works and that alternative explanations should be explored. Using a knowledge-poor approach (explicit information from the text rather than implicit world knowledge), the study investigates to what extent syntactic and semantic constraints can be used to resolve anaphora. For this purpose, 1,400 examples of the word "they" were randomly selected from a corpus of 10,000,000 words of expository text in English. Antecedent candidates for each case were then analyzed and classified in terms of their syntactic functions in the sentence (subject, object, etc.) and semantic features (+ human, + animate, etc.). Syntactic constraints alone resolved 85% of the cases. When combined with semantic constraints, the resolution rate rose to 98%. The implications of the findings for Natural Language Processing are discussed.
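The idea of filtering antecedent candidates by explicit constraints can be illustrated with a minimal sketch. It is not the procedure used in the paper: the `Candidate` class, the `resolve_they` function, and the one-entry verb lexicon are all hypothetical names introduced here, and the only constraints modelled are number agreement with "they" and a required semantic feature on the verb's subject.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    text: str
    plural: bool
    features: frozenset  # semantic features such as "+human", "+animate"

# Hypothetical toy lexicon (an assumption, not data from the paper):
# features a verb requires of its subject. "offer" is deliberately absent.
VERB_SUBJECT_REQUIREMENTS = {"want": frozenset({"+human"})}

def resolve_they(candidates, verb):
    """Return the texts of candidates that survive both filters."""
    # Agreement filter: "they" needs a plural antecedent.
    agreeing = [c for c in candidates if c.plural]
    # Semantic filter: keep candidates carrying every required feature.
    required = VERB_SUBJECT_REQUIREMENTS.get(verb, frozenset())
    return [c.text for c in agreeing if required <= c.features]

# The two NPs from the example sentences in the abstract.
CANDIDATES = [
    Candidate("people", True, frozenset({"+human", "+animate"})),
    Candidate("expensive cars", True, frozenset()),
]
```

With this toy lexicon, `resolve_they(CANDIDATES, "want")` keeps only "people", while `resolve_they(CANDIDATES, "offer")` leaves both candidates, because the sketch places no constraint on the subject of "offer".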
id PUC_SP-4_c681546532aa319735f0cfded0d4f2a5
oai_identifier_str oai:ojs.pkp.sfu.ca:article/38224
network_acronym_str PUC_SP-4
network_name_str DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
repository_id_str
dc.title.none.fl_str_mv Anaphora resolution without world knowledge
title Anaphora resolution without world knowledge
spellingShingle Anaphora resolution without world knowledge
Leffa, Vilson J.
Anaphora Resolution
Natural Language Processing
Textual Constraints
Ambiguity
title_short Anaphora resolution without world knowledge
title_full Anaphora resolution without world knowledge
title_fullStr Anaphora resolution without world knowledge
title_full_unstemmed Anaphora resolution without world knowledge
title_sort Anaphora resolution without world knowledge
author Leffa, Vilson J.
author_facet Leffa, Vilson J.
author_role author
dc.contributor.author.fl_str_mv Leffa, Vilson J.
dc.subject.por.fl_str_mv Anaphora Resolution
Natural Language Processing
Textual Constraints
Ambiguity
topic Anaphora Resolution
Natural Language Processing
Textual Constraints
Ambiguity
publishDate 2018
dc.date.none.fl_str_mv 2018-07-10
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://revistas.pucsp.br/index.php/delta/article/view/38224
url https://revistas.pucsp.br/index.php/delta/article/view/38224
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://revistas.pucsp.br/index.php/delta/article/view/38224/25932
dc.rights.driver.fl_str_mv Copyright (c) 2018 DELTA: Documentação e Estudos em Linguística Teórica e Aplicada
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Copyright (c) 2018 DELTA: Documentação e Estudos em Linguística Teórica e Aplicada
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Pontifícia Universidade Católica de São Paulo
publisher.none.fl_str_mv Pontifícia Universidade Católica de São Paulo
dc.source.none.fl_str_mv DELTA: Documentação e Estudos em Linguística Teórica e Aplicada; v. 19 n. 1 (2003)
1678-460X
0102-4450
reponame:DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
instname:Pontifícia Universidade Católica de São Paulo (PUC-SP)
instacron:PUC_SP
instname_str Pontifícia Universidade Católica de São Paulo (PUC-SP)
instacron_str PUC_SP
institution PUC_SP
reponame_str DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
collection DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada
repository.name.fl_str_mv DELTA: Documentação de Estudos em Lingüística Teórica e Aplicada - Pontifícia Universidade Católica de São Paulo (PUC-SP)
repository.mail.fl_str_mv delta@pucsp.br
_version_ 1799129302443229184