From lexical to semantic features in paraphrase identification

Bibliographic details
Main author: Fialho, Pedro
Publication date: 2019
Other authors: Coheur, Luísa; Quaresma, Paulo
Document type: Article
Language: eng
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10174/26991
Abstract: The task of paraphrase identification has been applied to diverse scenarios in Natural Language Processing, such as Machine Translation, summarization, or plagiarism detection. In this paper we present a comparative study on the performance of lexical, syntactic and semantic features in the task of paraphrase identification in the Microsoft Research Paraphrase Corpus. In our experiments, semantic features do not represent a gain in results, and syntactic features lead to the best results, but only if combined with lexical features.
id RCAP_e038511eca05c5449783c045c6cf31e9
oai_identifier_str oai:dspace.uevora.pt:10174/26991
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
dc.title.none.fl_str_mv From lexical to semantic features in paraphrase identification
dc.contributor.author.fl_str_mv Fialho, Pedro
Coheur, Luísa
Quaresma, Paulo
dc.date.none.fl_str_mv 2019-01-01T00:00:00Z
2020-02-18T09:14:13Z
2020-02-18
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10174/26991
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv nd
nd
pq@uevora.pt
283
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.publisher.none.fl_str_mv OpenAccess Series in Informatics. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação