Benchmarking natural language inference and semantic textual similarity for Portuguese
Main author: | Fialho, Pedro |
---|---|
Publication date: | 2020 |
Other authors: | Coheur, Luísa; Quaresma, Paulo |
Document type: | Article |
Language: | por |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
Full text: | http://hdl.handle.net/10174/32114 |
Abstract: | Two sentences can be related in many different ways, and distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage of pre-trained BERT models and also studied the role of lexical features. We tested our models on several datasets (ASSIN, SICK-BR and ASSIN2), and the best results were usually achieved with ptBERT-Large, trained on a Brazilian corpus and fine-tuned on those datasets. Our models obtain state-of-the-art results, and this is, to the best of our knowledge, the most comprehensive study of natural language inference and semantic textual similarity for the Portuguese language. |
id |
RCAP_f9cbd764e4b5ae1876d34338675425cf |
---|---|
oai_identifier_str |
oai:dspace.uevora.pt:10174/32114 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository_id_str |
7160 |
dc.title.none.fl_str_mv |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
title |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
spellingShingle |
Benchmarking natural language inference and semantic textual similarity for Portuguese Fialho, Pedro |
title_short |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
title_full |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
title_fullStr |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
title_full_unstemmed |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
title_sort |
Benchmarking natural language inference and semantic textual similarity for Portuguese |
author |
Fialho, Pedro |
author_facet |
Fialho, Pedro Coheur, Luísa Quaresma, Paulo |
author_role |
author |
author2 |
Coheur, Luísa Quaresma, Paulo |
author2_role |
author author |
dc.contributor.author.fl_str_mv |
Fialho, Pedro Coheur, Luísa Quaresma, Paulo |
description |
Two sentences can be related in many different ways, and distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage of pre-trained BERT models and also studied the role of lexical features. We tested our models on several datasets (ASSIN, SICK-BR and ASSIN2), and the best results were usually achieved with ptBERT-Large, trained on a Brazilian corpus and fine-tuned on those datasets. Our models obtain state-of-the-art results, and this is, to the best of our knowledge, the most comprehensive study of natural language inference and semantic textual similarity for the Portuguese language. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-01-01T00:00:00Z 2022-05-30T11:00:06Z 2022-05-30 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10174/32114 http://hdl.handle.net/10174/32114 |
url |
http://hdl.handle.net/10174/32114 |
dc.language.iso.fl_str_mv |
por |
language |
por |
dc.relation.none.fl_str_mv |
Pedro Fialho, Luísa Coheur, and Paulo Quaresma. Benchmarking natural language inference and semantic textual similarity for Portuguese. Information, 11(10), 2020. nd nd pq@uevora.pt 283 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
MDPI |
publisher.none.fl_str_mv |
MDPI |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799136693994913792 |