Benchmarking natural language inference and semantic textual similarity for Portuguese

Fialho, Pedro; Coheur, Luísa; Quaresma, Paulo

Bibliographic details
Main author:	Fialho, Pedro
Publication date:	2020
Other authors:	Coheur, Luísa; Quaresma, Paulo
Document type:	Article
Language:	por
Source title:	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
Full text:	http://hdl.handle.net/10174/32114
Abstract:	Two sentences can be related in many different ways. Distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage of pre-trained models (BERT); additionally, we studied the roles of lexical features. We tested our models on several datasets (ASSIN, SICK-BR and ASSIN2), and the best results were usually achieved with ptBERT-Large, trained on a Brazilian corpus and tuned on the latter datasets. Besides obtaining state-of-the-art results, this is, to the best of our knowledge, the most comprehensive study of natural language inference and semantic textual similarity for the Portuguese language.

Item metadata

id	RCAP_f9cbd764e4b5ae1876d34338675425cf
oai_identifier_str	oai:dspace.uevora.pt:10174/32114
network_acronym_str	RCAP
network_name_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository_id_str	7160
title	Benchmarking natural language inference and semantic textual similarity for Portuguese
author	Fialho, Pedro
author_facet	Fialho, Pedro; Coheur, Luísa; Quaresma, Paulo
author_role	author
author2	Coheur, Luísa; Quaresma, Paulo
author2_role	author; author
dc.contributor.author.fl_str_mv	Fialho, Pedro; Coheur, Luísa; Quaresma, Paulo
publishDate	2020
dc.date.none.fl_str_mv	2020-01-01T00:00:00Z 2022-05-30T11:00:06Z 2022-05-30
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10174/32114
url	http://hdl.handle.net/10174/32114
dc.language.iso.fl_str_mv	por
language	por
dc.relation.none.fl_str_mv	Pedro Fialho, Luísa Coheur, and Paulo Quaresma. Benchmarking natural language inference and semantic textual similarity for portuguese. Information, 11(10), 2020. nd nd pq@uevora.pt 283
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	MDPI
publisher.none.fl_str_mv	MDPI
dc.source.none.fl_str_mv	reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP
instname_str	Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
collection	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos)
repository.name.fl_str_mv	Repositório Científico de Acesso Aberto de Portugal (Repositórios Científicos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_	1799136693994913792
