Triplet extraction leveraging sentence transformers and dependency parsing

Bibliographic details
Main author: Ottersen, Stuart Gallina
Publication date: 2024
Other authors: Pinheiro, Flávio; Bação, Fernando
Document type: Article
Language: English
Source title: Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
Full text: http://hdl.handle.net/10362/161972
Abstract: Ottersen, S. G., Pinheiro, F., & Bação, F. (2024). Triplet extraction leveraging sentence transformers and dependency parsing. Array, 21(March), [100334]. https://doi.org/10.1016/j.array.2023.100334 --- This work was supported by a grant of the Portuguese Foundation for Science and Technology (“Fundação para a ciência e tecnologia”), DSAPIA/DS/0116/2019, and project UIDB/04152/2020 - Centro de investigação em Gestão de Informação (MagIC).
id RCAP_57f74a3435867744307d8a53721cae65
oai_identifier_str oai:run.unl.pt:10362/161972
network_acronym_str RCAP
network_name_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository_id_str 7160
spelling Triplet extraction leveraging sentence transformers and dependency parsing
Triplet extraction
NLP
Natural language processing
Knowledge Graph
Computer Science(all)
Ottersen, S. G., Pinheiro, F., & Bação, F. (2024). Triplet extraction leveraging sentence transformers and dependency parsing. Array, 21(March), [100334]. https://doi.org/10.1016/j.array.2023.100334 --- This work was supported by a grant of the Portuguese Foundation for Science and Technology (“Fundação para a ciência e tecnologia”), DSAPIA/DS/0116/2019, and project UIDB/04152/2020 - Centro de investigação em Gestão de Informação (MagIC).
Knowledge graphs are a tool for structuring (entity, relation, entity) triples. One way to construct them is to extract triples from unstructured text, with the aim of maximising the number of useful triples while minimising those that carry little or no information. Most previous work in this field uses supervised learning techniques, which can be expensive both computationally and because they require labelled data. Existing unsupervised methods, in turn, often produce an excessive number of low-value triples, rely on empirical rules when extracting triples, or struggle with the order of the entities relative to the relation. To address these issues, this paper proposes a new model, Unsupervised Dependency parsing Aided Semantic Triple Extraction (UDASTE), which leverages sentence structure and allows restrictive triple relation types to be defined, generating high-quality triples while removing the need to map extracted triples to relation schemas. It does so by leveraging pre-trained language models. UDASTE is compared with two baseline models on three datasets and outperforms the baselines on all three. Its limitations and possible further work are discussed, along with the implementation of the model in a computational intelligence context.
Information Management Research Center (MagIC) - NOVA Information Management School
NOVA Information Management School (NOVA IMS)
RUN
Ottersen, Stuart Gallina
Pinheiro, Flávio
Bação, Fernando
2024-01-05T23:24:01Z
2024-03-01
2024-03-01T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
9
application/pdf
http://hdl.handle.net/10362/161972
eng
2590-0056
PURE: 80000515
https://doi.org/10.1016/j.array.2023.100334
info:eu-repo/semantics/openAccess
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
2024-03-11T05:44:49Z
oai:run.unl.pt:10362/161972
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
opendoar:7160
2024-03-20T03:58:43.169136
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
false
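The record gives only the abstract, so the following is a minimal illustrative sketch of the general idea it describes (use sentence structure to propose a triple, then keep it only if its relation matches a restricted set of relation types). It is not the UDASTE implementation. The hand-written parse, the relation types, the threshold, and the toy bag-of-words `embed` function (a stand-in for a pre-trained sentence-transformer embedding) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical, hand-written dependency parse of
# "Marie Curie discovered polonium": (token, dependency label, head index).
# A real pipeline would obtain this from a dependency parser.
PARSE = [
    ("Marie Curie", "nsubj", 1),
    ("discovered", "ROOT", 1),
    ("polonium", "dobj", 1),
]

# Hypothetical restricted relation types, each with a short description.
RELATION_TYPES = {
    "discovered": "discovered found identified",
    "born_in": "born birthplace native",
}


def embed(text):
    """Toy bag-of-words vector; a stand-in for a sentence-transformer embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_triple(parse):
    """Return (subject, predicate, object) around the root verb, or None."""
    root = next((i for i, (_, dep, _) in enumerate(parse) if dep == "ROOT"), None)
    if root is None:
        return None
    subj = next((t for t, dep, head in parse if dep == "nsubj" and head == root), None)
    obj = next((t for t, dep, head in parse if dep == "dobj" and head == root), None)
    if subj is None or obj is None:
        return None
    return subj, parse[root][0], obj


def match_relation(predicate, relation_types, threshold=0.3):
    """Pick the most similar relation type, or None if below the threshold."""
    scores = {
        name: cosine(embed(predicate), embed(desc))
        for name, desc in relation_types.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    triple = extract_triple(PARSE)
    print(triple, match_relation(triple[1], RELATION_TYPES))
```

Taking the subject and object from the parse fixes the order of the entities relative to the relation, and the similarity threshold discards candidate triples whose predicate matches none of the defined relation types.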
dc.title.none.fl_str_mv Triplet extraction leveraging sentence transformers and dependency parsing
author Ottersen, Stuart Gallina
author_facet Ottersen, Stuart Gallina
Pinheiro, Flávio
Bação, Fernando
author_role author
author2 Pinheiro, Flávio
Bação, Fernando
author2_role author
author
dc.contributor.none.fl_str_mv Information Management Research Center (MagIC) - NOVA Information Management School
NOVA Information Management School (NOVA IMS)
RUN
dc.contributor.author.fl_str_mv Ottersen, Stuart Gallina
Pinheiro, Flávio
Bação, Fernando
dc.subject.por.fl_str_mv Triplet extraction
NLP
Natural language processing
Knowledge Graph
Computer Science(all)
publishDate 2024
dc.date.none.fl_str_mv 2024-01-05T23:24:01Z
2024-03-01
2024-03-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/161972
url http://hdl.handle.net/10362/161972
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 2590-0056
PURE: 80000515
https://doi.org/10.1016/j.array.2023.100334
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv 9
application/pdf
dc.source.none.fl_str_mv reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron:RCAAP
instname_str Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
instacron_str RCAAP
institution RCAAP
reponame_str Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
collection Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos)
repository.name.fl_str_mv Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação
repository.mail.fl_str_mv
_version_ 1799138167987634176