Triplet extraction leveraging sentence transformers and dependency parsing
Main author: | Ottersen, Stuart Gallina |
---|---|
Publication date: | 2024 |
Other authors: | Pinheiro, Flávio; Bação, Fernando |
Document type: | Article |
Language: | eng |
Source title: | Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
Full text: | http://hdl.handle.net/10362/161972 |
Summary: | Ottersen, S. G., Pinheiro, F., & Bação, F. (2024). Triplet extraction leveraging sentence transformers and dependency parsing. Array, 21(March), [100334]. https://doi.org/10.1016/j.array.2023.100334 --- This work was supported by a grant of the Portuguese Foundation for Science and Technology (“Fundação para a ciência e tecnologia”), DSAPIA/DS/0116/2019, and project UIDB/04152/2020 - Centro de investigação em Gestão de Informação (MagIC). |
id |
RCAP_57f74a3435867744307d8a53721cae65 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/161972 |
network_acronym_str |
RCAP |
network_name_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository_id_str |
7160 |
spelling |
Triplet extraction leveraging sentence transformers and dependency parsing; Triplet extraction; NLP; Natural language processing; Knowledge Graph; Computer Science(all); Ottersen, S. G., Pinheiro, F., & Bação, F. (2024). Triplet extraction leveraging sentence transformers and dependency parsing. Array, 21(March), [100334]. https://doi.org/10.1016/j.array.2023.100334 --- This work was supported by a grant of the Portuguese Foundation for Science and Technology (“Fundação para a ciência e tecnologia”), DSAPIA/DS/0116/2019, and project UIDB/04152/2020 - Centro de investigação em Gestão de Informação (MagIC). Knowledge graphs are a tool for structuring (entity, relation, entity) triples. One way to construct them is to extract triples from unstructured text, with the aim of maximising the number of useful triples while minimising those that carry little or no information. Most previous work in this field uses supervised learning techniques, which can be expensive both computationally and because they require labelled data. Existing unsupervised methods, in turn, often produce an excessive number of low-value triples, rely on empirical rules when extracting triples, or struggle with the order of the entities relative to the relation. To address these issues, this paper proposes a new model, Unsupervised Dependency parsing Aided Semantic Triple Extraction (UDASTE), which leverages sentence structure and allows restrictive triple relation types to be defined, generating high-quality triples while removing the need to map extracted triples to relation schemas. It does so by leveraging pre-trained language models. UDASTE is compared with two baseline models on three datasets and outperforms the baselines on all three. Its limitations and possible further work are discussed, along with the implementation of the model in a computational intelligence context. Information Management Research Center (MagIC) - NOVA Information Management School; NOVA Information Management School (NOVA IMS); RUN; Ottersen, Stuart Gallina; Pinheiro, Flávio; Bação, Fernando; 2024-01-05T23:24:01Z; 2024-03-01; 2024-03-01T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; 9; application/pdf; http://hdl.handle.net/10362/161972; eng; 2590-0056; PURE: 80000515; https://doi.org/10.1016/j.array.2023.100334; info:eu-repo/semantics/openAccess; reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP; 2024-03-11T05:44:49Z; oai:run.unl.pt:10362/161972; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; opendoar:7160; 2024-03-20T03:58:43.169136; Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; false |
dc.title.none.fl_str_mv |
Triplet extraction leveraging sentence transformers and dependency parsing |
title |
Triplet extraction leveraging sentence transformers and dependency parsing |
spellingShingle |
Triplet extraction leveraging sentence transformers and dependency parsing; Ottersen, Stuart Gallina; Triplet extraction; NLP; Natural language processing; Knowledge Graph; Computer Science(all) |
author |
Ottersen, Stuart Gallina |
author_facet |
Ottersen, Stuart Gallina; Pinheiro, Flávio; Bação, Fernando |
author_role |
author |
author2 |
Pinheiro, Flávio; Bação, Fernando |
author2_role |
author; author |
dc.contributor.none.fl_str_mv |
Information Management Research Center (MagIC) - NOVA Information Management School; NOVA Information Management School (NOVA IMS); RUN |
dc.contributor.author.fl_str_mv |
Ottersen, Stuart Gallina; Pinheiro, Flávio; Bação, Fernando |
dc.subject.por.fl_str_mv |
Triplet extraction; NLP; Natural language processing; Knowledge Graph; Computer Science(all) |
topic |
Triplet extraction; NLP; Natural language processing; Knowledge Graph; Computer Science(all) |
description |
Ottersen, S. G., Pinheiro, F., & Bação, F. (2024). Triplet extraction leveraging sentence transformers and dependency parsing. Array, 21(March), [100334]. https://doi.org/10.1016/j.array.2023.100334 --- This work was supported by a grant of the Portuguese Foundation for Science and Technology (“Fundação para a ciência e tecnologia”), DSAPIA/DS/0116/2019, and project UIDB/04152/2020 - Centro de investigação em Gestão de Informação (MagIC). |
publishDate |
2024 |
dc.date.none.fl_str_mv |
2024-01-05T23:24:01Z; 2024-03-01; 2024-03-01T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/161972 |
url |
http://hdl.handle.net/10362/161972 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
2590-0056; PURE: 80000515; https://doi.org/10.1016/j.array.2023.100334 |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
9; application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos); instname:Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação; instacron:RCAAP |
instname_str |
Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
collection |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) |
repository.name.fl_str_mv |
Repositório Científico de Acesso Aberto de Portugal (Repositórios Cientìficos) - Agência para a Sociedade do Conhecimento (UMIC) - FCT - Sociedade da Informação |
repository.mail.fl_str_mv |
|
_version_ |
1799138167987634176 |