Completação fora de amostra em grafos de conhecimento

Silva, Daniel Nascimento Ramos da

Bibliographic details
Main author:	Silva, Daniel Nascimento Ramos da
Publication date:	2023
Document type:	Thesis
Language:	por
Source title:	Biblioteca Digital de Teses e Dissertações do LNCC
Full text:	https://tede.lncc.br/handle/tede/377
Abstract:	Knowledge graphs provide a semantic layer that is valuable for various applications. They represent facts as a network of relationships between entities. However, many relevant facts may be missing from the graph, potentially impairing the performance of downstream applications. As a result, knowledge graph completion strategies, which aim to infer the truth value of relationships not observed in the graph, have proliferated in recent years. Overall, techniques based on representation learning have become the most common approach for completing knowledge graphs. However, many of them cannot perform inference for emerging entities, not seen during training, which is incompatible with the evolving nature of knowledge graphs. In this scenario, a promising but little-investigated strategy uses the surrounding neighborhood of a fact, called the query context, to infer its truth value. Given the above, we investigate the out-of-sample knowledge graph completion task, which is not limited to predictions for facts involving entities seen at training time. In detail, we develop and empirically evaluate a methodology based on query contexts and representation learning. First, we study the definition of this neighborhood and how it impacts the performance of learned models. Second, we develop neural network architectures for this task. Furthermore, we devise strategies for dealing with scalability problems inherent to this methodology. Finally, we carry out comprehensive experiments, which shed light on the challenges faced by the method and indicate that it is competitive with the state of the art.

Item metadata

id	LNCC_72734722cffd1ab847118a215ab7a287
oai_identifier_str	oai:tede-server.lncc.br:tede/377
network_acronym_str	LNCC
network_name_str	Biblioteca Digital de Teses e Dissertações do LNCC
repository_id_str
spelling	Submitted by Patrícia Vieira Silva (library@lncc.br) on 2023-09-18T16:37:41Z. No. of bitstreams: 2. license_rdf: 0 bytes, checksum: d41d8cd98f00b204e9800998ecf8427e (MD5). tese_Daniel Nascimento Ramos da Silva.pdf: 3676382 bytes, checksum: 092a17fe0228b5e21a6b4ff7d2659f1d (MD5). Approved for entry into archive by Patrícia Vieira Silva (library@lncc.br) on 2023-09-18T16:38:02Z (GMT). Made available in DSpace on 2023-09-18T16:38:14Z (GMT). Previous issue date: 2023-08-02. Conselho Nacional de Desenvolvimento Científico e Tecnológico. Thumbnail: http://tede-server.lncc.br:8080/retrieve/1663/tese_Daniel%20Nascimento%20Ramos%20da%20Silva.pdf.jpg. Bitstream sizes in bytes: tese_Daniel Nascimento Ramos da Silva.pdf.jpg (image/jpeg) 3294; tese_Daniel Nascimento Ramos da Silva.pdf.txt (text/plain) 326709; tese_Daniel Nascimento Ramos da Silva.pdf (application/pdf) 3676382; license_url 49; license_text 0; license_rdf 0; license.txt 2165.
dc.title.por.fl_str_mv	Completação fora de amostra em grafos de conhecimento
title	Completação fora de amostra em grafos de conhecimento
spellingShingle	Completação fora de amostra em grafos de conhecimento Silva, Daniel Nascimento Ramos da Teoria dos grafos Grafos de conhecimento Aprendizado por máquina Redes neurais CNPQ::CIENCIAS EXATAS E DA TERRA::MATEMATICA::ANALISE
title_short	Completação fora de amostra em grafos de conhecimento
title_full	Completação fora de amostra em grafos de conhecimento
title_fullStr	Completação fora de amostra em grafos de conhecimento
title_full_unstemmed	Completação fora de amostra em grafos de conhecimento
title_sort	Completação fora de amostra em grafos de conhecimento
author	Silva, Daniel Nascimento Ramos da
author_facet	Silva, Daniel Nascimento Ramos da
author_role	author
dc.contributor.advisor1.fl_str_mv	Porto, Fábio André Machado
dc.contributor.referee1.fl_str_mv	Porto, Fábio André Machado
dc.contributor.referee2.fl_str_mv	Gomes, Antonio Tadeu Azevedo
dc.contributor.referee3.fl_str_mv	Mattoso, Marta Lima de Queirós
dc.contributor.referee4.fl_str_mv	Sampaio, Jonice de Oliveira
dc.contributor.referee5.fl_str_mv	Valduriez, Patrick
dc.contributor.authorLattes.fl_str_mv	http://lattes.cnpq.br/8483553158846730
dc.contributor.author.fl_str_mv	Silva, Daniel Nascimento Ramos da
contributor_str_mv	Porto, Fábio André Machado; Porto, Fábio André Machado; Gomes, Antonio Tadeu Azevedo; Mattoso, Marta Lima de Queirós; Sampaio, Jonice de Oliveira; Valduriez, Patrick
dc.subject.por.fl_str_mv	Graph theory; Knowledge graphs; Machine learning; Neural networks
topic	Graph theory; Knowledge graphs; Machine learning; Neural networks; CNPQ::CIENCIAS EXATAS E DA TERRA::MATEMATICA::ANALISE
dc.subject.cnpq.fl_str_mv	CNPQ::CIENCIAS EXATAS E DA TERRA::MATEMATICA::ANALISE
description	Knowledge graphs provide a semantic layer that is valuable for various applications. They represent facts as a network of relationships between entities. However, many relevant facts may be missing from the graph, potentially impairing the performance of downstream applications. As a result, knowledge graph completion strategies, which aim to infer the truth value of relationships not observed in the graph, have proliferated in recent years. Overall, techniques based on representation learning have become the most common approach for completing knowledge graphs. However, many of them cannot perform inference for emerging entities, not seen during training, which is incompatible with the evolving nature of knowledge graphs. In this scenario, a promising but little-investigated strategy uses the surrounding neighborhood of a fact, called the query context, to infer its truth value. Given the above, we investigate the out-of-sample knowledge graph completion task, which is not limited to predictions for facts involving entities seen at training time. In detail, we develop and empirically evaluate a methodology based on query contexts and representation learning. First, we study the definition of this neighborhood and how it impacts the performance of learned models. Second, we develop neural network architectures for this task. Furthermore, we devise strategies for dealing with scalability problems inherent to this methodology. Finally, we carry out comprehensive experiments, which shed light on the challenges faced by the method and indicate that it is competitive with the state of the art.
publishDate	2023
dc.date.accessioned.fl_str_mv	2023-09-18T16:38:14Z
dc.date.issued.fl_str_mv	2023-08-02
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.citation.fl_str_mv	SILVA, D. N. R. Completação fora de amostra em grafos de conhecimento. 2023. 130 f. Tese (Doutorado em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2023.
dc.identifier.uri.fl_str_mv	https://tede.lncc.br/handle/tede/377
identifier_str_mv	SILVA, D. N. R. Completação fora de amostra em grafos de conhecimento. 2023. 130 f. Tese (Doutorado em Modelagem Computacional) - Laboratório Nacional de Computação Científica, Petrópolis, 2023.
url	https://tede.lncc.br/handle/tede/377
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/4.0/ info:eu-repo/semantics/openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Laboratório Nacional de Computação Científica
dc.publisher.program.fl_str_mv	Programa de Pós-Graduação em Modelagem Computacional
dc.publisher.initials.fl_str_mv	LNCC
dc.publisher.country.fl_str_mv	Brasil
dc.publisher.department.fl_str_mv	Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA)
publisher.none.fl_str_mv	Laboratório Nacional de Computação Científica
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações do LNCC instname:Laboratório Nacional de Computação Científica (LNCC) instacron:LNCC
instname_str	Laboratório Nacional de Computação Científica (LNCC)
instacron_str	LNCC
institution	LNCC
reponame_str	Biblioteca Digital de Teses e Dissertações do LNCC
collection	Biblioteca Digital de Teses e Dissertações do LNCC
bitstream.url.fl_str_mv	http://tede-server.lncc.br:8080/tede/bitstream/tede/377/7/tese_Daniel+Nascimento+Ramos+da+Silva.pdf.jpg http://tede-server.lncc.br:8080/tede/bitstream/tede/377/6/tese_Daniel+Nascimento+Ramos+da+Silva.pdf.txt http://tede-server.lncc.br:8080/tede/bitstream/tede/377/5/tese_Daniel+Nascimento+Ramos+da+Silva.pdf http://tede-server.lncc.br:8080/tede/bitstream/tede/377/2/license_url http://tede-server.lncc.br:8080/tede/bitstream/tede/377/3/license_text http://tede-server.lncc.br:8080/tede/bitstream/tede/377/4/license_rdf http://tede-server.lncc.br:8080/tede/bitstream/tede/377/1/license.txt
bitstream.checksum.fl_str_mv	45da21348f585ea31404e345d485334d 359dab63d08c42e439796e0edaee4cdb 092a17fe0228b5e21a6b4ff7d2659f1d 4afdbb8c545fd630ea7db775da747b2f d41d8cd98f00b204e9800998ecf8427e d41d8cd98f00b204e9800998ecf8427e bd3efa91386c1718a7f26a329fdcb468
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações do LNCC - Laboratório Nacional de Computação Científica (LNCC)
repository.mail.fl_str_mv	library@lncc.br
_version_	1797683220241711104
