DptOIE: a portuguese Open Information Extraction system based on dependency analysis
Main author: | Oliveira, Leandro de |
---|---|
Publication date: | 2019 |
Other authors: | Claro, Daniela Barreiro |
Document type: | Article |
Language: | eng |
Source title: | Repositório Institucional da UFBA |
Full text: | http://repositorio.ufba.br/ri/handle/ri/30719 |
Abstract: | Under submission to a journal |
id |
UFBA-2_258e253e79ee236f9bcf581e50687a60 |
---|---|
oai_identifier_str |
oai:repositorio.ufba.br:ri/30719 |
network_acronym_str |
UFBA-2 |
network_name_str |
Repositório Institucional da UFBA |
repository_id_str |
1932 |
spelling |
Oliveira, Leandro de; Claro, Daniela Barreiro. 2019-10-09T20:50:05Z; 2019-10-09T20:50:05Z; 2019-10-09. http://repositorio.ufba.br/ri/handle/ri/30719. Under submission to a journal.
It is estimated that more than 80% of the information on the Web is stored in textual form. For humans, extracting useful information from the data produced daily is difficult. To automate the process, Open Information Extraction (OIE) methods, which can extract facts from large text collections, have been proposed. At first, most OIE methods were developed for the English language. However, other languages, such as Portuguese, have attracted special attention, since Portuguese accounts for approximately 2.5% of all content available on websites. For English, methods based on hand-crafted rules and dependency analysis have achieved good results. Nevertheless, methods based on similar approaches for Portuguese have not shown equivalent performance. We believe that the rules they define are generic and do not cover specific aspects of the language. For this reason, our DptOIE method defines a new set of hand-crafted rules and explores sentences through a dependency analysis using a depth-first search (DFS) approach. DptOIE was compared against two other OIE methods that extract facts in Portuguese: PragmaticOIE and ArgOE. DptOIE outperforms both, obtaining a greater area under the precision-yield curve. Its precision and its number of coherent extracted facts were also higher. As far as we know, DptOIE is the best-performing OIE method for the Portuguese language.
Submitted by Barreiro Claro Daniela (dclaro@ufba.br) on 2019-08-26T16:49:28Z. No. of bitstreams: 1. DptOIE_Leandro_Linguamatica.pdf: 886971 bytes, checksum: bef14519f5d1d73c2985cab745f26079 (MD5).
Approved for entry into archive by Solange Rocha (soluny@gmail.com) on 2019-10-09T20:50:05Z (GMT). No. of bitstreams: 1. DptOIE_Leandro_Linguamatica.pdf: 886971 bytes, checksum: bef14519f5d1d73c2985cab745f26079 (MD5).
Made available in DSpace on 2019-10-09T20:50:05Z (GMT). No. of bitstreams: 1. DptOIE_Leandro_Linguamatica.pdf: 886971 bytes, checksum: bef14519f5d1d73c2985cab745f26079 (MD5).
FAPESB/CAPES. Salvador. Open Information Extraction; Dependency analysis; Depth-first search. DptOIE: a portuguese Open Information Extraction system based on dependency analysis. info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; info:eu-repo/semantics/openAccess; eng. reponame:Repositório Institucional da UFBA; instname:Universidade Federal da Bahia (UFBA); instacron:UFBA.
ORIGINAL: DptOIE_Leandro_Linguamatica.pdf (application/pdf, 886971 bytes), https://repositorio.ufba.br/bitstream/ri/30719/1/DptOIE_Leandro_Linguamatica.pdf, MD5 bef14519f5d1d73c2985cab745f26079.
LICENSE: license.txt (text/plain, 1582 bytes), https://repositorio.ufba.br/bitstream/ri/30719/2/license.txt, MD5 907e2b7d511fb2c3e42dbdd41a6197c6.
TEXT: DptOIE_Leandro_Linguamatica.pdf.txt (Extracted text, text/plain, 67299 bytes), https://repositorio.ufba.br/bitstream/ri/30719/3/DptOIE_Leandro_Linguamatica.pdf.txt, MD5 285dc9469fe519d5141cef9f240cf01d.
ri/30719; 2022-02-21 00:10:21.306; oai:repositorio.ufba.br:ri/30719. Repositório Institucional; PUB; http://192.188.11.11:8080/oai/request; opendoar:1932; 2022-02-21T03:10:21; Repositório Institucional da UFBA - Universidade Federal da Bahia (UFBA); false |
dc.title.pt_BR.fl_str_mv |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
title |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
spellingShingle |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis Oliveira, Leandro de Open Information Extraction Dependency analysis Depth-first search |
title_short |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
title_full |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
title_fullStr |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
title_full_unstemmed |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
title_sort |
DptOIE: a portuguese Open Information Extraction system based on dependency analysis |
author |
Oliveira, Leandro de |
author_facet |
Oliveira, Leandro de Claro, Daniela Barreiro |
author_role |
author |
author2 |
Claro, Daniela Barreiro |
author2_role |
author |
dc.contributor.author.fl_str_mv |
Oliveira, Leandro de Claro, Daniela Barreiro |
dc.subject.por.fl_str_mv |
Open Information Extraction Dependency analysis Depth-first search |
topic |
Open Information Extraction Dependency analysis Depth-first search |
description |
Under submission to a journal |
publishDate |
2019 |
dc.date.accessioned.fl_str_mv |
2019-10-09T20:50:05Z |
dc.date.available.fl_str_mv |
2019-10-09T20:50:05Z |
dc.date.issued.fl_str_mv |
2019-10-09 |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://repositorio.ufba.br/ri/handle/ri/30719 |
url |
http://repositorio.ufba.br/ri/handle/ri/30719 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFBA instname:Universidade Federal da Bahia (UFBA) instacron:UFBA |
instname_str |
Universidade Federal da Bahia (UFBA) |
instacron_str |
UFBA |
institution |
UFBA |
reponame_str |
Repositório Institucional da UFBA |
collection |
Repositório Institucional da UFBA |
bitstream.url.fl_str_mv |
https://repositorio.ufba.br/bitstream/ri/30719/1/DptOIE_Leandro_Linguamatica.pdf https://repositorio.ufba.br/bitstream/ri/30719/2/license.txt https://repositorio.ufba.br/bitstream/ri/30719/3/DptOIE_Leandro_Linguamatica.pdf.txt |
bitstream.checksum.fl_str_mv |
bef14519f5d1d73c2985cab745f26079 907e2b7d511fb2c3e42dbdd41a6197c6 285dc9469fe519d5141cef9f240cf01d |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFBA - Universidade Federal da Bahia (UFBA) |
repository.mail.fl_str_mv |
|
_version_ |
1808459600513466368 |
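
The abstract in this record says that DptOIE explores sentences through a dependency analysis using a depth-first search (DFS). The sketch below only illustrates that general idea and is not the authors' rule set: the `Token` structure, the `collect`, `phrase` and `extract_triple` helpers, the Universal Dependencies-style labels (`nsubj`, `obj`) and the example sentence are all assumptions made for illustration. It walks a hand-built dependency tree depth-first and assembles a single (argument, relation, argument) triple.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One node of a dependency tree (hypothetical structure, not from the paper)."""
    idx: int      # position of the word in the sentence
    text: str     # surface form
    deprel: str   # dependency label, assumed to follow Universal Dependencies naming
    children: list = field(default_factory=list)


def collect(node, skip=frozenset()):
    """Depth-first walk returning (idx, text) pairs for the subtree rooted at node.

    Children whose dependency label is in `skip` are not visited.
    """
    found = [(node.idx, node.text)]
    for child in node.children:
        if child.deprel not in skip:
            found.extend(collect(child, skip))
    return found


def phrase(node, skip=frozenset()):
    """Join the words of a subtree, restored to sentence order."""
    return " ".join(text for _, text in sorted(collect(node, skip)))


def extract_triple(root):
    """Toy rule (an assumption): subject subtree, root verb, object subtree."""
    subj = next((c for c in root.children if c.deprel == "nsubj"), None)
    obj = next((c for c in root.children if c.deprel == "obj"), None)
    if subj is None or obj is None:
        return None
    return (phrase(subj), root.text, phrase(obj))


# Hand-built tree for the invented sentence "O sistema extrai fatos coerentes".
subject = Token(2, "sistema", "nsubj", [Token(1, "O", "det")])
direct_object = Token(4, "fatos", "obj", [Token(5, "coerentes", "amod")])
root = Token(3, "extrai", "root", [subject, direct_object])

triple = extract_triple(root)
print(triple)
```

A real system would take the tree from a dependency parser and apply many language-specific rules. The point here is only that a depth-first traversal gathers each argument's full subtree before the triple is assembled.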